Versioned Backup of a Shared Web Hosting Account Using Rsync and Rdiff-Backup

Published: 
Updated: 

My wife and I have a number of WordPress sites (including this one) hosted by DreamHost's inexpensive shared web hosting service. Two of those sites are newly launched this month and could be vulnerable to all kinds of operator errors due to changing wishes and requirements.

Clearly, I need a daily backup of both sites. I don't want to rely on the vendor's backup and technical support procedures if my wife and I break something.

This tutorial will show you how to deploy a versioned backup solution for web sites running in a shared web hosting environment such as DreamHost.

Update (11 July 2016): The instructions on this page show you how to use rdiff-backup without having it installed on the server that has the data you want backed up. Today I created a script to help you install rdiff-backup on the server (if it's running Ubuntu) even if you don't have root access.

Requirements

At Hosting Provider -- Server-side

In the hosting environment at your vendor, you need:

On Your Personal Server -- Client-side

You need another machine on the Internet to perform the backup. It could be in your home or your office.

Overview

rdiff-backup performs incremental backups of one folder into another, keeping a full copy of the most recent snapshot, and incremental history working back from the full backup. Only changes from the current version are stored, and these diffs are stored efficiently to remove any redundant blocks of data.

rdiff-backup is available for most operating systems, but it typically isn't preinstalled on a shared web hosting account, and DreamHost is no exception. My solution runs rdiff-backup on my personal server, which does the backup work. Because rdiff-backup then needs a fresh local copy of the data to work on, rsync (running on both the client and the server, over ssh) first makes a mirror copy.

Step 1: Create SSH Config

The scripts later on in this article are more readable if you put ssh parameters into a config file ahead of time. Here's what my config file looks like on the client (the personal server):

~/.ssh/config

Host dh-SITE1-backup
HostName DOMAIN1.TLD
User USER1

Host dh-SITE2-backup
HostName DOMAIN2.TLD
User USER2

# etc.

Now, when your script makes an ssh connection to dh-SITE1-backup, it doesn't have to specify the hostname, username, or any other parameters in the ssh command line invocation.
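
One way to confirm that an alias resolves the way you intended is to ask ssh to print the settings it would use, without actually connecting. This is only a sketch: it assumes a reasonably recent OpenSSH client (very old releases lack the -G option), and show_ssh_target is a hypothetical helper name, not part of the original setup.

```shell
# show_ssh_target ALIAS
# Print the host name, user, and port that ssh would use for ALIAS.
# "ssh -G" only evaluates the configuration; it does not open a connection.
show_ssh_target() {
    ssh -G "$1" | grep -E '^(hostname|user|port) '
}

# Example (run on the client):
#   show_ssh_target dh-SITE1-backup
```

If the hostname or user lines don't show the values from your config file, check the alias spelling and the file's location before moving on.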

Step 2: Create SSH Key for Passwordless Login

In order to run rsync over ssh without being prompted for a password, you need to create a public/private key pair and install the public key on the server, in each account on the server where there is a site you're backing up.

First, on the client (your personal server), run this command if you haven't already created a key pair:

client$ ssh-keygen

Accept all the defaults. Don't set a passphrase, or else your backup script won't be able to run unattended.

Now copy the public key to all the shell accounts on servers you are backing up:

client$ ssh-copy-id dh-SITE1-backup
USER1@DOMAIN1.TLD's password:
Now try logging into the machine, with "ssh 'dh-SITE1-backup'", and check in:

  ~/.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

client$ ssh-copy-id dh-SITE2-backup
USER2@DOMAIN2.TLD's password:
Now try logging into the machine, with "ssh 'dh-SITE2-backup'", and check in:

  ~/.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.
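
Before relying on the keys from a cron job, it may be worth checking that each alias really logs in without a prompt. The sketch below uses ssh's BatchMode option, which makes ssh fail rather than ask for a password. check_login is a hypothetical helper name, and the 10-second timeout is an arbitrary choice.

```shell
# check_login ALIAS
# Succeed only if key-based login to ALIAS works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password, which
# is how the connection will behave when it runs unattended.
check_login() {
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true; then
        echo "OK: $1"
    else
        echo "FAILED: $1" >&2
        return 1
    fi
}

# Example (run on the client):
#   check_login dh-SITE1-backup && check_login dh-SITE2-backup
```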

Step 3: Create MySQL Backup Script on Server

client$ ssh dh-SITE1-backup
server$ mkdir ~/db_backup
server$ touch ~/db_backup/db-dump.sh
server$ chmod +x ~/db_backup/db-dump.sh
server$ nano ~/db_backup/db-dump.sh #Edit the script in nano -- CTRL-X to save and quit

~/db_backup/db-dump.sh

#!/bin/bash

# Write the dump next to this script, whatever directory it is called from.
SCRIPT_PATH="$(dirname "$0")"
DUMP_PATH="$(readlink -f "$SCRIPT_PATH")/database.sql"

mysqldump --host=MYSQL_HOSTNAME --user=MYSQL_USERNAME --password="MYSQL_PASSWORD" MYSQL_DB_NAME >"$DUMP_PATH"

Make sure you fill in the all-caps meta variables above starting with MYSQL_.

If you are using a database engine other than MySQL, look up how to dump the database to a file and change the script accordingly.

Run ~/db_backup/db-dump.sh on the server and make sure you get an SQL file in that folder.
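
One thing to be aware of: the shell empties database.sql before mysqldump starts, so a failed dump (a wrong password, say) leaves an empty or partial file that the next sync would pick up. A possible refinement, sketched here under the assumption that you are happy to restructure db-dump.sh a little, is to dump into a temporary file and replace the real one only on success. safe_dump is a hypothetical helper name.

```shell
# safe_dump OUTFILE COMMAND [ARGS...]
# Run COMMAND with its output sent to a temporary file next to OUTFILE.
# OUTFILE is replaced only if COMMAND succeeds and produced some output.
safe_dump() {
    out=$1
    shift
    tmp="$out.tmp.$$"
    if "$@" >"$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        echo "Dump failed; leaving $out untouched" >&2
        return 1
    fi
}

# Example, with the same placeholders as db-dump.sh:
#   safe_dump "$DUMP_PATH" mysqldump --host=MYSQL_HOSTNAME \
#       --user=MYSQL_USERNAME --password="MYSQL_PASSWORD" MYSQL_DB_NAME
```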

Repeat this entire step for any additional sites you have if you are backing up more than one site.

Step 4: Create the Backup Script

On the client, create a folder structure to store your backup. Then create a script to use rsync to sync the remote files and database to a local mirror and then rdiff-backup to make a versioned backup of the mirror.

client$ mkdir -p ~/Backup/DreamHost/SITE1/mirror/db # Make the folder and any missing parent folders
client$ mkdir ~/Backup/DreamHost/SITE1/mirror/www
client$ mkdir ~/Backup/DreamHost/SITE1/history
client$ touch ~/Backup/DreamHost/SITE1/pull.sh
client$ chmod +x ~/Backup/DreamHost/SITE1/pull.sh
client$ nano ~/Backup/DreamHost/SITE1/pull.sh

~/Backup/DreamHost/SITE1/pull.sh

#!/bin/bash

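# Print a timestamped banner before each stage so the log is easy to scan.
function announce {
   echo " "
   echo " "
   echo ------------------------------------------------------------
   echo "$*"
   date
   echo ------------------------------------------------------------
   echo " "
}

# Work relative to the folder containing this script.
cd "$(dirname "$0")" || exit 1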

announce "Starting up"
echo Working path:
pwd

announce "Writing remote database backup"
ssh dh-SITE1-backup db_backup/db-dump.sh
announce "Syncing files"
rsync --progress --archive --rsh=ssh dh-SITE1-backup:"~/SITE1_WEB_FOLDER" mirror/www
announce "Syncing database"
rsync --progress --archive --rsh=ssh dh-SITE1-backup:"~/db_backup/database.sql" mirror/db/database.sql

announce "Performing versioned backup"
rdiff-backup mirror history

announce "Done"

Make sure you fill in the values for SITE1 and SITE1_WEB_FOLDER according to your site's name and its path on the server.
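
As written, pull.sh carries on to the next stage even when an earlier one fails, so a failed rsync would still be followed by an rdiff-backup run. If you would rather stop at the first failure, one possible approach is a small wrapper around each stage. This is a sketch, not part of the script above, and run_step is a hypothetical helper name.

```shell
# run_step LABEL COMMAND [ARGS...]
# Announce a stage, run it, and report a failure with its exit status.
run_step() {
    label=$1
    shift
    echo "---- $label ($(date))"
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "---- $label FAILED with exit status $status" >&2
    fi
    return "$status"
}

# Example: chaining with && stops at the first stage that fails.
#   run_step "Writing remote database backup" ssh dh-SITE1-backup db_backup/db-dump.sh &&
#   run_step "Syncing files" rsync --archive --rsh=ssh dh-SITE1-backup:"~/SITE1_WEB_FOLDER" mirror/www &&
#   run_step "Performing versioned backup" rdiff-backup mirror history
```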

Run pull.sh and make sure you get a full backup in mirror and a clone in history. (Future runs of the script will refresh the clone in history while saving a chain of reverse diffs in the same folder, working back from the present.)
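
The history folder is only useful if you can get older versions back out of it, so it is worth trying a restore once. The helpers below sketch the usual commands in the same classic command-line style as the rdiff-backup call above (newer releases also offer a subcommand-style interface). The helper names and the RDIFF_BACKUP variable are assumptions of this sketch.

```shell
# RDIFF_BACKUP only makes the command name overridable; it is not
# something rdiff-backup itself reads.
RDIFF_BACKUP=${RDIFF_BACKUP:-rdiff-backup}

# list_versions HISTORY_DIR: show the dates of the stored increments.
list_versions() {
    "$RDIFF_BACKUP" --list-increments "$1"
}

# restore_as_of TIME SOURCE DEST: restore SOURCE (a path inside the
# history folder) as it was at TIME, e.g. 3D for three days ago, to DEST.
restore_as_of() {
    "$RDIFF_BACKUP" --restore-as-of "$1" "$2" "$3"
}

# prune_history AGE HISTORY_DIR: delete increments older than AGE (e.g. 8W).
# rdiff-backup may refuse to remove several increments in one go unless
# --force is added.
prune_history() {
    "$RDIFF_BACKUP" --remove-older-than "$1" "$2"
}

# Example (run in ~/Backup/DreamHost/SITE1):
#   list_versions history
#   restore_as_of 3D history/db/database.sql /tmp/database-3-days-ago.sql
```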

Since this script will be running from a cron task, it's good to create a wrapper script that redirects all output to a log file, so you can see what went wrong if it fails.

client$ touch ~/Backup/DreamHost/SITE1/run.sh
client$ chmod +x ~/Backup/DreamHost/SITE1/run.sh
client$ nano ~/Backup/DreamHost/SITE1/run.sh

~/Backup/DreamHost/SITE1/run.sh

#!/bin/bash

# Run from this script's folder and capture stdout and stderr in one log.
cd "$(dirname "$0")" || exit 1
./pull.sh &>backup.log

Run that file to make sure you get a log file.
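
Note that the wrapper overwrites backup.log on every run, so only the most recent log survives. If you would like to keep a few days of logs, one option is a dated file name plus a cleanup step. This is a sketch; log_file_for_today and prune_logs are hypothetical helper names, and the backup-YYYY-MM-DD.log naming is just one possible convention.

```shell
# log_file_for_today DIR: print the path of today's log file inside DIR.
log_file_for_today() {
    printf '%s/backup-%s.log\n' "$1" "$(date +%Y-%m-%d)"
}

# prune_logs DIR DAYS: delete backup-*.log files in DIR that were last
# modified more than DAYS days ago.
prune_logs() {
    find "$1" -name 'backup-*.log' -mtime "+$2" -exec rm -f {} +
}

# Example, as the body of run.sh (after the cd):
#   ./pull.sh >"$(log_file_for_today .)" 2>&1
#   prune_logs . 30
```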

Repeat this entire step for any additional sites you have if you are backing up more than one site.

Step 5: Create a Cron Job

The last step is to create a wrapper of the wrapper scripts to run all of them from a single scheduled cron task.

client$ touch ~/Backup/DreamHost/run-all.sh
client$ chmod +x ~/Backup/DreamHost/run-all.sh
client$ nano ~/Backup/DreamHost/run-all.sh

~/Backup/DreamHost/run-all.sh

#!/bin/bash

"$(dirname "$0")"/SITE1/run.sh
"$(dirname "$0")"/SITE2/run.sh
# etc.
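
If the list of sites grows, an alternative to naming each one is to loop over every site folder that contains an executable run.sh. This is a sketch of that idea rather than the script I use; run_all_sites is a hypothetical helper name.

```shell
# run_all_sites BASE_DIR
# Run BASE_DIR/<site>/run.sh for every site folder that has an executable
# run.sh, one after another. Returns non-zero if any of them failed.
run_all_sites() {
    failures=0
    for script in "$1"/*/run.sh; do
        [ -x "$script" ] || continue
        "$script" || failures=$((failures + 1))
    done
    [ "$failures" -eq 0 ]
}

# Example, as the body of run-all.sh:
#   run_all_sites "$(dirname "$0")"
```

A failure in one site's backup does not stop the others from running; it only affects the final exit status.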

Now pick a random time early in the morning -- it's probably better not to pick an exact '00' or '30' minutes past an hour. I chose 03:13 in the client's local time. Create a cron task:

client$ crontab -e
13 3 * * * /home/MY_USERNAME/Backup/DreamHost/run-all.sh

Make sure you fill in MY_USERNAME. For the exact syntax of the time fields in the crontab file, see the crontab(5) man page (man 5 crontab).

Now you should be all set. Final test: wait until tomorrow and check to make sure it ran correctly on schedule.
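
For the days after that, a quick way to spot a backup that has silently stopped running is to check how old the log file is. The sketch below relies on find's -mmin test, which GNU and BSD find provide but POSIX does not require. is_fresh is a hypothetical helper name.

```shell
# is_fresh FILE MINUTES
# Succeed if FILE exists and was modified less than MINUTES minutes ago.
is_fresh() {
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Example: warn if the SITE1 log is more than a day (1440 minutes) old.
#   is_fresh ~/Backup/DreamHost/SITE1/backup.log 1440 || echo "SITE1 backup is stale"
```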
