My wife and I have a number of WordPress sites (including this one) hosted by DreamHost's inexpensive shared web hosting service. Two of those sites are newly launched this month and could be vulnerable to all kinds of operator errors due to changing wishes and requirements.
Clearly, I need a daily backup of both sites. I don't want to rely on the vendor's backup and technical support procedures if my wife and I break something.
This tutorial will show you how to deploy a versioned backup solution for web sites running in a shared web hosting environment such as DreamHost.
Update (11 July 2016): The instructions on this page show you how to use rdiff-backup without having it installed on the server that has the data you want backed up. Today I created a script to help you install rdiff-backup on the server (if it's running Ubuntu) even if you don't have root access.
In the hosting environment at your vendor (the server), you need:
rsync
ssh
mysqldump (or another tool for making an SQL dump of your database if you are not using MySQL)
You need another machine on the Internet to perform the backup (the client). It could be in your home or your office. On it, you need:
rsync
ssh
cron
rdiff-backup
rdiff-backup performs incremental backups of one folder into another, keeping a full copy of the most recent snapshot plus an incremental history working back from that full copy. Only the differences from the current version are stored, and these diffs are stored efficiently, without redundant blocks of data.
rdiff-backup is available for most operating systems, but it typically doesn't come preinstalled on a shared web hosting account; DreamHost is no exception. My solution runs rdiff-backup on my personal server, which does the backup work. Because rdiff-backup needs a fresh local copy of the data to work on, we first use rsync over ssh, running on both the client and the server, to make a mirror copy.
The scripts later in this article are more readable if you put the ssh parameters into a config file ahead of time. Here's what my config file looks like on the client (the personal server):
~/.ssh/config
Host dh-SITE1-backup
    HostName DOMAIN1.TLD
    User USER1
Host dh-SITE2-backup
    HostName DOMAIN2.TLD
    User USER2
# etc.
Now, when your script makes an ssh connection to dh-SITE1-backup, it doesn't have to specify the hostname, username, or any other parameters in the ssh command-line invocation.
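If you have many sites, writing the stanzas by hand gets repetitive. Here is a minimal sketch of a helper that appends one stanza per site. The function name add_backup_host and its argument order are my own invention for illustration, not part of ssh; the file format it writes is the same one shown above.

```shell
#!/bin/sh
# Hypothetical helper: append one Host stanza to an ssh client config file.
# Usage: add_backup_host ALIAS HOSTNAME USER [CONFIG_FILE]
add_backup_host() {
    alias_name=$1
    host_name=$2
    user_name=$3
    config_file=${4:-$HOME/.ssh/config}

    # Skip aliases that are already defined so reruns stay harmless.
    # -x matches the whole line, -F treats the alias as a literal string.
    if [ -f "$config_file" ] && grep -qxF "Host $alias_name" "$config_file"; then
        echo "$alias_name already present in $config_file" >&2
        return 0
    fi

    printf 'Host %s\n    HostName %s\n    User %s\n' \
        "$alias_name" "$host_name" "$user_name" >>"$config_file"
}
```

For example, add_backup_host dh-SITE1-backup DOMAIN1.TLD USER1 would add the first stanza from the config file above to ~/.ssh/config.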
In order to run rsync over ssh without being prompted for a password, you need to create a public/private key pair and install the public key on the server, in each account on the server where there is a site you're backing up.
First, on the client (your personal server), run this command if you haven't already created a key pair:
client$ ssh-keygen
Accept all the defaults. Don't set a passphrase on the key, or else your backup script won't be able to run unattended.
Now copy the public key to all the shell accounts on servers you are backing up:
client$ ssh-copy-id dh-SITE1-backup
USER1@DOMAIN1.TLD's password:
Now try logging into the machine, with "ssh 'dh-SITE1-backup'", and check in:
~/.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
client$ ssh-copy-id dh-SITE2-backup
USER2@DOMAIN2.TLD's password:
Now try logging into the machine, with "ssh 'dh-SITE2-backup'", and check in:
~/.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
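Before moving on, it is worth confirming that the key really allows an unattended login. The sketch below uses ssh's BatchMode option, which makes ssh fail instead of prompting for a password, which is also how it will behave under cron. The function names check_login and check_all_logins are hypothetical, and the 10-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Return success only if key-based login to the given ssh alias works
# without any prompt.
check_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Report on every alias given as an argument; fail if any login fails.
check_all_logins() {
    rc=0
    for host in "$@"; do
        if check_login "$host"; then
            echo "ok:     $host"
        else
            echo "FAILED: $host"
            rc=1
        fi
    done
    return "$rc"
}
```

On the client you would call it as check_all_logins dh-SITE1-backup dh-SITE2-backup and expect one "ok" line per site.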
Next, log in to the server and create a script that dumps the database to a file:
client$ ssh dh-SITE1-backup
server$ mkdir ~/db_backup
server$ touch ~/db_backup/db-dump.sh
server$ chmod +x ~/db_backup/db-dump.sh
server$ nano ~/db_backup/db-dump.sh #Edit the script in nano -- CTRL-X to save and quit
~/db_backup/db-dump.sh
#!/bin/bash
# Write the dump next to this script, no matter where it is called from.
SCRIPT_PATH=$(dirname "$0")
DUMP_PATH="$(readlink -f "$SCRIPT_PATH")/database.sql"
mysqldump --host=MYSQL_HOSTNAME --user=MYSQL_USERNAME --password="MYSQL_PASSWORD" MYSQL_DB_NAME >"$DUMP_PATH"
Make sure you fill in the all-caps placeholders above that start with MYSQL_.
If you are using a database engine other than MySQL, look up how to dump the database to a file and change the script accordingly.
Run ~/db_backup/db-dump.sh on the server and make sure you get an SQL file in that folder.
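Checking the dump by eye works for the first run, but a script can do the same check every night. The sketch below assumes mysqldump is run with its default options, in which case it ends its output with a comment line starting with "-- Dump completed"; if you disable comments, that trailer will be missing and the check needs adjusting. The function name check_dump is hypothetical.

```shell
#!/bin/sh
# Succeed only if FILE looks like a complete mysqldump output:
# non-empty, and with the "-- Dump completed" trailer near the end.
# Usage: check_dump FILE
check_dump() {
    dump_file=$1
    if [ ! -s "$dump_file" ]; then
        echo "dump is missing or empty: $dump_file" >&2
        return 1
    fi
    if ! tail -n 3 "$dump_file" | grep -q '^-- Dump completed'; then
        echo "dump looks truncated: $dump_file" >&2
        return 1
    fi
}
```

One place to use it would be at the end of db-dump.sh, as check_dump "$DUMP_PATH", so that a failed dump makes the script exit with a non-zero status.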
Repeat this entire step for any additional sites you have if you are backing up more than one site.
On the client, create a folder structure to store your backup. Then create a script that uses rsync to sync the remote files and database to a local mirror, and then rdiff-backup to make a versioned backup of the mirror.
client$ mkdir -p ~/Backup/DreamHost/SITE1/mirror/db #Create the folder and any missing parent folders
client$ mkdir ~/Backup/DreamHost/SITE1/mirror/www
client$ mkdir ~/Backup/DreamHost/SITE1/history
client$ touch ~/Backup/DreamHost/SITE1/pull.sh
client$ chmod +x ~/Backup/DreamHost/SITE1/pull.sh
client$ nano ~/Backup/DreamHost/SITE1/pull.sh
~/Backup/DreamHost/SITE1/pull.sh
#!/bin/bash
# Print a timestamped banner so each stage is easy to find in the log.
function announce {
    echo " "
    echo " "
    echo ------------------------------------------------------------
    echo "$*"
    date
    echo ------------------------------------------------------------
    echo " "
}
# Work from the folder that contains this script.
cd "$(dirname "$0")" || exit 1
announce "Starting up"
echo Working path:
pwd
announce "Writing remote database backup"
ssh dh-SITE1-backup db_backup/db-dump.sh
announce "Syncing files"
rsync --progress --archive --rsh=ssh dh-SITE1-backup:"~/SITE1_WEB_FOLDER" mirror/www
announce "Syncing database"
rsync --progress --archive --rsh=ssh dh-SITE1-backup:"~/db_backup/database.sql" mirror/db/database.sql
announce "Performing versioned backup"
rdiff-backup mirror history
announce "Done"
Make sure you fill in the values for SITE1 and SITE1_WEB_FOLDER according to your site's name and its path on the server.
Run pull.sh and make sure you get a full backup in mirror and a clone in history. (Future runs of the script will refresh the clone in history while saving a chain of reverse diffs in the same folder, working back from the present.)
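A backup is only useful if you can get data back out of it. The sketch below wraps two rdiff-backup operations in the classic command-line syntax: listing the stored versions and restoring a path as it was some time ago. The function names are hypothetical, and newer rdiff-backup releases may prefer a different subcommand style, so check the man page for your installed version.

```shell
#!/bin/sh
# Show which points in time can be restored from a history folder.
# Usage: list_versions HISTORY_DIR
list_versions() {
    rdiff-backup --list-increments "$1"
}

# Restore PATH_IN_HISTORY as it was AGE ago into TARGET.
# AGE uses rdiff-backup's interval format, for example 3D for three days.
# Usage: restore_version AGE PATH_IN_HISTORY TARGET
restore_version() {
    rdiff-backup --restore-as-of "$1" "$2" "$3"
}
```

For example, from ~/Backup/DreamHost/SITE1 you could run list_versions history, then restore_version 3D history/www restored-www to get the web files as they were three days ago in a new folder named restored-www.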
Since this script will be running from a cron task, it's good to create a wrapper script that redirects all output to a log file, so you can see what went wrong if it fails.
client$ touch ~/Backup/DreamHost/SITE1/run.sh
client$ chmod +x ~/Backup/DreamHost/SITE1/run.sh
client$ nano ~/Backup/DreamHost/SITE1/run.sh
~/Backup/DreamHost/SITE1/run.sh
#!/bin/bash
# Run pull.sh from its own folder and send all of its output to backup.log.
cd "$(dirname "$0")" || exit 1
./pull.sh &>backup.log
Run that file to make sure you get a log file.
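The wrapper above overwrites backup.log on every run and does not record whether pull.sh succeeded. If you would rather keep that information, here is a sketch of one alternative. The function name run_logged is hypothetical, and appending to the log instead of overwriting it is my own choice, not what the wrapper above does.

```shell
#!/bin/sh
# Run a command with stdout and stderr appended to LOG_FILE, then record
# its exit status so a failed night is easy to spot.
# Usage: run_logged LOG_FILE COMMAND [ARGS...]
run_logged() {
    log_file=$1
    shift
    rc=0
    "$@" >>"$log_file" 2>&1 || rc=$?
    echo "exit status: $rc" >>"$log_file"
    return "$rc"
}
```

Inside run.sh, the last line could then become run_logged "backup-$(date +%F).log" ./pull.sh, which gives one log file per day.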
Repeat this entire step for any additional sites you have if you are backing up more than one site.
The last step is to create a wrapper of the wrapper scripts to run all of them from a single scheduled cron task.
client$ touch ~/Backup/DreamHost/run-all.sh
client$ chmod +x ~/Backup/DreamHost/run-all.sh
client$ nano ~/Backup/DreamHost/run-all.sh
~/Backup/DreamHost/run-all.sh
#!/bin/bash
# Run each site's wrapper in turn, relative to this script's folder.
"$(dirname "$0")/SITE1/run.sh"
"$(dirname "$0")/SITE2/run.sh"
# etc.
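Listing each site by hand is fine for two sites. As a sketch of an alternative, the function below discovers every site folder that contains an executable run.sh, assuming the folder layout used in this article. The name run_all_sites and the summary line it prints are hypothetical.

```shell
#!/bin/sh
# Run the run.sh wrapper of every site folder under BASE_DIR, one after
# another, and keep going if one site fails.
# Usage: run_all_sites BASE_DIR
run_all_sites() {
    base_dir=$1
    failures=0
    for script in "$base_dir"/*/run.sh; do
        # Skip folders without an executable run.sh (this also covers
        # the case where the pattern matched nothing).
        [ -x "$script" ] || continue
        "$script" || failures=$((failures + 1))
    done
    echo "sites failed: $failures"
    [ "$failures" -eq 0 ]
}
```

With this in run-all.sh, the body would shrink to a single call such as run_all_sites "$(dirname "$0")", and a new site is picked up as soon as its folder has a run.sh.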
Now pick a random time early in the morning; it's probably better not to pick exactly '00' or '30' minutes past the hour. I chose 03:13 in the client's local time. Create a cron task:
client$ crontab -e
13 3 * * * /home/MY_USERNAME/Backup/DreamHost/run-all.sh
Make sure you fill in MY_USERNAME. For the exact syntax of the time specifier fields in the crontab file, see the man page for the file format (man 5 crontab).
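If you would rather let the computer pick the random minute, here is a small sketch that prints a crontab line for a daily job at a given hour, avoiding minutes 00 and 30. The function name daily_cron_line is hypothetical; the five time fields follow the standard crontab order of minute, hour, day of month, month, and day of week.

```shell
#!/bin/sh
# Print a crontab line that runs COMMAND once a day at HOUR, at a random
# minute that is neither :00 nor :30.
# Usage: daily_cron_line HOUR COMMAND
daily_cron_line() {
    hour=$1
    command_path=$2
    # Pick one of the 58 allowed minutes: 1..29 or 31..59.
    minute=$(awk 'BEGIN { srand(); m = int(rand() * 58) + 1; if (m >= 30) m++; print m }')
    # Fields: minute hour day-of-month month day-of-week command
    printf '%s %s * * * %s\n' "$minute" "$hour" "$command_path"
}
```

Running daily_cron_line 3 /home/MY_USERNAME/Backup/DreamHost/run-all.sh prints a line you can paste into the editor opened by crontab -e.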
Now you should be all set. Final test: wait until tomorrow and check to make sure it ran correctly on schedule.