Proper backups are like eating your vegetables -- we all say we'll do it and that it is a good idea, but it is so much easier NOT to do it and eat Oreo cookies instead. Then you wake up one day, are 25 years old and are a really picky eater and annoy your boyfriend because you won't go eat at the Indian place he loves that doesn't have a menu but only serves vegetarian stuff that scares you. And the people at Subway give you dirty looks when you tell them you don't want anything on your sandwich. Don't risk losing your website because you didn't bother backing up.
Update: I posted a video tutorial that walks through all of these steps here. I still recommend reading through this page, because the video tutorial assumes that you will be following these steps.
This is a tutorial for creating an automated backup system for (mt) Media Temple's (gs) Grid Service. Although it will almost certainly work on other servers and configurations, it is written for users on the Grid who want an easy way to do automated backups. I personally feel most comfortable having my most important files backed up offsite, so I use Amazon's S3 service. S3 is fast, super cheap (you only pay for what you use) and reliable. I use S3 to store my website backups and my most important computer files. I spend about $1.50 a month, and that is for nearly 10 GB of storage.
You can alter the script to simply store the data in a separate location on your server (where you can then just FTP or SSH in and download the compressed archives), but this process assumes that you are using both the (gs) and S3.
This tutorial assumes that you know how to log in to your (gs) via SSH, using either the Terminal in OS X or Linux or PuTTY for Windows. If SSH is still confusing, check out (mt)'s Knowledge Base article and take a deep breath. It looks scarier than it really is.
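For reference, connecting from the Terminal looks something like this (the username and server number below are placeholders; use the exact access details shown in your (mt) control panel):
# placeholder values -- substitute your own SSH user and server number
ssh your-ssh-user@sxxxxx.gridserver.com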
Acknowledgements
I would be remiss if I didn't give a GIGANTIC shout-out to David at Stress Free Zone and Paul Stamatiou (I met Paul at the Tweet-up in March), who both wrote great guides to backing stuff up server-side to S3. I blatantly stole from both of them and rolled my own script that is a combination of the two. Seriously, thank you both for your awesome articles.
Furthermore, none of this would even be possible without the brilliant S3Sync Ruby utility.
Installing S3Sync
Although PHP and Perl scripts exist to connect with the S3 servers, the Ruby solution that the S3Sync dudes created is much, much better.
The (gs) already has Ruby on it (version 1.8.5 as of this writing), which is up-to-date enough for S3Sync.
OK, so log in to your (gs) via SSH. My settings (and the defaults for (gs), I assume) place you in the .home directory as soon as you log in. Once you are at the command line, type in the following command:
wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
This will download the latest S3Sync tarball to your .home folder. Next, extract it:
tar xvzf s3sync.tar.gz
This uncompresses the archive to its own directory.
rm s3sync.tar.gz
cd s3sync
mkdir certs
cd certs
wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar
sh ssl.certs.shar
cd ..
mkdir s3backup
That will delete the compressed archive, make a directory for certificates (certs), download an SSL certificate generator script, execute that script and create a backup directory within the s3sync directory called "s3backup."
Now, all you need to do is edit two files in your newly created s3sync folder. You can use TextEdit, TextMate, NotePad or any other text editor to edit these files. You are only going to be changing a few of the values.
I edited the files via Transmit, but you can use vi straight from the command line if you are comfortable.
The first file you want to edit is called s3config.yml.sample.
You want to edit that file so that the aws_access_key and aws_secret_access_key fields correspond to those from your S3 account. You can find those in the Access Information area after logging into Amazon.com's Web Services page.
Make sure that ssl_cert_dir: has the following value (if you created your s3sync folder in the .home directory):
/home/xxxxx/users/.home/s3sync/certs
where xxxxx is your server number.
You can get your entire access path by typing in
pwd
at the command line.
Save that file as s3config.yml.
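For reference, the finished file ends up looking something like this. The key values and server number are placeholders, and you should keep the exact field names that your copy of s3config.yml.sample already uses:
# s3config.yml -- example layout only; substitute your own keys and server number
aws_access_key: YOUR-ACCESS-KEY-HERE
aws_secret_access_key: YOUR-SECRET-KEY-HERE
ssl_cert_dir: /home/xxxxx/users/.home/s3sync/certs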
The next step is something I had to do in order to get the S3 part of the script to connect. It may not be required for all server setups, but it was for the (gs).
Edit the s3config.rb file so that the area that says
confpath = [xxxxx]
looks like this
confpath = ["./", "#{ENV['S3CONF']}", "#{ENV['HOME']}/.s3conf", "/etc/s3conf"]
Writing the backup script (or editing mine)
OK, that was the hard part. The rest is pretty simple.
I created the following backup script, called "backup_server.sh". This script will back up the contents of the domain directories you specify (because if you are like me, some of your domain folders are really just symlinks) and all of your MySQL databases. It will then upload each directory and database in its own compressed archive to the S3 Bucket of your choice. Bucket names are unique across all of S3, so create a Bucket specific to your website using the S3Fox tool, Transmit or another S3 manager.
This is the content of the script:
#!/bin/sh
# A list of website directories to back up
websites="site1.com site2.com site3.com"
# The destination directory to backup the files to
destdir=/home/xxxxx/users/.home/s3sync/s3backup
# The directory where all website domain directories reside
domaindir=/home/xxxxx/users/.home/domains
# The MySQL database hostname
dbhost=internal-db.sxxxxx.gridserver.com
# The MySQL database username - requires read access to databases
dbuser=dbxxxxx
# The MySQL database password
dbpassword=xxxxxxx
echo `date` ": Beginning backup process..." > $destdir/backup.log
# remove old backups
rm $destdir/*.tar.gz
# backup databases
for dbname in `echo 'show databases;' | /usr/bin/mysql -h $dbhost -u$dbuser -p$dbpassword`
do
    if [ $dbname != "Database" ]; then
        echo `date` ": Backing up database $dbname..." >> $destdir/backup.log
        /usr/bin/mysqldump --opt -h $dbhost -u$dbuser -p$dbpassword $dbname > $destdir/$dbname.sql
        tar -czf $destdir/$dbname.sql.tar.gz $destdir/$dbname.sql
        rm $destdir/$dbname.sql
    fi
done
# backup web content
echo `date` ": Backing up web content..." >> $destdir/backup.log
for website in $websites
do
    echo `date` ": Backing up website $website..." >> $destdir/backup.log
    tar -czf $destdir/$website.tar.gz $domaindir/$website
done
echo `date` ": Backup process complete." >> $destdir/backup.log
# The directory where s3sync is installed
s3syncdir=/home/xxxxx/users/.home/s3sync
# The directory where the backup archives are stored
backupdir=/home/xxxxx/users/.home/s3sync/s3backup
# The S3 bucket a.k.a. directory to upload the backups into
s3bucket=BUCKET-NAME
cd $s3syncdir
./s3sync.rb $backupdir/ $s3bucket:
For (mt) Media Temple (gs) Grid Service users, you just need to change the "site1.com" values to your own domains (you can do as many as you want), substitute your server number everywhere marked "xxxxx" (again, you can find this by entering "pwd" at the command line) and fill in your database password (which is visible in the (mt) control panel under the "Database" module).
Make sure you change the value at the end of the script that says "BUCKET-NAME" to the name of the S3 Bucket you want to store your backups in.
Now that you have edited the script, upload it to your /data directory.
Change the permissions to 755, either via SSH:
chmod a+x backup_server.sh
or by using your FTP client.
Now, test the script.
At the command line, type:
cd data
./backup_server.sh
And watch the magic. Assuming everything was entered correctly, an archived version of each of your domain directories and each of your MySQL databases will be put in the "s3backup" folder and then uploaded directly to your S3 Bucket. The next time you run the script, the old backup files will be replaced.
Check to make sure that the script is working the way you want it to work.
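If you want to spot-check the upload from the same SSH session, the bundled s3cmd.rb script can list the contents of a bucket (assuming your copy of the package includes it):
# list everything currently stored in the backup bucket
cd /home/xxxxx/users/.home/s3sync
./s3cmd.rb list BUCKET-NAME: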
Automate the script
You can either run the script manually from the command line or set it to run automatically. I've set mine to run each night at midnight. To set up the cron job, just click on the Cron Jobs button in the (mt) Admin area
and set your parameters. The path for your script is /home/xxxxx/data/backup_server.sh.
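If you prefer to set the job up from the shell instead of the (mt) Admin area (and your account allows it), the equivalent entry for a nightly run at midnight would look something like this; run crontab -e and add a line like the one below (a sketch; the path assumes you uploaded the script to /data as described above):
# run the backup script every night at midnight
0 0 * * * /home/xxxxx/data/backup_server.sh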
Enjoy your backups!
One note: the compressed domain archives retain their entire directory structure. As such, they contain a .home directory that may not appear in Finder or Windows Explorer unless you have invisible or hidden files turned on. Don't worry; all your data is still in those archives.
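If you ever need to pull one of these archives back down to the server, the bundled s3cmd.rb script can fetch a single file from the bucket. A sketch, reusing the placeholder bucket name and paths from above:
# download one archive from S3 and extract it
cd /home/xxxxx/users/.home/s3sync
./s3cmd.rb get BUCKET-NAME:site1.com.tar.gz /home/xxxxx/users/.home/s3sync/s3backup/site1.com.tar.gz
tar -xzf /home/xxxxx/users/.home/s3sync/s3backup/site1.com.tar.gz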
Update (7/27/2008):
If you are getting an error that says something like
Permanent redirect received. Try setting AWS_CALLING_FORMAT to SUBDOMAIN
Add the following line to your s3config.yml file:
AWS_CALLING_FORMAT: SUBDOMAIN
The error is either because your bucket is in the EU or there is something else funky with its URL structure. Changing that value should allow the script to perform as intended.
Originally published at www.ChristinaWarren.com. You can comment here or there.