Using Amazon S3 to back up Media Temple’s Grid (gs)

Jun 24, 2008 18:00


Proper backups are like eating your vegetables -- we all say we'll do it and that it is a good idea, but it is so much easier NOT to do it and eat Oreo cookies instead. Then you wake up one day, are 25 years old and are a really picky eater and annoy your boyfriend because you won't go eat at the Indian place he loves that doesn't have a menu but only serves vegetarian stuff that scares you. And the people at Subway give you dirty looks when you tell them you don't want anything on your sandwich. Don't risk losing your website because you didn't bother backing up.

Update: I posted a video tutorial that walks through all of these steps here. I still recommend reading through this page because the video tutorial assumes that you will be following these steps.

This is a tutorial for creating an automated back-up system for (mt) Media Temple's (gs) Grid Service. Although it will almost certainly work on other servers and configurations, it is written for users on the Grid who want an easy way to do automated backups. I personally feel most comfortable having my most important files backed up offsite, so I use Amazon's S3 service. S3 is fast, super cheap (you only pay for what you use) and reliable. I use S3 to store my website backups and my most important computer files. I spend about $1.50 a month, and that is for nearly 10 GB of storage.

You can alter the script to simply store the data in a separate location on your server (where you can then just FTP or SSH in and download the compressed archive), but this process assumes that you are using both the (gs) and S3.

This tutorial assumes that you know how to log in to your (gs) via SSH, using either the Terminal in OS X or Linux or PuTTY for Windows. If SSH is still confusing, check out (mt)'s Knowledge Base article and take a deep breath. It looks scarier than it really is.
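For reference, connecting from the Terminal looks something like the line below. The username and hostname here are only placeholders following the usual xxxxx convention used throughout this tutorial; use the SSH access details shown in your own (mt) control panel.

ssh xxxxx@sxxxxx.gridserver.com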
Acknowledgements

I would be remiss if I didn't give a GIGANTIC shout-out to David at Stress Free Zone and Paul Stamatiou (I met Paul at the Tweet-up in March) who both wrote great guides to backing stuff up server side to S3. I blatantly stole from both of them and rolled my own script that is a combination of the two. Seriously, thank you both for your awesome articles.

Furthermore, none of this would even be possible without the brilliant S3Sync Ruby utility.
Installing S3Sync

Although PHP and Perl scripts exist to connect with the S3 servers, the Ruby solution that the S3Sync dudes created is much, much better.

The (gs) already has Ruby on it (version 1.8.5 as of this writing), which is up-to-date enough for S3Sync.
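If you want to double-check which version your own (gs) is running before going any further, this one-liner at the SSH prompt will tell you:

ruby -v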

OK, so log in to your (gs) via SSH. My settings (and the defaults for (gs), I assume) place you in the .home directory as soon as you log in.

Once you are at the command line, type in the following command:

wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
This will download the latest S3Sync tarball to your .home folder.

tar xvzf s3sync.tar.gz
This uncompresses the archive to its own directory.

rm s3sync.tar.gz
cd s3sync
mkdir certs
cd certs
wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar
sh ssl.certs.shar
cd ..
mkdir s3backup
Those commands delete the compressed archive, move into the new s3sync directory, make a directory for certificates (certs), download an SSL certificate generator script, execute that script, and create a backup directory within the s3sync directory called "s3backup."

Now, all you need to do is edit two files in your newly created s3sync folder. You can use TextEdit, TextMate, NotePad or any other text editor to edit these files. You are only going to be changing a few of the values.

I edited the files via Transmit, but you can use vi straight from the command line if you are comfortable.

The first file you want to edit is called s3config.yml.sample

You want to edit that file so that the aws_access_key and aws_secret_access_key fields correspond to those from your S3 account. You can find those in the Access Information area after logging into Amazon.com's Web Services page.

Make sure that ssl_cert_dir: has the following value (if you created your s3sync folder in the .home directory):
/home/xxxxx/users/.home/s3sync/certs where xxxxx is your server number.

You can get your entire access path by typing in
pwd
at the command line.
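When you are done, the relevant part of the file should look roughly like the sketch below. The values are placeholders, and the key names shown are the ones from the sample file at the time of writing; if your copy of s3config.yml.sample uses slightly different names, keep its names and just fill in your own values:

aws_access_key_id: YOURACCESSKEYID
aws_secret_access_key: YOURSECRETACCESSKEY
ssl_cert_dir: /home/xxxxx/users/.home/s3sync/certs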

Save that file as s3config.yml

The next step is something I had to do in order to get the S3 part of the script to connect. It may not be required for all server set-ups, but it was for the (gs).

Edit the s3config.rb file so that the area that says
confpath = [xxxxx]
looks like this:
confpath = ["./", "#{ENV['S3CONF']}", "#{ENV['HOME']}/.s3conf", "/etc/s3conf"]
Writing the backup script (or editing mine)

OK, that was the hard part. The rest is pretty simple.

I created the following backup script, called "backup_server.sh." This script will back up the contents of the domain directories you specify (because if you are like me, some of your domain folders are really just symlinks) and all of your MySQL databases. It will then upload each directory and database in its own compressed archive to the S3 Bucket of your choice. Bucket names are unique, so create a Bucket specific to your website using the S3Fox tool, Transmit or another S3 manager.
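If you would rather stay in the terminal, the s3sync download also ships with a companion script, s3cmd.rb, which can create a bucket from the command line. Something along these lines should do it (run from inside your s3sync directory, with BUCKET-NAME replaced by your own unique bucket name):

cd s3sync
./s3cmd.rb createbucket BUCKET-NAME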

This is the content of the script:

#!/bin/sh

# A list of website directories to back up
websites="site1.com site2.com site3.com"

# The destination directory to backup the files to
destdir=/home/xxxxx/users/.home/s3sync/s3backup

# The directory where all website domain directories reside
domaindir=/home/xxxxx/users/.home/domains

# The MySQL database hostname
dbhost=internal-db.sxxxxx.gridserver.com

# The MySQL database username - requires read access to databases
dbuser=dbxxxxx

# The MySQL database password
dbpassword=xxxxxxx

echo `date` ": Beginning backup process..." > $destdir/backup.log

# remove old backups
rm $destdir/*.tar.gz

# backup databases
for dbname in `echo 'show databases;' | /usr/bin/mysql -h $dbhost -u$dbuser -p$dbpassword`
do
  if [ $dbname != "Database" ]; then
    echo `date` ": Backing up database $dbname..." >> $destdir/backup.log
    /usr/bin/mysqldump --opt -h $dbhost -u$dbuser -p$dbpassword $dbname > $destdir/$dbname.sql
    tar -czf $destdir/$dbname.sql.tar.gz $destdir/$dbname.sql
    rm $destdir/$dbname.sql
  fi
done

# backup web content
echo `date` ": Backing up web content..." >> $destdir/backup.log

for website in $websites
do
  echo `date` ": Backing up website $website..." >> $destdir/backup.log
  tar -czf $destdir/$website.tar.gz $domaindir/$website
done

echo `date` ": Backup process complete." >> $destdir/backup.log

# The directory where s3sync is installed
s3syncdir=/home/xxxxx/users/.home/s3sync

# The directory where the backup archives are stored
backupdir=/home/xxxxx/users/.home/s3sync/s3backup

# The S3 bucket a.k.a. directory to upload the backups into
s3bucket=BUCKET-NAME

cd $s3syncdir
./s3sync.rb $backupdir/ $s3bucket:
For (mt) Media Temple (gs) Grid Service users, you just need to change the "site1.com" values to your own domains (you can do as many as you want), substitute all the places marked "xxxxx" with your server number (again, you can find this by entering "pwd" at the command line), and fill in your database username and password (both visible in the (mt) control panel under the "Database" module).

Make sure you change the value at the end of the script that says "BUCKET-NAME" to the name of the S3 Bucket you want to store your backups in.

Now that you have edited the script, upload it to your /data directory.
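If you would rather upload it from the command line instead of an FTP client, scp can do it in one shot. The login and paths below are placeholders following the same xxxxx convention as the script; adjust them for your own account:

scp backup_server.sh xxxxx@sxxxxx.gridserver.com:/home/xxxxx/data/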

Make the script executable. You can do this via SSH:
chmod a+x backup_server.sh
or by setting the permissions to 755 with your FTP client.
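One optional extra step: since the script contains your database password in plain text, you may want to tighten the permissions so that only your own user can read and run it. Something like this should work instead of 755:

chmod 700 backup_server.sh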

Now, test the script.

In the command line type this in:

cd data
./backup_server.sh

And watch the magic. Assuming everything was entered correctly, an archived version of each of your domain directories and each of your MySQL databases will be put in the "s3backup" folder and then uploaded directly to your S3 Bucket. The next time you run the script, the old backup files will be replaced.

Check to make sure that the script is working the way you want it to work.
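A couple of quick ways to confirm that it worked (the paths below assume you installed s3sync in .home and kept the s3backup folder name from earlier): list the backup folder and read the log, then peek at your bucket with S3Fox or Transmit to make sure the archives arrived.

ls -lh ~/s3sync/s3backup
cat ~/s3sync/s3backup/backup.log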

Automate the script

You can either run the script manually from the command line or set it to run automatically. I've set mine to run each night at midnight. To set up the cron job, just click on the Cron Jobs button in the (mt) Admin area:



[Screenshot: the Cron Jobs button in the (mt) Admin area]

and set your parameters. The path for your script is /home/xxxxx/data/backup_server.sh.
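If you prefer doing this from the shell instead of the control panel, the equivalent crontab entry for a nightly run at midnight (using the same /data path) would look something like this:

0 0 * * * /home/xxxxx/data/backup_server.sh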

Enjoy your backups!

One note: the compressed domain archives retain their entire directory structure, so there is a .home directory inside them that may not appear in Finder or Windows Explorer unless you have invisible or hidden files turned on. Don't worry -- all of your data is still retained in those archives.
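If you want to double-check without changing any hidden-file settings, listing an archive's contents from the command line shows everything, hidden directories included (swap in one of your own archive names):

tar -tzf site1.com.tar.gz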

Update (7/27/2008):
If you are getting an error that says something like
Permanent redirect received. Try setting AWS_CALLING_FORMAT to SUBDOMAIN

Add the following line to your s3config.yml file:
AWS_CALLING_FORMAT: SUBDOMAIN

The error is either because your bucket is in the EU or there is something else funky with its URL structure. Changing that value should allow the script to perform as intended.

Originally published at www.ChristinaWarren.com.

