I have recently worked out a systematic way to back up my website. In particular, I need to back up
~/web (partially accessible to the public) and ~/web_private (inaccessible to the public). Here's how I have done it. I hope the method is useful to you too.
It is assumed that you use a MySQL database. Decide on a time (10:58 am daily in this example) for the automatic backup. Schedule it by invoking crontab -e and adding the following line:
58 10 * * * mysqldump -u username -ppassword -h hostname database_name | bzip2 - > ~/web_private/dump.sql.bz2
This job creates a dump file compressed with bzip2. Note that it is advisable to back up the database when no cron job for the Drupal modules is running, so don't choose a backup time that overlaps with those cron jobs. The backup frequency (e.g., weekly, daily, or hourly) is up to you.
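The cron line writes the dump but never checks that it is usable. Here is a minimal sketch of such a check, assuming bzip2 and grep are available on the server. The function name verify_dump is made up for illustration and is not part of the setup described here.

```shell
#!/bin/sh
# verify_dump FILE: succeed only if FILE is an intact bzip2 archive that
# decompresses to something non-empty. (Hypothetical helper.)
verify_dump() {
    # -t tests the archive's integrity without writing any output.
    bzip2 -t "$1" 2>/dev/null || return 1
    # If mysqldump fails, bzip2 can still leave a valid archive of nothing,
    # so also require at least one non-empty line after decompression.
    bzip2 -dc "$1" 2>/dev/null | grep -q .
}
```

It could be called after the dump has been written, for example: verify_dump ~/web_private/dump.sql.bz2 || echo "dump looks broken" >&2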
The username, password, hostname, and database_name above are the ones used to access the Drupal database. These values can be found in the settings.php file of your Drupal site. A script that parses settings.php for you is available from Drupal:
[~/web_private] $ wget http://cvs.drupal.org/viewcvs/*checkout*/drupal/contributions/sandbox/drumm/tools/drupalsqldump.sh
[~/web_private] $ chmod +x drupalsqldump.sh
With the script, the cron job line should look like:
58 10 * * * ~/web_private/drupalsqldump.sh ~/web/main/sites/domain_name/settings.php | bzip2 - > ~/web_private/dump.sql.bz2
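To give an idea of what parsing settings.php involves: this assumes your settings.php stores the connection details as a single $db_url value of the form mysql://username:password@hostname/database_name. The sketch below splits such a URL with plain shell parameter expansion. It is not the drupalsqldump.sh script itself, and the function name parse_db_url and the DB_* variable names are hypothetical. It also assumes the URL contains no URL-encoded characters, since nothing is decoded.

```shell
#!/bin/sh
# parse_db_url URL: split mysql://user:pass@host/db into DB_USER, DB_PASS,
# DB_HOST and DB_NAME. (Hypothetical helper, for illustration only.)
parse_db_url() {
    rest=${1#*://}        # strip the scheme: user:pass@host/db
    creds=${rest%@*}      # everything before the last '@': user:pass
    hostdb=${rest##*@}    # everything after the last '@': host/db
    DB_USER=${creds%%:*}  # up to the first ':'
    DB_PASS=${creds#*:}   # after the first ':'
    DB_HOST=${hostdb%%/*} # up to the first '/'
    DB_NAME=${hostdb#*/}  # after the first '/'
}
```

After calling parse_db_url with the URL, the values can be passed on as in the first cron line: mysqldump -u "$DB_USER" -p"$DB_PASS" -h "$DB_HOST" "$DB_NAME"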
It is good practice to back up important data to another machine at another location. I have written a script, backup_web.sh, that uses rsync to transfer important data from a remote server to a local machine:
#!/bin/sh
# Back up my website, which is powered by Drupal
# - Mirror only the important files

# Configurations
#----------------------------------------------------------------
# Directories to back up on the remote server. Separate with a space.
# Directory names with space characters are not properly handled yet!
# DON'T include trailing slash.
USERNAME=username
HOST=domain_name_or_IP_address
SOURCES='~/web ~/web_private'
# Files to be excluded from the backup. Separate with a space.
EXCLUDES='*~ files/tex/'
# Directory to back up to. DON'T include trailing slash.
TARGET=some_local_directory

# Backup Operations
#----------------------------------------------------------------
# Disable the headache filename expansion temporarily.
set -f
# Prepare the options for the files to be excluded.
# (Plain concatenation is used because += is not available in every /bin/sh.)
EXCLUDE_OPTIONS=''
for exclude in $EXCLUDES; do
    EXCLUDE_OPTIONS="$EXCLUDE_OPTIONS --exclude='$exclude'"
done
# The real stuff!
eval rsync -acvz --delete --delete-excluded $EXCLUDE_OPTIONS $USERNAME@$HOST:\'$SOURCES\' $TARGET
Please set the variables in the Configurations section (USERNAME, HOST, SOURCES, EXCLUDES, and TARGET) before using the script. Run chmod +x backup_web.sh to make it executable. There are a few ways to use this script. You can run it periodically on another machine used for backups. You can also modify it slightly so that it sends files to a remote machine (instead of fetching them), and run it periodically on the web server. (Take a look at my rsync quick guide if you're not familiar with rsync.)
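For the second approach, the modification might look roughly like the sketch below. It is meant to run on the web server and push to a backup host. The host name, the target directory, the function name push_backup, and the RSYNC override variable are all placeholders of mine, not part of backup_web.sh. The excludes are collected as positional parameters, so no eval is needed.

```shell
#!/bin/sh
# Push variant (sketch): run on the web server, send files to a backup host.
USERNAME=username
HOST=backup_host_name_or_IP_address
# Local directories now, so use $HOME (a quoted ~ would not be expanded).
SOURCES="$HOME/web $HOME/web_private"
EXCLUDES='*~ files/tex/'
TARGET=some_remote_directory
RSYNC=${RSYNC:-rsync}    # hypothetical hook: point it at another command to test

push_backup() {
    set -f               # keep patterns such as *~ from being expanded
    set --               # collect the exclude options in "$@"
    for exclude in $EXCLUDES; do
        set -- "$@" "--exclude=$exclude"
    done
    # $SOURCES is split on spaces on purpose; directory names containing
    # spaces are still not handled.
    "$RSYNC" -acvz --delete --delete-excluded "$@" $SOURCES "$USERNAME@$HOST:$TARGET"
    status=$?
    set +f
    return $status
}
```

Because --delete removes files on the receiving side, it may be worth adding rsync's -n (--dry-run) option for the first run to see what would be transferred.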
Then, choose a proper backup time and schedule an automatic backup with crontab -e by adding the following line:
08 11 * * * path_to_the_script/backup_web.sh >> ~/backup_web.log
This job appends its output to a log file. (Omit the redirection if you don't want a log.) I recommend running this job shortly after the Drupal database backup. In this example it runs 10 minutes later. Feel free to adjust the backup frequency to suit your needs.
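Since the two jobs are only linked by their start times, it can be useful to confirm that the mirrored dump really is recent. Here is a small sketch, assuming a find that supports -mmin (GNU and BSD find do, but it is not in POSIX). The function name dump_is_fresh is hypothetical.

```shell
#!/bin/sh
# dump_is_fresh FILE MINUTES: succeed if FILE exists and was modified
# less than MINUTES minutes ago. (Hypothetical helper.)
dump_is_fresh() {
    [ -f "$1" ] || return 1
    # find prints the file name only when the -mmin test matches.
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}
```

On the backup machine this could be run after the transfer, for example: dump_is_fresh some_local_directory/web_private/dump.sql.bz2 30 || echo "dump is stale" >> ~/backup_web.log. The -a option of rsync preserves modification times, so the check reflects when the dump was written on the server, provided the two clocks roughly agree.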
Special credit goes to xman for helping me to solve several tough problems when programming the backup_web.sh script.