Linux: Backup Strategies
From ReceptiveIT
What is a Backup?
So, what is a backup? In simple terms, it is a complete, separate copy of all your important data. In real terms, a backup is only one string in your bow of risk mitigation; other measures, such as RAID, are outside the scope of this article.
All of us know that we need to do backups, but because we don't need them just yet, they are often overlooked until it is too late.
In most real-world scenarios, a backup runs once every 24 hours during the work week. This means that if the computer happens to catch fire, we would lose at most 24 hours' worth of work. That, for most people, is an acceptable risk.
Where do you keep your backups? Some people will buy a separate hard disk to perform the backups, but if the fire happens to spread and burn down the whole building, then you have lost your data and your backup in one hit. Most real-world setups therefore include an off-site component: either a staff member takes some tapes home or, as in this article, the data is automatically synchronised over the internet to another location.
In this example, we will have the following backup schedule:
- Hourly, we will synchronise our important data to an off-site computer
- Daily, we will back up all our data to a separate hard drive that is permanently attached, but not part of the same RAID array
- Weekly, we will archive a single backup to an archive server
Hourly
The Problem
- We need to create a backup script that will take a list of important directories and copy them to an off-site server.
- We need to make sure that the script isn't running more than once on the same machine
- We need to make sure that people know when the script has succeeded, or failed.
- Some directories cannot be backed up without either stopping the services, or taking a snapshot of the filesystem.
The Solution
- I will create a BASH script that will automate the task of backing up, but keep the script generic with all configuration in separate configuration files.
- I will make sure that the script runs as a single endless loop, initiated with Upstart, so that one sync cannot overlap the next (you could also use inittab if your distribution does not have Upstart)
- I will build email notifications into the script
- I will use LVM to take a snapshot of the path in question before backing up the data, so that the data doesn't change halfway through the backup process
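The script in this article relies on Upstart and its own loop to keep a single copy running. Purely as an illustration of the "not running more than once" requirement, here is a minimal sketch of a different technique: a lock guard built on flock(1) from util-linux. It is not part of the script that follows; the function name acquire_lock and the lock file path are assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical lock guard; NOT part of the article's sync script.
# Assumes flock(1) from util-linux is installed.

acquire_lock () {
    # Open the lock file on file descriptor 9 and keep it open for the
    # lifetime of this shell; the lock is released when the shell exits.
    exec 9>"$1" || return 1
    # -n: fail immediately instead of waiting if the lock is already held
    flock -n 9
}

# Example use at the top of a script (the path is an assumption):
# acquire_lock /var/lock/syncserver.lock || { echo "already running" >&2; exit 1; }
```

With a guard like this, a second copy started by hand would exit straight away instead of syncing in parallel with the first.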
Script Dependencies
The Disaster Recovery Server
Directories
- I will assume that you have a directory called /netsync and that it has sufficient space to receive a backup. The module below points at /netsync/server-backup, so create that subdirectory as well.
Packages
apt-get install rsync
Configuration Files
vi /etc/rsyncd.conf
# GLOBAL OPTIONS
#motd file=/etc/motd
#log file=/var/log/rsyncd
# for pid file, don't use /var/run/rsync.pid unless you're not going to run
# rsync out of the init.d script. The /var/run/rsyncd.pid below is OK.
pid file=/var/run/rsyncd.pid
#syslog facility=daemon
#socket options=
# Netsync module
[server-backup]
comment = Location for off-site backup
path = /netsync/server-backup
hosts allow = 192.168.10.10
read only = no
list = yes
uid=0
gid=0
dont compress = *.gz *.tgz *.zip *.z *.rpm *.deb *.iso *.bz2 *.tbz
vi /etc/default/rsync
Set RSYNC_ENABLE=true
/etc/init.d/rsync start
Hooray! You now have a listening rsync server.
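Before pointing the office server at the daemon, it can be worth confirming that the module name is the one you expect. The sketch below is one way to do that: it prints the section names found in an rsyncd.conf file. The function name list_rsync_modules is an assumption made for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the module names defined in an rsyncd.conf file.

list_rsync_modules () {
    # Module sections look like "[name]" on a line of their own
    sed -n 's/^[[:space:]]*\[\([^]]*\)][[:space:]]*$/\1/p' "$1"
}

# On the disaster recovery server:
#   list_rsync_modules /etc/rsyncd.conf
# From the office server, the daemon itself can be asked for its module list
# (the module above sets "list = yes", so it will be shown):
#   rsync offsite.domain.com::
```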
The Office Server
Assumptions
- Linux distribution is Ubuntu Server 10.04 64-bit
- Filesystem is XFS
- Filesystems are stored in LVM containers
- There is at least 15G of unallocated space in the LVM volume group
Directories
mkdir -p /etc/backup
mkdir -p /var/log/backup
mkdir -p /mnt/sync-snapshot
Packages
apt-get install rsync bsd-mailx gawk
Configuration Files
vi /etc/backup/sync.conf
SYNC_NOTIFY="monitoring@domain.com"
SYNC_SOURCES="/etc/ /var/lib/cyrus/ /var/spool/cyrus/ /usr/local/bin/"
SYNC_TARGETS="offsite.domain.com"
SYNC_MODULE="server-backup"
SYNC_FREQUENCY="3600"
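A typo in this file only shows up once the sync script is running, so a quick pre-flight check can save a round of failure emails. The following is a minimal sketch, assuming the simple case of one shared configuration file with no per-target overrides; the function name check_sync_conf and the choice of which variables to treat as required are assumptions.

```shell
#!/bin/bash
# Hypothetical pre-flight check for the sync configuration file.

check_sync_conf () {
    # Source the file in a subshell so its variables do not leak into the caller
    (
        . "$1" || exit 1
        missing=0
        for name in SYNC_SOURCES SYNC_TARGETS SYNC_MODULE
        do
            # ${!name} is bash indirect expansion: the value of the variable named by $name
            if [ -z "${!name}" ]
            then
                echo "missing: $name"
                missing=1
            fi
        done
        exit $missing
    )
}

# Example:
#   check_sync_conf /etc/backup/sync.conf && echo "configuration looks complete"
```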
Snapshot selection
Mail servers like Cyrus have files that change all the time, so we need to snapshot the Cyrus directories before making a backup. Mark each such directory with an empty .sync-snapshot file:
touch /var/lib/cyrus/.sync-snapshot
touch /var/spool/cyrus/.sync-snapshot
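To see at a glance which sources will be snapshotted and which will be copied live, the marker test can be run on its own. This is a small sketch of that check; the function names needs_snapshot and report_sources are assumptions, while the marker name matches the one used by the sync script.

```shell
#!/bin/bash
# Hypothetical helper: report which source directories carry the snapshot marker.

SNAPSHOT_MAGIC=".sync-snapshot"   # same marker name the sync script uses

needs_snapshot () {
    # True when the marker file exists in the given directory
    [ -e "${1%/}/${SNAPSHOT_MAGIC}" ]
}

report_sources () {
    local dir
    for dir in "$@"
    do
        if needs_snapshot "$dir"
        then
            echo "$dir snapshot"
        else
            echo "$dir live"
        fi
    done
}

# Example:
#   report_sources /etc/ /var/lib/cyrus/ /var/spool/cyrus/ /usr/local/bin/
```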
The Script
First create the Upstart launcher script
vi /etc/init/sync.conf
# server sync
start on mounted
stop on runlevel [01456]
respawn
exec /usr/local/bin/syncserver.sh
Then create the sync script itself
vi /usr/local/bin/syncserver.sh
#!/bin/bash
# Hotsync Script
#
defaults () {
# Defaults
EXIT_CODE=""
SNAPSHOT_MAGIC=".sync-snapshot"
SNAPSHOT_PATH="/mnt/sync-snapshot/"
SNAPSHOT_SIZE="15G"
SNAPSHOT_MOUNTOPTIONS="ro,nouuid"
SNAPSHOT_LVNAME="syncsnap"
SYNC_USERNAME=""
SYNC_PASSWORD_FILE=""
SYNC_MODULE=""
SYNC_CONFDIR="/etc/backup"
SYNC_CONF_FILE="sync.conf"
SYNC_LOGDIR="/var/log/backup"
SYNC_LOG="/var/log/backup/sync.log"
SYNC_NOTIFY="root"
SYNC_HOSTNAME=`hostname`
SYNC_FREQUENCY=3600
SYNC_INCNUM=0
SYNC_MAXINC=24
SYNC_TIMEOUT=1800
SYNC_SUCCESS_FILE="/var/log/backup/.sync-success"
}
function rsync_dereference () {
ERROR_CODE=""
case $2 in
0)
ERROR_CODE="Success. Move along, nothing to see here!"
;;
1)
ERROR_CODE="Syntax or usage error"
;;
2)
ERROR_CODE="Protocol incompatibility"
;;
3)
ERROR_CODE="Errors selecting input/output files, dirs"
;;
4)
ERROR_CODE="Requested action not supported: an attempt was made to manipulate 64-bit files on a platform that cannot support them; or an option was specified that is supported by the client and not by the server."
;;
5)
ERROR_CODE="Error starting client-server protocol"
;;
6)
ERROR_CODE="Daemon unable to append to log-file"
;;
10)
ERROR_CODE="Error in socket I/O"
;;
11)
ERROR_CODE="Error in file I/O"
;;
12)
ERROR_CODE="Error in rsync protocol data stream"
;;
13)
ERROR_CODE="Errors with program diagnostics"
;;
14)
ERROR_CODE="Error in IPC code"
;;
20)
ERROR_CODE="Received SIGUSR1 or SIGINT"
;;
21)
ERROR_CODE="Some error returned by waitpid()"
;;
22)
ERROR_CODE="Error allocating core memory buffers"
;;
23)
ERROR_CODE="Partial transfer due to error"
;;
24)
ERROR_CODE="Partial transfer due to vanished source files"
;;
25)
ERROR_CODE="The --max-delete limit stopped deletions"
;;
30)
ERROR_CODE="Timeout in data send/receive"
;;
*)
ERROR_CODE="Unknown error. Really, I don't know!"
;;
esac
ERROR_CODE="(${2}) ${ERROR_CODE}"
eval "$1=\"${ERROR_CODE}\""
}
# Populate default variables
defaults
# Datestamp of when the sync process started
date >> $SYNC_LOG
# Get configuration
if [ ! -s ${SYNC_CONFDIR}/${SYNC_CONF_FILE} ]
then
echo "Sync configuration does not exist!" >> $SYNC_LOG
cat $SYNC_LOG | mail -s"BACKUP FAILED" $SYNC_NOTIFY
exit 1
fi
# Source the backup configuration
. ${SYNC_CONFDIR}/${SYNC_CONF_FILE}
# Go Into a loop....makes sure sync doesn't overlap.
while true
do
CURRENTHOUR=`date +"%H"`
SYNCERR=0
DATE=`date +"%F %R"`
EPOCHDATE=`date +%s`
EPOCHDATE=$(($EPOCHDATE - $(($EPOCHDATE % 86400))))
TODAY=`echo | gawk 'BEGIN {print strftime("%F", ARGV[1])}' $EPOCHDATE`
EPOCHLAST=$(($EPOCHDATE - 86400))
YESTERDAY=`echo | gawk 'BEGIN {print strftime("%F", ARGV[1])}' $EPOCHLAST`
DAYOFWEEK=`date +%u`
THISSTAMP=`date +"%Y%m%d%H%M"`
EPOCHNOW=`date +%s`
SYNC_INCNUM=$(($SYNC_INCNUM + 1))
if [ $SYNC_INCNUM -gt $SYNC_MAXINC ]
then
SYNC_INCNUM=0
fi
LOG="$SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-${SYNC_INCNUM}.log"
FULL_LOG="$SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-${SYNC_INCNUM}-full.log"
echo "Full sync $DATE." | tee $LOG > $FULL_LOG
for SYNC_SOURCE in $SYNC_SOURCES
do
# Check to see if the source path needs to have a snapshot taken
if [ -e /${SYNC_SOURCE}/${SNAPSHOT_MAGIC} ]
then
echo "Creating snapshot for ${SYNC_SOURCE} before sync..." | tee -a $LOG >> $FULL_LOG
# Find mountpoint (falls back to / when no deeper mountpoint matches)
SNAPSHOT_MOUNTPOINT="/"
SNAPSHOT_DIRSPLIT=(`echo ${SYNC_SOURCE} | tr '/' ' '`)
LEN=${#SNAPSHOT_DIRSPLIT[@]}
for (( i=${#SNAPSHOT_DIRSPLIT[@]}+1; i>1; i-- ));
do
SNAPSHOT_TEST=`echo ${SYNC_SOURCE} | cut -d '/' -f1-${i}`
mountpoint -q "${SNAPSHOT_TEST}"
if [ $? -eq 0 ]
then
# Mountpoint found
SNAPSHOT_MOUNTPOINT=${SNAPSHOT_TEST}
break
fi
done
# Find device for mountpoint
SNAPSHOT_DMDEVICE=`df ${SNAPSHOT_MOUNTPOINT} | awk 'NR==2 {print $1}'`
SNAPSHOT_VGNAME=`echo ${SNAPSHOT_DMDEVICE:12} | cut -d '-' -f1`
SNAPSHOT_SOURCELV=`echo ${SNAPSHOT_DMDEVICE:12} | cut -d '-' -f2`
lvcreate -s -n ${SNAPSHOT_LVNAME} -L ${SNAPSHOT_SIZE} /dev/${SNAPSHOT_VGNAME}/${SNAPSHOT_SOURCELV}
mount -o ${SNAPSHOT_MOUNTOPTIONS} /dev/${SNAPSHOT_VGNAME}/${SNAPSHOT_LVNAME} ${SNAPSHOT_PATH}
SNAPSHOT=1
else
echo "Snapshot not required for ${SYNC_SOURCE} to sync..." | tee -a $LOG >> $FULL_LOG
SNAPSHOT=0
fi
for SYNC_TARGET in $SYNC_TARGETS
do
# Reset working variables
HOST_USERNAME=""
HOST_PASSWORD_FILE=""
HOST_MODULE=""
RSYNC_USERNAME=""
RSYNC_PASSWORD_FILE=""
RSYNC_MODULE=""
# Check to see if there is a custom config for this sync target
if [ -e /${SYNC_CONFDIR}/${SYNC_TARGET}.conf ]
then
# Source the host specific backup configuration
echo ${SYNC_CONFDIR}/${SYNC_TARGET}.conf
. ${SYNC_CONFDIR}/${SYNC_TARGET}.conf
# Populate host specific options
if [ ${HOST_USERNAME} ]
then
RSYNC_USERNAME="${HOST_USERNAME}@"
fi
if [ ${HOST_PASSWORD_FILE} ]
then
RSYNC_PASSWORD_FILE="--password-file ${HOST_PASSWORD_FILE}"
fi
if [ ${HOST_MODULE} ]
then
RSYNC_MODULE="${HOST_MODULE}"
fi
else
if [ ${SYNC_USERNAME} ]
then
RSYNC_USERNAME="${SYNC_USERNAME}@"
else
RSYNC_USERNAME=""
fi
if [ ${SYNC_PASSWORD_FILE} ]
then
RSYNC_PASSWORD_FILE="--password-file ${SYNC_PASSWORD_FILE}"
else
RSYNC_PASSWORD_FILE=""
fi
if [ ${SYNC_MODULE} ]
then
RSYNC_MODULE="${SYNC_MODULE}"
else
echo "Sync module does not exist!" >> $SYNC_LOG
cat $SYNC_LOG | mail -s"BACKUP FAILED" $SYNC_NOTIFY
exit 1
fi
fi
echo -n "Syncing $SYNC_SOURCE to $SYNC_TARGET..." | tee -a $LOG >> $FULL_LOG
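if [ $SNAPSHOT -eq 1 ]
then
# Read from the mounted snapshot. With -R, rsync sends only the path after the /./ marker.
# This path only resolves when the snapshotted filesystem is the one mounted at /.
RSYNC_SOURCE=${SNAPSHOT_PATH}/./${SYNC_SOURCE}
else
RSYNC_SOURCE=${SYNC_SOURCE}
echo
fi
# Perform Sync (from the snapshot when one was taken, otherwise from the live path)
rsync -auR -v --timeout=$SYNC_TIMEOUT --delete ${RSYNC_PASSWORD_FILE} ${RSYNC_SOURCE} ${RSYNC_USERNAME}${SYNC_TARGET}::${RSYNC_MODULE}/${SYNC_HOSTNAME} >> $FULL_LOG 2>&1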
# Catch errors
EXIT_CODE=$?
if [ ${EXIT_CODE} -ne 0 ]
then
# Dereference Rsync error code
rsync_dereference EXIT_CODE ${EXIT_CODE}
echo "ERROR ${EXIT_CODE}" | tee -a $LOG >> $FULL_LOG
SYNCERR=1
else
echo "OK" | tee -a $LOG >> $FULL_LOG
fi
done
# Unmount and remove the snapshot only after every target has been synced
if [ $SNAPSHOT -eq 1 ]
then
echo "Unmounting and removing snapshot..." | tee -a $LOG >> $FULL_LOG
umount ${SNAPSHOT_PATH}
lvremove -f /dev/${SNAPSHOT_VGNAME}/${SNAPSHOT_LVNAME}
fi
done
if [ $SYNCERR -gt 0 ]
then
echo "Sync was not completed successfully." | tee -a $LOG >> $FULL_LOG
cat $LOG | mail -s "$THISSTAMP Sync Failed $SYNC_INCNUM - ${SYNC_HOSTNAME}" $SYNC_NOTIFY
else
echo "Sync SUCCESSFUL!" | tee -a $LOG >> $FULL_LOG
touch $SYNC_SUCCESS_FILE
fi
if [ $SYNC_INCNUM -eq $SYNC_MAXINC ]
then
cat $SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-?.log $SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-??.log | mail -s "$THISSTAMP Sync digest $SYNC_INCNUM - ${SYNC_HOSTNAME}" $SYNC_NOTIFY
fi
sleep ${SYNC_FREQUENCY}
done
Start the sync
Make the script executable, then start the Upstart job
chmod +x /usr/local/bin/syncserver.sh
initctl start sync
Now all you have to do is watch the log files in /var/log/backup and check your emails at monitoring@domain.com
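One detail of the script worth knowing when reading the logs: it works out the volume group and logical volume by stripping /dev/mapper/ from the device name and splitting at the first hyphen. Device-mapper doubles any hyphen that is part of a VG or LV name, so names containing hyphens would be split in the wrong place. The sketch below shows one way to handle that case; the function name dm_split and the example device names are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: split a /dev/mapper name into "VG LV".
# A single hyphen separates VG from LV; hyphens inside either name are doubled.

dm_split () {
    local sep=$'\x01'                  # placeholder unlikely to appear in a name
    local name="${1#/dev/mapper/}"     # drop the /dev/mapper/ prefix
    local marked="${name//--/$sep}"    # hide the escaped (doubled) hyphens
    local vg="${marked%%-*}"           # text before the separating hyphen
    local lv="${marked#*-}"            # text after the separating hyphen
    # Restore the hidden hyphens as single hyphens
    printf '%s %s\n' "${vg//$sep/-}" "${lv//$sep/-}"
}

dm_split /dev/mapper/vg0-mail                  # prints: vg0 mail
dm_split /dev/mapper/my--vg-cyrus--spool       # prints: my-vg cyrus-spool
```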
Multiple Targets
- This script can support multiple targets. That is, it will rsync the contents of each source directory to every server listed in SYNC_TARGETS in sync.conf
- You can create a configuration file per target, named /etc/backup/<target>.conf, to set things like username, password file and module name.
HOST_USERNAME="backup-user"
HOST_PASSWORD_FILE="/etc/backup/secure.conf"
HOST_MODULE="rsync_backup"
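rsync refuses to use a password file that other users can read, so the file named in HOST_PASSWORD_FILE needs owner-only permissions. A minimal sketch of creating one follows; the function name write_password_file and the example secret are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: create an rsync password file only its owner can read.

write_password_file () {
    local file="$1" secret="$2"
    # Create the file with owner-only permissions before the secret is written
    ( umask 077 && printf '%s\n' "$secret" > "$file" ) || return 1
    # Tighten permissions in case the file already existed with looser ones
    chmod 600 "$file"
}

# Example (run as the user that runs the sync script):
#   write_password_file /etc/backup/secure.conf 'example-secret'
```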
Conclusion
The script has hopefully told you that it has successfully backed up your data, and you should be able to confirm that the data is also on the disaster recovery server.
- The script is always running. It will sync, then sleep for SYNC_FREQUENCY, and then start all over again.
- If a directory needs to have a snapshot taken before a sync, simply touch a file named by SNAPSHOT_MAGIC (.sync-snapshot by default) in that directory. If the file exists, a snapshot will be performed.
- Snapshot size is by default 15G, but can be overridden by setting SNAPSHOT_SIZE
- I use XFS as my preferred filesystem. SNAPSHOT_MOUNTOPTIONS might need to be tweaked for other filesystems.
- If there was a problem with the sync, the script will send an email before sleeping.
- If the sync succeeded, the script will touch SYNC_SUCCESS_FILE. This is useful if you use pro-active monitoring like Zabbix.
- A digest email will be sent out periodically. The interval is roughly SYNC_MAXINC x SYNC_FREQUENCY (about 24 hours with the defaults), plus the time the syncs themselves take.
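As a worked example of the digest interval, the arithmetic can be done in the shell. This is only an approximation, since it ignores the time each sync run takes; the function name digest_interval_hours is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: approximate time between digest emails, in whole hours.

digest_interval_hours () {
    local maxinc="$1" frequency="$2"
    # Sync runs per digest times seconds slept per run, converted to hours.
    # Ignores the time each sync run itself takes.
    echo $(( maxinc * frequency / 3600 ))
}

digest_interval_hours 24 3600    # the script's defaults; prints 24
```

Halving SYNC_FREQUENCY to 1800 while doubling SYNC_MAXINC to 48 would keep the digest at about one per day.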

