Wednesday, November 12, 2008

Snapshots with PostgreSQL and Sun Storage 7000 Unified Storage Systems

Continuing on my earlier blog entry on setup of PostgreSQL using Sun Storage 7000  Unified Storage (Amber Road), let's now look at how to take snapshots for such systems. 


Considering a typical PostgreSQL system there are two ways to take backup: 


Cold or Offline backup: Backing of PostgreSQL while PostgreSQL server is  not running. The advantage of such a backup is that it is simple to achieve the purpose of backing up the database. However the disadvantage is that the database system is unavailable for work.  The general steps for doing a cold backup is, shutdown gracefully the PostgreSQL server and backup all the files including $PGDATA, pg_xlog if it is in different location and all the tablespaces location used for that server instance. After backup the files, start the PostgreSQL server again and you are done. In case if you ever want to go back to that earlier backup version, you shutdown the database, restore all the old files and restart the database.  This is essentially the same logic which we will follow with snapshots except if you now have a multi gigabyte database, the snapshot operation  will look like million times faster than having to do a "tar" or "cpio" (specially if you consider that "tar" is a single threaded application doing its best but still slow if you use it on multi-core systems).


Lets consider an example on how to take a snapshot with OpenStorage systems with Cold or Offline backup strategy.


Assume that Sun Storage 7000  Unified Storage System  has the  hostname "amberroad" (Sun Storage 7000 Unified Storage Systems is mouthful ) have a project defined "postgres" which has three devices exported "pgdata" which holds $PGDATA, "pglog" which holds $PGDATA/pg_xlog and "pgtbs1" which is one of the tablespaces defined.


Then a backup/snapshot script  (as postgres user ) will look something similar to the following (Disclaimer:  Provided AS-IS. You will have to modify it for your own setups):



#!/bin/sh
date
echo "---------------------------------------------------------------"
echo "Shutting down the PostgreSQL database..."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA stop
echo "---------------------------------------------------------------"
echo "Deleting earlier snapshot called testsnap.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres snapshots select testsnap destroy
echo "---------------------------------------------------------------"
echo "Taking a new snapshot..."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots snapshot testsnap
echo "---------------------------------------------------------------"
echo "Verifying the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots show
ssh -l root amberroad shares select postgres snapshots select testsnap show
echo "---------------------------------------------------------------"
echo "Restarting the database.."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA start -l $PGDATA/server.log
echo "---------------------------------------------------------------"
date


Note: The testsnap should not exist otherwise snapshot operation will fail and hence we delete any earlier snaps called testsnap in the script. You may modify the script to suit your priorities.


Now in case you want to restore to the snapshot "testsnap" version
you can use a sample script as follows:



#!/bin/sh
date
echo "Restoring the database"
echo "---------------------------------------------------------------"
echo "Shutting down the database..."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA stop
echo "---------------------------------------------------------------"
echo "Restoring the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres select pgdata snapshots \
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pglog snapshots \
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pgtbs1 snapshots \
select testsnap rollback
echo "---------------------------------------------------------------"
echo "Restarting the database.."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA start -l $PGDATA/server.log
echo "---------------------------------------------------------------"
echo "Database restored to earlier snapshot: testsnap."
echo "---------------------------------------------------------------"
date



NOTE: When
you use the above script, all snapshots taken after “testsnap”
are lost as they are rolled back to the point in time when “testsnap”
was taken. There is another feature to clone the snapshot incase if you don't want to use rollback.



Hot or  Online backup:  A Hot or Online backup is when one takes a backup while PostgreSQL server is running. The advantage is that the database server is available. The disadvantage is that the setup is more complex than the Cold or Offline backup strategy. The recommended way to use Hot or Online backup with PostgreSQL is to use it in conjuction with PITR - Point in Time Recovery feature of PostgreSQL.


This can be achieved by turning on continuous WAL archiving in PostgreSQL and using   SELECT pg_start_backup('label');before taking the snapshot and then
issuing SELECT pg_stop_backup();
after the snapshot is completed.


For this you will also need another dedicated device under the "postgres" project to just hold the WAL archive files. (In my case I call it "pgwal" and mounted as /pgwal/archive 


To turn on continuous WAL archiving you need to set the following variables in postgresql.conf (and then restart the PostgreSQL server):



archive_mode = true
archive_command = 'test ! -f /var/lib/postgres/8.2/data/backup_in_progress || cp -i %p /pgwal/archive/%f < /dev/null'
archive_timeout=3600

Then a sample backup/snapshot with hot backup  script can look as follows:



#!/bin/sh
#PGDATA= /var/postgres/8.3/data; export PGDATA # if not set
date
echo "---------------------------------------------------------------"
echo "Indicating PostgreSQL for a hot Snapshot.."
echo "---------------------------------------------------------------"
touch $PGDATA/backup_in_progress
psql -c “select pg_start_backup('MyHotSnapShot');” postgres
echo "---------------------------------------------------------------"
echo "Deleting earlier snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres snapshots select testsnap destroy
echo "---------------------------------------------------------------"
echo "Taking a new snapshot..."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots snapshot testsnap
echo "---------------------------------------------------------------"
echo "Verifying the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots show
ssh -l root amberroad shares select postgres snapshots select testsnap show
echo "---------------------------------------------------------------"
echo "Indicating to PostgreSQL Server that Hot Snapshot is done."
echo "---------------------------------------------------------------"
psql -c “ select pg_stop_backup();” postgres
rm $PGDATA/backup_in_progress
echo "---------------------------------------------------------------"
date

Restoring from a hot backup snapshot is also bit tricky since we need to decide whether we need to roll forward and upto what point. It is probably beyond the scope of my post and hence I will just do a simple one with all files in the wal arhcives.


First of all restoring with PITR requires a recovery.conf file which should contain the path to the WAL Archives:



restore_command = 'cp /pgwal/archive/%f "%p"'

Also since we will be doing rollforward with the archive files,  we need to remove all the old pg_xlog files, create an archive status directory, remove the postmaster.pid file and an artifact of our earlier script, remove the backup_in_progress marker file too.


(We will do the above steps also as part  of our restore script.)



A restore snapshot script file can look something like as follows:



#!/bin/sh
#PGDATA= /var/postgres/8.3/data; export PGDATA # if not set
date
echo "Restoring the database"
echo "---------------------------------------------------------------"
echo "Shutting down the database..."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA stop
echo "---------------------------------------------------------------"
echo "Restoring the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres select pgdata snapshots \
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pglog snapshots \
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pgtbs1 snapshots \
select testsnap rollback
#if there is reason to rollback walarchives then uncomment the next two lines
#ssh -l root amberroad confirm shares select postgres select pgwal snapshots \
#select testsnap rollback
echo "---------------------------------------------------------------"
echo "Make sure pg_xlog/* is empty ..."
echo "Make sure pg_xlog/archive_status exists..."
echo "Make sure postmaster.pid is removed"
echo "---------------------------------------------------------------"
rm $PGDATA/backup_in_progress
rm $PGDATA/postmaster.pid
rm $PGDATA/pg_xlog/*
mkdir -p $PGDATA/pg_xlog/archive_status
echo "---------------------------------------------------------------"
echo "Set recovery.conf for recover.."
echo "---------------------------------------------------------------"
echo restore_command = \'cp /pgwal/archive/%f "%p"\' > $PGDATA/recovery.conf
echo "---------------------------------------------------------------"
echo "Restarting the database.."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA start -l $PGDATA/server.log
echo "---------------------------------------------------------------"
echo "Database restored."
echo "---------------------------------------------------------------"
date


Of course the above scripts may not be best but gives an idea of how one can use snapshot features provided by Sun Storage 7000  Unified Storage Systems with PostgreSQL.





No comments: