
Storage management with automatic periodic snapshots

A while ago, we introduced the ability to take snapshots of a drive. This is a very handy feature that allows you to quickly and efficiently save drive states even on live systems.

Thanks to ZFS, a snapshot only consumes the delta between the current state of the drive and its state when the snapshot was taken. This means that if your original drive was 15GB and only 1MB of data has changed since the snapshot was taken, the size of the snapshot would be 1MB. If you write another megabyte to the disk, the snapshot will grow by another megabyte.
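
To make the arithmetic concrete, here is a toy model of how a copy-on-write snapshot grows. It is only an illustration: the ToyDrive class and the 128 KiB block size are assumptions made for this sketch, not a description of how ZFS or CloudSigma implement snapshots.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

BLOCK_SIZE = 128 * KIB  # assumed block size, chosen only for this illustration


class ToyDrive:
    """Hypothetical drive that tracks which blocks changed since a snapshot."""

    def __init__(self, size_bytes):
        self.num_blocks = size_bytes // BLOCK_SIZE
        self.changed_blocks = set()

    def write(self, first_block, num_bytes):
        """Mark every block touched by a write of num_bytes as changed."""
        blocks = -(-num_bytes // BLOCK_SIZE)  # ceiling division
        if first_block < 0 or first_block + blocks > self.num_blocks:
            raise ValueError("write falls outside the drive")
        self.changed_blocks.update(range(first_block, first_block + blocks))

    def snapshot_bytes(self):
        """Space held by the snapshot: one old block per changed block."""
        return len(self.changed_blocks) * BLOCK_SIZE


drive = ToyDrive(15 * GIB)
drive.write(first_block=0, num_bytes=1 * MIB)
after_first_write = drive.snapshot_bytes()
drive.write(first_block=8, num_bytes=1 * MIB)
after_second_write = drive.snapshot_bytes()
print(after_first_write // MIB, after_second_write // MIB)  # prints: 1 2
```

In this model, rewriting blocks that have already changed does not grow the snapshot any further, since it only has to preserve the original version of each block.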

Another beauty of this system is that any snapshot can be promoted (cloned) into a full disk drive. This means that you can create an independent copy that can be mounted on a server, potentially on a different storage system entirely. As such, this can form the foundation for a storage management strategy (depending on your workload).

A word of warning

While using periodic snapshots can be a part of your backup strategy, it is unwise to rely on snapshots as your sole strategy.

There are also situations where these snapshots will not give you a usable copy, such as snapshotting a running database server, where data that is mid-write or still held in memory is not captured consistently on disk. The snapshot functionality may still be useful on stopped database servers (to create a point-in-time restore), but again, it should not be your sole backup strategy.

Creating automatic snapshots

Using our Python library, automating snapshots is really simple. However, since the CloudSigma credentials need to be stored on the system that triggers the snapshots, we strongly advise against keeping production credentials on a machine that is not properly secured. If you want to run this on a cloud server, for example, please make sure that it is shielded from the rest of the infrastructure (for instance, using our network policies feature) and that it is fully locked down.
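
One small part of locking things down is making sure that the file holding the credentials is accessible only to the user running the snapshots. A minimal sketch of such a check follows. The is_private helper is made up for this example, and the path ~/.cloudsigma.conf is an assumption, so substitute wherever your credentials actually live.

```python
import os
import stat
from pathlib import Path


def is_private(path):
    """Return True if only the owner has any permission bits set on path."""
    mode = os.stat(path).st_mode
    # S_IRWXG and S_IRWXO cover all group and "other" permission bits.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Assumed location of the credentials file; adjust to your setup.
    config = Path.home() / ".cloudsigma.conf"
    if not config.exists():
        print(f"{config} not found")
    elif is_private(config):
        print(f"{config} is accessible to its owner only")
    else:
        print(f"{config} is accessible to other users; consider chmod 600")
```

This only covers file permissions on a POSIX system. It does not replace isolating the machine itself.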

After installing the Python library, you can download and run the script as follows:

$ wget https://raw.githubusercontent.com/cloudsigma/pycloudsigma/master/samples/snapshot.py
$ python snapshot.py drive-uuid my-snapshot

snapshot.py takes two arguments:

  • The UUID of the drive you want to snapshot
  • A friendly name for the snapshot

Once you’ve manually created a snapshot and verified that it works (you can see it under the ‘snapshot’ section of the drive), you can automate the process.

The most suitable and standard way of running a task like this is to use the crontab (assuming you’re on Linux or Mac OS X).

As the same user that created the snapshot above, run:

$ crontab -e

If you want to take a snapshot every night at 1AM, add a line consisting of the schedule 0 1 * * * followed by the same snapshot.py command you ran above.
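
The sketch below prints a suitable crontab line to paste into the editor. The five leading fields of a crontab line are minute, hour, day of month, month and day of week, so 0 1 * * * means 1AM every day. The cron_entry helper, the script location in your home directory and the snapshot name nightly are assumptions for illustration.

```python
import shlex
import sys
from pathlib import Path


def cron_entry(minute, hour, argv):
    """Build a crontab line that runs argv every day at hour:minute."""
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError("minute must be 0-59 and hour 0-23")
    command = " ".join(shlex.quote(str(arg)) for arg in argv)
    return f"{minute} {hour} * * * {command}"


# Assumed location of the downloaded script; adjust to where you saved it.
script = Path.home() / "snapshot.py"
# sys.executable is the absolute path of the current Python interpreter.
# Absolute paths are used because cron runs with a minimal environment.
print(cron_entry(0, 1, [sys.executable, script, "drive-uuid", "nightly"]))
```

Replace drive-uuid with the UUID of your drive before saving the crontab.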

You’ll also notice that the script will log to a file named snapshot.log in the home directory of the user running the script.

Automatically purging snapshots

Since snapshots grow over time, you will likely want to delete older snapshots after a while. To solve this problem, we’ve created another script that can do this for you. The script is called snapshot_purge.py and takes two arguments:

  • The UUID of the drive
  • The number of days’ worth of snapshots you’d like to keep

For instance, if you want to keep 30 days’ worth of snapshots, you can simply run:

$ wget https://raw.githubusercontent.com/cloudsigma/pycloudsigma/master/samples/snapshot_purge.py
$ python snapshot_purge.py drive-uuid 30

You can of course automate this too. For instance, to purge snapshots older than 30 days every night at 1:30AM, add a crontab line consisting of the schedule 30 1 * * * followed by the snapshot_purge.py command above.

Wrap up

That’s it, folks. Using these two scripts, you will be able to automate your drive snapshots. If you need to snapshot multiple drives, simply add more of the snapshot.py lines to your crontab with different UUIDs.
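
With several drives, writing the crontab lines by hand gets repetitive. As a rough sketch, the lines could be generated from a list of drives. The drive names, placeholder UUIDs, script paths and the crontab_lines helper are all made up for this example.

```python
def crontab_lines(drives, keep_days, snapshot_script, purge_script):
    """Return crontab lines: snapshots at 1:00AM and purges at 1:30AM.

    drives maps a friendly snapshot name to a drive UUID.
    """
    lines = []
    for name, uuid in sorted(drives.items()):
        lines.append(f"0 1 * * * python {snapshot_script} {uuid} {name}")
        lines.append(f"30 1 * * * python {purge_script} {uuid} {keep_days}")
    return lines


# Placeholder values; replace with your own drive UUIDs and script paths.
drives = {"web": "uuid-of-web-drive", "files": "uuid-of-files-drive"}
for line in crontab_lines(drives, 30, "/path/to/snapshot.py",
                          "/path/to/snapshot_purge.py"):
    print(line)
```

The printed lines can then be pasted into the editor opened by crontab -e.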

We’re of course just scratching the surface of what can be done with snapshots, but I hope this serves as a quick crash course in using snapshots for your storage management routines.

If you have more sophisticated data retention needs, you can hopefully reuse some of the code in the scripts above.

About Viktor Petersson

Former VP of Business Development at CloudSigma. Currently CEO at WireLoad and busy making a dent in the Digital Signage industry with Screenly. Viktor is a proud geek and loves playing with the latest technologies.
