Backing up data plays a large role in designing high availability systems and planning for business continuity. Traditionally, a backup has meant taking a copy of your data and saving it to protect against disk failure, corruption or accidental deletion. Because you are essentially writing the contents of one drive or partition to another, a backup can be extremely time-consuming. As a result, many backups run only on a nightly, weekly or even yearly basis. When backups run on a consistent schedule and are offloaded to external or offsite storage, they provide an excellent long-term contingency and data protection plan.
But what about data protection in the nearer term?
Enter: the snapshot. A snapshot, generally speaking, is like a backup. Snapshots let you roll back your data or system configuration to an earlier point in time, recover accidentally deleted or overwritten files and recover quickly from a system crash. But unlike backups, snapshots can be taken in a matter of seconds, with little impact on performance and without consuming copious amounts of disk space. As such, snapshots can easily be taken throughout the day (say, on an hourly basis), and data can be rolled back or restored almost instantly.
How do snapshots do that? Is it magic? Does it make backups completely obsolete?
Not exactly.
How Snapshots Work
A backup is a standalone copy of a data set. While it may be encrypted or compressed, a backup contains within itself all of the data that existed at the point in time that the backup was created. A snapshot is a bit different.
Rather than creating a copy of the data, a snapshot works by keeping track of the data that has changed since the snapshot was taken. In this way, a snapshot is more like a collection of metadata than an exact copy of the file at a given point in time. To restore a previous version of a file from a snapshot, the software simply “rebuilds” the file to its previous state using the metadata about the file.
It’s a bit hard to grasp that in terms of data. Try thinking of it like this:
Let’s start with a ham sandwich. This ham sandwich consists of two parts: white bread and ham. Now, let’s say we want to make that into a ham and cheese sandwich. To do so, we need to add cheese to the white bread and ham. But let’s also say that after we make the ham and cheese sandwich, we want to be able to go back to the ham sandwich.
If we were doing a backup, we would create two sandwiches: one with ham and white bread, and another with ham, cheese and white bread.
If we were doing a snapshot of the sandwich, we’d do things differently. Instead of making two sandwiches, we would keep track of the state of the ham sandwich before making it into a ham and cheese sandwich. Let’s call that information “snapshot0”. We take that snapshot, then add cheese to our one ham sandwich, and now have a single sandwich with three parts: ham, cheese and white bread. If we wanted to roll back to the ham sandwich, we could reference snapshot0, which tells us that the ham sandwich consists of just two of the three parts of our existing ham and cheese sandwich: ham and white bread.
Now, let’s take snapshot1 of the ham and cheese sandwich and swap out the white bread for wheat bread. We now have three possible configurations of the sandwich, but unlike with backups, we do not have three complete sandwiches. Instead, we have all four components (white bread, wheat bread, ham and cheese) and a set of pointers that tells us how to assemble the two earlier versions (snapshot0 and snapshot1). At any point, we can reconfigure our sandwich from a wheat bread ham and cheese sandwich to a white bread ham and cheese sandwich or a simple ham sandwich. And we’ve done so without having to keep six slices of bread, three slices of ham and two slices of cheese.
So, in a sense, a snapshot works by keeping all the ingredients you might need to create a certain version of a file and the instructions for which of those ingredients to include without duplicating any ingredients. This is how a snapshot saves space.
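To put the sandwich analogy in terms of data, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not how any particular snapshot product is implemented: the `SnapshotStore` class and its method names are hypothetical, and each ingredient stands in for a block of data (bread counts as one ingredient per sandwich rather than two slices).

```python
import hashlib


class SnapshotStore:
    """Toy store (hypothetical): each distinct ingredient is kept once,
    and a snapshot is just an ordered list of pointers to ingredients."""

    def __init__(self):
        self.pool = {}        # pointer (content hash) -> ingredient, stored once
        self.snapshots = {}   # snapshot name -> ordered list of pointers

    def _put(self, ingredient):
        # Identical ingredients hash to the same pointer, so they are never duplicated.
        pointer = hashlib.sha256(ingredient.encode("utf-8")).hexdigest()
        self.pool.setdefault(pointer, ingredient)
        return pointer

    def snapshot(self, name, ingredients):
        # Record only pointers; ingredients already in the pool are reused.
        self.snapshots[name] = [self._put(item) for item in ingredients]

    def restore(self, name):
        # Rebuild a version by following its pointers into the shared pool.
        return [self.pool[pointer] for pointer in self.snapshots[name]]


store = SnapshotStore()
store.snapshot("snapshot0", ["white bread", "ham"])
store.snapshot("snapshot1", ["white bread", "ham", "cheese"])
store.snapshot("current", ["wheat bread", "ham", "cheese"])

# Distinct ingredients actually kept in the pool.
stored_ingredients = len(store.pool)
# What three standalone copies (backups) would have to hold in total.
full_copy_ingredients = sum(len(p) for p in store.snapshots.values())

print(store.restore("snapshot0"))
print(stored_ingredients, "stored vs", full_copy_ingredients, "in full copies")
```

All three versions can be rebuilt, yet the pool holds only the four distinct ingredients; three full copies would hold eight.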
Can Snapshots Replace Backups?
Although snapshots can serve many of the same purposes as a backup, a snapshot doesn’t completely replace the need to make backups. The main point is that snapshots don’t protect against media failure. If an entire disk drive fails, you won’t be able to restore it using snapshots, since the data that snapshots use to rebuild the earlier versions of the files is stored on the drive itself. Likewise, if part of the data becomes corrupted, it will affect every snapshot that draws on that data. Using our example above, if the cheese on the wheat bread ham and cheese sandwich gets moldy, you won’t be able to make the white bread ham and cheese sandwich or the wheat bread ham and cheese sandwich; only the plain ham sandwich, which doesn’t use the cheese, is unaffected.
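The moldy cheese problem can be sketched in a few lines of Python. This is a toy model, with a dictionary standing in for the shared data on one drive and hypothetical names throughout; it only shows why damage to shared data reaches every snapshot that points to it, while a standalone backup copy does not change.

```python
# Shared pool: every snapshot points into this one set of ingredients,
# which in the analogy all live on the same drive.
pool = {
    "white": "white bread",
    "wheat": "wheat bread",
    "ham": "ham",
    "cheese": "cheese",
}

# Snapshots hold pointers (keys), not copies of the ingredients.
snapshots = {
    "snapshot0": ["white", "ham"],
    "snapshot1": ["white", "ham", "cheese"],
    "current": ["wheat", "ham", "cheese"],
}


def restore(name):
    # Rebuild a version by looking up each pointer in the shared pool.
    return [pool[key] for key in snapshots[name]]


# A backup is a standalone copy, taken before anything goes wrong
# (and, in practice, stored on a different drive).
backup = list(restore("snapshot1"))

# The shared cheese goes bad.
pool["cheese"] = "moldy cheese"

# Every snapshot that points at the cheese is now affected.
affected = [name for name in snapshots if "moldy cheese" in restore(name)]

print(affected)
print(backup)
```

Only the snapshot that never referenced the cheese is spared, while the backup still contains the original ingredients because it shares nothing with the pool.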
There are, of course, ways to combine snapshots with other technologies and solutions to create a balanced and robust backup system. One simple way is to keep running backups overnight in addition to implementing an hourly snapshot system. But there are also proprietary technologies, such as NetApp Syncsort Integrated Backup (NSB), that take advantage of the strengths of both snapshots and traditional backups.
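As a rough sketch of the simple approach (hourly snapshots plus one overnight backup), the schedule could be expressed like this in Python. The function name `actions_for` and the 02:00 backup hour are assumptions made for illustration; a real setup would normally rely on the scheduler built into the storage system or operating system.

```python
from datetime import datetime, timedelta

BACKUP_HOUR = 2  # assumption: the nightly backup runs at 02:00


def actions_for(moment):
    """Return the protection jobs due at the top of the given hour (hypothetical helper)."""
    jobs = ["snapshot"]              # take a snapshot every hour
    if moment.hour == BACKUP_HOUR:
        jobs.append("backup")        # plus one full backup overnight
    return jobs


# Walk through one day, hour by hour, and collect the jobs that would run.
start = datetime(2024, 1, 1, 0, 0)
schedule = [actions_for(start + timedelta(hours=h)) for h in range(24)]

snapshot_count = sum("snapshot" in jobs for jobs in schedule)
backup_count = sum("backup" in jobs for jobs in schedule)

print(snapshot_count, "snapshots and", backup_count, "backup per day")
```

The snapshots cover the near term with fine-grained restore points, and the single overnight backup covers the cases snapshots cannot, such as a failed drive.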
The important thing to keep in mind is that snapshots alone won’t replace a traditional backup. Make sure you plan for all of the pitfalls of data storage, including physical drive failure, natural disasters, virus attacks, system corruption and other causes of data loss.
Are Snapshots Backups?