The Oxford English Dictionary defines backup as “The procedure for making extra copies of data in case the original is lost or damaged” and “An extra copy of data from a computer”. The word backup in this sense was first used in 1951, the year one of the first commercial computers was delivered. This computer, the UNIVAC I (Universal Automatic Computer), had its own internal memory, but data and programs had to be loaded into it from external media. Punch cards, the dominant external medium of that era, can be considered the first data storage devices for backups. Any time a program or dataset was needed, it was read back in from the cards, much as a restore works today.
Over time, as computer storage devices have changed, so too have backup devices. After punch cards came magnetic tapes, then hard drives, floppy disks, optical disks, and flash drives. With the advent of local area networks in the late 1970s, backups no longer had to go to a storage target directly connected to the computer. They could be written over the network to a backup medium located somewhere else. With the commercialization of the Internet in the 1990s, online backup services proliferated. Today, the cloud is a popular backup target.
Shortly after computers were invented, someone decided that they needed to be backed up. Traditionally, this involved an agent installed on the client. That agent used the system's resources to examine all the files on the system and then move them to the backup environment. This could consume a lot of resources. Backups had to be performed at night, when the systems were idle, to avoid disrupting production. The backup data traveled over the network, and large systems could take hours (or days) to finish a single backup. How many Backup Administrators have fielded a call from their Network Administrators demanding that backups be stopped to free up network bandwidth for other tasks?
For the most part, backups remained relatively unchanged for more than 60 years. A computer system uses an agent to copy files from its primary storage location to a backup storage location. Be it punch cards or magnetic tape, flash drives or a network target, an agent copies the data using the resources of the host system.
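The agent model described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular backup product works: the function name agent_backup, the since parameter, and the use of file modification times to detect changes are all assumptions made for the example. Note that the agent has to walk the whole directory tree on the client, which is where the resource cost comes from.

```python
import os
import shutil
from pathlib import Path


def agent_backup(source: Path, target: Path, since: float = 0.0) -> list[Path]:
    """Copy every file under `source` modified after `since` into `target`.

    Hypothetical helper: `since` is a Unix timestamp of the previous backup
    (0.0 means "copy everything"). Returns the relative paths that were copied.
    """
    copied = []
    # The agent must visit every directory and stat every file on the client.
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = Path(dirpath) / name
            if src.stat().st_mtime <= since:
                continue  # not modified since the previous backup
            rel = src.relative_to(source)
            dst = target / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copies contents and timestamps
            copied.append(rel)
    return sorted(copied)
```

Calling agent_backup(Path("/data"), Path("/mnt/backup")) would perform a full copy; passing the timestamp of the last run as since would copy only files modified after it. Both paths here are placeholders.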
How Virtualization Changes Backups
A virtual machine looks and acts very much like a regular, physical computer system. Many times, users will never know that the system they are using is a virtual machine. The same holds true for applications. A virtual machine can still be backed up using the traditional backup agent. There are drawbacks to this method, though. Chief among them: if multiple agent backups run at the same time on one physical host, they all share that host's bandwidth, so each virtual machine backup is limited to a fraction of it. This is where some of the underlying technology of virtual machines benefits modern backup procedures.
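To put rough numbers on the shared-bandwidth drawback described above, here is a back-of-the-envelope sketch. It assumes the host's network link is split evenly between concurrent backups and ignores protocol overhead, compression, and deduplication; the function name and parameters are hypothetical.

```python
def backup_hours(vm_size_gb: float, host_link_gbps: float, concurrent_backups: int) -> float:
    """Estimate hours to back up one VM when the host link is shared evenly.

    Simplifying assumptions: an even split of bandwidth, no overhead,
    and sizes in decimal gigabytes (1 GB = 8 gigabits).
    """
    if concurrent_backups < 1:
        raise ValueError("concurrent_backups must be at least 1")
    per_vm_gbps = host_link_gbps / concurrent_backups
    seconds = vm_size_gb * 8 / per_vm_gbps  # gigabytes -> gigabits
    return seconds / 3600
```

Under these assumptions, a 450 GB virtual machine on a 1 Gbit/s host link takes about one hour when it is backed up alone, and about ten hours when ten agent backups run on that host at once.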
Each virtual machine is made up of one or more virtual machine disk files. These disk files reside on some kind of storage: SAN-attached, network-attached (NAS), or locally attached disk (DAS). When SAN-attached storage is used, the same storage can be presented to the backup host. In these cases, the backups can run directly off the SAN, eliminating performance issues and resource bottlenecks on the virtual machines and their hosts. This results in true off-host backups and a boost to the speed of each backup.
Another benefit that virtualization brings is changed block tracking. This allows the hypervisor on the physical host to monitor and record the blocks that change within each virtual machine's disks. Rather than walking an entire directory tree to find files that have changed, the backup software already knows which blocks changed, and those are the only ones that need to be backed up. Backing up only these changed blocks dramatically reduces the time a backup takes to run.
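The block-tracking idea above can be illustrated with a toy model. This is a sketch under stated assumptions, not the implementation used by any hypervisor: the TrackedDisk class, the incremental_backup function, and the tiny 4-byte block size are all invented for the example. The point is that every write records the blocks it touches, so an incremental backup never has to scan the disk.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small, for illustration only


class TrackedDisk:
    """A toy virtual disk that records which blocks have been written."""

    def __init__(self, num_blocks: int) -> None:
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.changed: set[int] = set()  # indices of blocks written since last backup

    def write(self, offset: int, payload: bytes) -> None:
        if not payload:
            return
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside disk")
        self.data[offset:offset + len(payload)] = payload
        # Mark every block the write touched, including partially written ones.
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.changed.update(range(first, last + 1))

    def read_block(self, index: int) -> bytes:
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])


def incremental_backup(disk: TrackedDisk, backup: dict[int, bytes]) -> list[int]:
    """Copy only the changed blocks into `backup`, then reset the tracker."""
    copied = sorted(disk.changed)
    for index in copied:
        backup[index] = disk.read_block(index)
    disk.changed.clear()
    return copied
```

For example, after writing 8 bytes at offset 0 to a fresh 8-block disk, incremental_backup copies blocks 0 and 1. A later 2-byte write at offset 10 causes the next backup to copy only block 2, regardless of how large the disk is.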
With the proper backup software, the right hardware configuration, and an up-to-date virtualization platform, backups of hundreds of virtual machines can be completed in a matter of minutes. What used to take hours during the night can now be done at any time of day. Can you say your virtual machines are protected like that?