Backup Strategies? How about a restore strategy?

Recently we had a catastrophic failure of the LUNs that were connected to our development database cluster. When we started the restore process, we discovered that there was an issue with our backup process.

Backups are just the first step in a backup strategy. How often are the backups tested? Some would say this is part of a "Disaster Recovery" process or program. I think there are degrees of disaster.

Most DR programs that I am familiar with are designed for a catastrophic failure of a site, or a data center. There are other failures that should be addressed in a restore strategy. Losing a development environment can still cost the company many man-hours of work. If you calculate the dollar value of each hour worked, or not worked as the case may be, you will see the financial impact of not periodically testing your restores.
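As a rough illustration of that calculation, here is a minimal sketch. The function name and every figure in it are hypothetical, chosen only to show the arithmetic; substitute your own headcount, outage length, and loaded hourly rate.

```python
def downtime_cost(people, hours_lost, hourly_rate):
    """Return the labor cost of an outage, in the same currency as hourly_rate."""
    return people * hours_lost * hourly_rate


# Hypothetical figures: 12 developers idle for 16 working hours at 75 per hour.
cost = downtime_cost(people=12, hours_lost=16, hourly_rate=75)
print(cost)  # 14400
```

This counts only the hours lost while the environment is down. The time spent by the people doing the rebuild would be added on top.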

There are different types of backups: full, incremental, snapshot, and full operating system backups. When you are restoring to a point in time from incremental backups, you need the last full backup and every incremental backup taken since then. You may also need additional storage available to test the restore of a database.
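The dependency on a complete set of incrementals can be checked before a restore is ever needed. The sketch below is not tied to any backup product: the Backup record, its fields, and the restore_chain function are all hypothetical, and it assumes a simple scheme in which each incremental depends on every backup since the most recent full.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Backup:
    kind: str            # "full" or "incremental" (hypothetical labels)
    taken_at: datetime   # when the backup finished
    available: bool      # True if the media can actually be read today


def restore_chain(backups, target):
    """Return the backups needed to restore to `target`, oldest first.

    Raises ValueError if there is no full backup to start from or if any
    backup in the chain is unavailable.
    """
    # Only backups taken at or before the target time can contribute.
    candidates = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    full_positions = [i for i, b in enumerate(candidates) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup at or before the target time")

    # The chain starts at the most recent full and includes everything after it.
    chain = candidates[full_positions[-1]:]
    missing = [b for b in chain if not b.available]
    if missing:
        raise ValueError(f"{len(missing)} backup(s) in the chain are unavailable")
    return chain
```

Running a check like this on a schedule, against whatever catalog your backup tool really keeps, would flag a broken chain long before a restore is attempted. It does not replace an actual test restore, since a backup that is present may still be unreadable.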

Restores generally take a bit longer than the backup itself, especially if the backup involved multiple steps. If you need to get the backup media from off-site storage, this can take even longer. How long does it take to get a specific tape or CD from your off-site vendor? How often is the time in this SLA tested?

What is the business impact of losing a test or development environment? One may say not much, but have you factored in the manpower cost of recovering that environment? How many people are involved in rebuilding the test or development environment who could otherwise be working on solving business problems with IT?

Does your backup or disaster recovery strategy include time and resources for periodically testing restores of individual systems? If it does, how often is this done? If not, why not?

