3DR -- a new way of thinking about data recovery

30.05.2006
Q: I recently heard you speak about 3DR as a new way to think about backup and recovery. Can you elaborate? -- J.M., Kansas City, MO.

A: You did hear correctly, and I'm woefully behind on the paper that lays it all out, but let me give you the top-line points.

3DR is a way of thinking about data recovery at multiple levels, from local recovery through true disaster. It is a construct for thinking through which data should get which level of protection, and for how long.

3DR stands for Data Recovery, Disaster Recovery, and Doomsday Recovery. The first two imply that recoveries, for any reason, happen from disk; the last implies recovery from tape. We want you to never have to use that last level.

In order to grasp the concept you should first try to forget everything you know and assume about backup and recovery -- specifically, the way you do backup and recovery today.

Assumptions:

1. All recoveries happen from disk, as close to the application as physically possible.

2. All remote recoveries, required because of a major outage at the primary facility (or facilities), occur from disk.

3. Tape is used to create a small, fixed number of deep-archive copies of unique data sets, and is hopefully never, ever needed.

So in the first stage -- the Data Recovery stage -- we back up to a disk-based system. We want everything backed up to that system, and we want to keep it there forever if possible. It may be a simple array that becomes the target of our existing backup/recovery software, such as a block-based RAID array or a NAS system. It may be a VTL (virtual tape library), a disk box that emulates one or more tape devices to the backup software. Here you are limited only by capacity and budget. That's why technologies like data de-duplication are so huge here -- if you write only truly unique blocks and/or files, you'll see 20:1 or greater "compression" ratios. That means you can legitimately keep pretty much all the data ever created in one nice "recovery pool," forever. Any time you need to perform a recovery, it is done from this disk pool, on-line and fast. Other features like CDP (continuous data protection) are also good here -- and everywhere else -- because they add granularity to the recoverable data.
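
To make the "only write unique blocks" idea concrete, here is a minimal sketch of a content-addressed disk pool. It is purely illustrative and not any vendor's implementation; the DedupStore name, the fixed 4KB chunk size, and the SHA-256 hashing are my assumptions, not part of 3DR itself.

```python
# Minimal, illustrative sketch of block-level de-duplication (not any
# vendor's implementation). DedupStore and CHUNK_SIZE are assumed names.
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks for simplicity; real products vary


class DedupStore:
    def __init__(self):
        self.chunks = {}    # chunk hash -> chunk bytes (the "recovery pool")
        self.catalog = {}   # backup name -> ordered list of chunk hashes

    def backup(self, name, data):
        hashes = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            h = hashlib.sha256(chunk).hexdigest()
            # Only truly unique chunks consume new disk space.
            self.chunks.setdefault(h, chunk)
            hashes.append(h)
        self.catalog[name] = hashes

    def restore(self, name):
        # Recovery is just reassembling chunks from the on-line pool.
        return b"".join(self.chunks[h] for h in self.catalog[name])


store = DedupStore()
data = b"the same payroll database, mostly unchanged" * 1000
store.backup("monday-full", data)
store.backup("friday-full", data)   # near-identical data adds almost nothing
assert store.restore("friday-full") == data
print(f"logical bytes: {2 * len(data)}, unique chunks stored: {len(store.chunks)}")
```

Two "full" backups land in the pool, but the second one consumes essentially no new space -- that's where the 20:1-and-better ratios come from once you have weeks of mostly unchanged data sitting in one pool.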

The second stage, Disaster Recovery, simply means that an exact replica of the first stage is shipped off-site to another location. Again, technology like data de-duplication makes this both feasible and reasonable from an expense perspective: I can have all of my primary sites replicate their Data Recovery data to one or more disaster sites, so that I can still recover from disk even if I lose a facility. Seems straightforward enough.
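
If you buy the de-dup idea, replication gets cheap for the same reason. Here is an equally rough sketch -- hypothetical names, no particular product's protocol -- of a DR site pulling only the chunks it doesn't already hold:

```python
# Illustrative sketch of de-dup-aware replication to a DR site.
# The function and variable names are assumptions for this example only.
def replicate(primary_chunks, dr_chunks):
    """Both arguments are dicts mapping chunk hash -> chunk bytes."""
    missing = [h for h in primary_chunks if h not in dr_chunks]
    for h in missing:
        dr_chunks[h] = primary_chunks[h]   # only net-new data crosses the WAN
    return len(missing)


primary = {"h1": b"block-1", "h2": b"block-2", "h3": b"block-3"}
dr_site = {"h1": b"block-1"}               # already holds h1 from a prior cycle
sent = replicate(primary, dr_site)
print(f"chunks sent over the wire: {sent} of {len(primary)}")
```

Only the chunks the disaster site has never seen cross the wire, which is what makes replicating every primary site's recovery pool over ordinary WAN links reasonable.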

The final stage, Doomsday Recovery, is where we spin to tape. The way we work today -- incrementals during the week, full backups on weekends -- means that over time we end up with tons of copies of the exact same data on thousands of tapes. That is not only a waste, it makes recovery even harder. Instead, we're going to rethink what we are trying to accomplish. Since 99.7%+ of all recoveries will happen from the previous two stages, this tier will be needed for only an outrageously small fraction of cases. Therefore, it makes no sense to keep more than, say, four copies of any unique data object. Once we have four copies on four tapes, we flag the data so that the backup system stops making further copies of it. In essence, we archive the data and remove it from the backup process. Doing this will most likely result in a 90%+ savings in media cost alone, which by itself would probably justify buying a whole lot of cheaper/slower disk-based systems to keep all your real recovery data on.
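
The math behind the "four copies and stop" rule is easy to sketch. The four-copy cap comes straight from the idea above; the TapeCatalog class and everything else in this snippet is a hypothetical illustration, not a real backup product's API.

```python
# Illustrative sketch of capping tape copies for the Doomsday tier.
# TapeCatalog and its methods are assumed names for this example only.
MAX_TAPE_COPIES = 4


class TapeCatalog:
    def __init__(self):
        self.copies = {}    # data object id -> number of tape copies made

    def should_write_to_tape(self, object_id):
        # Once an object has four copies on four tapes, treat it as archived
        # and stop sending it to tape with every weekly full.
        return self.copies.get(object_id, 0) < MAX_TAPE_COPIES

    def record_copy(self, object_id):
        self.copies[object_id] = self.copies.get(object_id, 0) + 1


catalog = TapeCatalog()
weeks = 52
tapes_written = 0
for week in range(weeks):
    if catalog.should_write_to_tape("payroll-db"):
        catalog.record_copy("payroll-db")
        tapes_written += 1

# A year of weekly fulls would have written this unchanged object 52 times;
# capping at four copies cuts its tape consumption by roughly 92%.
print(f"copies written: {tapes_written} instead of {weeks}")
```

Four tapes instead of 52 weekly fulls for data that never changes is, roughly, where that 90%+ media savings comes from.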

The economics and the technologies required to do this exist now. This isn't fantasy. The most difficult element to come to grips with is mental -- you don't have to keep doing things the wrong way just because that's the way we've always done them.

I'll get a full paper out on this soon, with Heidi Biggar's help, and we'll include a model you can fill out to see the economic, performance, and overall impact such a change would have on your operation, broken out by product and technology. Soon. I promise.

Send me your questions -- about anything, really -- to sinceuasked@computerworld.com.

Steve Duplessie founded Enterprise Strategy Group Inc. in 1999 and has become one of the most recognized voices in the IT world. He is a regularly featured speaker at shows such as Storage Networking World, where he takes on what's good, bad -- and more importantly -- what's next. For more of Steve's insights, read his blogs at http://esgblogs.typepad.com/steves_it_rants/.