Disaster Recovery’s big brother Operational Recovery

5

June 20, 2013 by Kenneth Fisher

I’m going to get slammed for the title by someone but I stand by it. First some definitions.

Disaster Recovery (DR) is recovering a database (or part of one) in the wake of a “disaster”. Examples of a disaster: one of the drives goes bad, a database is corrupted somehow, the server goes down, the whole data center is under water, etc.

Operational Recovery (OR) is the recovery of a part of the database due to something that isn’t a disaster. This usually covers anything done in the normal course of business, but can also cover correcting for small errors that don’t impact your production service. Examples include refreshing a test or model office copy of a database from production, pretty much anything in test, the loss of a small number of rows in production where regular business isn’t affected. Typically you will be recovering a table or some rows of a table but sometimes a stored procedure, function etc.

Occasionally the lines between OR and DR can be somewhat blurred. For example someone forget a WHERE clause on a delete and deleted currently used data, someone deletes a row and because of cascading deletes half the database is now gone (true story), running the script that was meant for test in production (also, sadly a true story), etc.

So why do I call Operational Recovery the big brother of Disaster Recovery? And why am I bringing it up at all? I bring it up because OR is rarely discussed. However you can go on line and read literally thousands of articles, blogs, etc on Disaster Recovery. Discussions on why you should call it a Recovery Strategy rather than a Backup Strategy etc. As for “Big Brother”, in the last 20 years as a DBA of one type or another I have been involved in exactly 2 DR situations. And in one of those cases we had messed up and our backups were corrupt so we had to work without our planned Recovery Strategy anyway. At my current job we perform Operational Recovery on the order of once every month or so.

I should note that I take nothing away from DR. It is arguably one of the most important functions of an administrative DBA. However OR is also important. It is arguably as important as DR in its own way and used much more frequently. So how do we prepare for it?

If working with a small database, let’s say 100mb, then regular backups are frequently ok. Even so, if I’m going to go that route I would still probably want to perform one of the following options before making a change.

  • Full or differential backup (if in simple recovery)
  • Log backup (if in full or bulk-logged recovery)
  • Put a marked transaction in the log (also if in full or bulk-logged recovery)

The purpose of these would primarily be to have a mark in time for right before my change but in a pinch my regular backup strategy should be enough. To perform an OR of this type I would recover the database to my mark with a different location/name. Preferably still on the same instance of course. So if I’m recovering AdventureWorks2008 I’ll name it AdventureWorks2008_Recover. Then I can move whatever I need from the “Recover” version over to the “production” version. In this case “production” may be model office, test or even the dev environment. It’s the true version of the database.

So what do I do when I have a large database? Let’s say 500gb. Due to the length of time to recover and the space the “recover” version takes up, recovering to an alternate location probably isn’t an option. So now we have to take a different tack. There are three different considerations here.

  • Code
  • Schema
  • Data

Code is probably the easiest. Any form of version control works here. If you don’t have version control software (we don’t) you can either script it to a text file (make sure to keep a date) or store it in the database as codename_date. sp_myStoredProcedure_20130620 for example. Then if you have a problem with the code you have a backup copy of the previous version(s) that can easily be “recovered”.

Schema is next. Basically the same solution here. Either a) script out your old schema before the change or b) make a copy somewhere. Sometimes this isn’t even necessary. Adding a column for example is fairly easy to back out even without any form of backup.

And lastly Data. Data isn’t really all that difficult to back up, it is just potentially the most space consuming part of the whole thing. In fact if I’m updating good percentage of the database then I might just go with the backup process. But let’s say you are only changing a few rows, or even a few thousand. It’s far easier to just save the rows you are about to change/delete off into a separate table. That way if you have to change/add them back it’s easy enough to write a query to do it. I would however make sure that you don’t have any triggers updating other tables at the same time. If you do then you will need to capture the original version of those tables also.

The upshot of all of this is to make sure that you can back out of any change you make. There are lots of different ways to do this and sometimes you don’t need to do anything more than your normal backups. Make sure that you know when those times are, and when you need to do more.

About these ads

5 thoughts on “Disaster Recovery’s big brother Operational Recovery

  1. I liked the title and the intent of the article. I’m glad someone is talking about Operational Recovery. I’m glad there’s a good term I can use for it when I talk about it too.

  2. […] How will you back out if your change doesn’t work?! Always take a backup! There should always be a back out script to go with any change script! These are things that have been drilled into me time and time again. Sometimes by management, sometimes by bitter experience. If I’m changing a piece of code I make sure that I have a copy of the old version either in some sort of versioning software or worst case in a text file in a known location (at least to the other DBAs). If I’m changing data I make sure I have a current backup (if possible) or a copy of the data I’m changing. You will probably realize that I’m talking about Operational Recovery and in fact the beginnings of this post were the inspiration for my post last week “Disaster Recovery’s big brother Operational Recovery”. […]

  3. […] Disaster Recovery’s big brother Operational Recovery […]

  4. […] point in time recovery then take the bleeping log backups! One thing I should point out here is Operational Recovery. Just because you don’t care about point in time recovery doesn’t mean that the developer who […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 163 other followers

%d bloggers like this: