Testing, Testing, 1, 2, 3...

Disaster recovery plans should be part of any testing plan.

I do my research and testing within the confines of a test network. Sometimes it's a single Windows forest with a single domain and sometimes it's a multi-domain, multi-forest, multi-OS monster. Mostly it's not meant to exist for more than a few months, so I don't spend a lot of time worrying about keeping it viable in the face of natural disaster or even simple hard drive failure. If something in the test Windows network dies, it's not going to put me out of business. Can you say the same?

Do you have a disaster recovery plan for your Windows network? I'm not talking about simple data or even system backup here. I'm talking about your plan for completely recovering from the loss of your Active Directory infrastructure. Could you withstand the loss of one of your operations masters? How about the first domain controller installed in a domain? How long would it take to rebuild the DC? Would that time span be short enough to spare you a major loss of revenue?

There are two themes here: First, you should have a comprehensive plan for recovering from the loss of any part of, or all of, your AD infrastructure. Second, you should be able to do so without causing your business financial ruin.

First things first. What should you be doing today that might save your hide tomorrow? Here are three important steps: if you aren't doing them, you should be asking yourself why.

  1. Back up every DC, but make doubly sure you back up every operations master, global catalog server and the first DC in the domain. You can only restore a DC from its own backup. You can create new DCs in the domain to take the place of those that serve no special role, and you can recover operations masters and rebuild a GC, but can you do so fast enough?
  2. Back up frequently. A backup older than the tombstone lifetime set in AD is not a backup. A tombstone represents a deleted item. It exists so that the deletion can be replicated throughout AD, making sure that each DC eventually deletes the item. If your backup is older than the tombstone lifetime, you won't be able to use it to restore AD to a healthy, pre-disaster state, because the restored DC could bring back items whose tombstones the other DCs have already discarded.
  3. Practice recovery of each type of DC loss. If you don't have a test network to attempt this on, get one. It's a small investment that can mean everything.
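
The backup-age rule in step 2 is easy to check automatically. What follows is only a minimal sketch, not a finished tool: the function names, the DC names and the dates are all hypothetical, and the 60-day figure is an assumption used for illustration. Read the actual tombstone lifetime from your own forest and pass it in.

```python
from datetime import datetime, timedelta


def backup_is_restorable(backup_taken, now, tombstone_lifetime_days):
    """Return True if the backup is younger than the tombstone lifetime."""
    return now - backup_taken < timedelta(days=tombstone_lifetime_days)


def stale_backups(last_backups, now, tombstone_lifetime_days):
    """Return the sorted names of DCs whose newest backup is too old to use."""
    return sorted(
        name
        for name, taken in last_backups.items()
        if not backup_is_restorable(taken, now, tombstone_lifetime_days)
    )


if __name__ == "__main__":
    # Hypothetical inventory: DC name -> time of its most recent backup.
    now = datetime(2005, 6, 1)
    last_backups = {
        "DC1": datetime(2005, 5, 25),
        "DC2": datetime(2005, 1, 10),
    }
    # 60 days is an assumed value; substitute your forest's real setting.
    print(stale_backups(last_backups, now, tombstone_lifetime_days=60))
    # prints ['DC2']
```

A backup that is exactly as old as the tombstone lifetime is treated as stale here. That is a deliberately cautious choice in the sketch, not a statement about how AD itself draws the line.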

There's a lot more to disaster recovery—and, more importantly, maintaining business continuity. If you've got some statistics on the time it took you to recover from an AD failure, let me know.

About the Author

Roberta Bragg, MCSE: Security, CISSP, Security+, and Microsoft MVP, is a Redmond contributing editor and the owner of Have Computer Will Travel Inc., an independent firm specializing in information security and operating systems. She's series editor for Osborne/McGraw-Hill's Hardening series, books that instruct you on how to secure your networks before you are hacked, and author of the first book in the series, Hardening Windows Systems.
