The title pretty much sums up planning a disaster recovery scenario. As I mentioned in a previous post, my colleagues and I are planning a DR test. We basically have to document every restore procedure and put together the materials required for a successful test. Saying that I have been stressed out lately is an understatement. Gearing up for, and actually performing, the steps required after a fire, a bomb, etc. puts anxious thoughts in my mind. Everything has to come together within 48 hours or the test is considered a failure. I have been working on three key areas:
Backups

A DR test is really a kick in the butt for backup reliability. When this project became a priority our backups were performing at an acceptable level for user error restores or the occasional server crash, but not even close for DR recoverability.
I have been tasked with getting the backups up to par so we can restore completely during the test. Now, to do this I needed to fix the small errors that cause backups to fail and tweak the windows to allow enough time for them to complete, but it really came down to knowing the performance impact of the processes that are run on each system and the backups themselves.
My company is one that actually has a higher overall resource need during non-work hours. We run so many automated processes at night that working backups around them becomes an art. In this case, a lot of communication with the process owners and some sort of schedule agreement is key. I met with the appropriate people and created a makeshift Gantt chart to help with the visualization. Our backups are now performing with an almost 100% completion rate. There are still times when I need to manually re-run a job due to a scheduling conflict or system hiccup.
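For anyone who would rather script the conflict check than eyeball a chart, here is a minimal Python sketch of the idea. It is not what I actually used (mine was a hand-drawn chart), and every job name and time in it is made up for illustration. It simply flags any backup window that overlaps a nightly process.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Window(NamedTuple):
    """A named time window, e.g. a backup job or a nightly process."""
    name: str
    start: datetime
    end: datetime


def overlaps(a: Window, b: Window) -> bool:
    # Two windows overlap if each starts before the other ends.
    # Windows that merely touch (one ends as the other starts) do not count.
    return a.start < b.end and b.start < a.end


def find_conflicts(backups: list[Window], jobs: list[Window]) -> list[tuple[str, str]]:
    """Return (backup name, job name) pairs whose windows overlap."""
    return [(b.name, j.name) for b in backups for j in jobs if overlaps(b, j)]


# Hypothetical schedule: offsets are hours after 18:00 on an arbitrary evening.
evening = datetime(2024, 1, 1, 18, 0)


def at(hours: float) -> datetime:
    return evening + timedelta(hours=hours)


jobs = [
    Window("nightly-report", at(2), at(5)),     # 20:00 to 23:00
    Window("db-reindex", at(7), at(9)),         # 01:00 to 03:00
]
backups = [
    Window("fileserver-backup", at(4), at(6.5)),  # 22:00 to 00:30
    Window("mail-backup", at(9), at(11)),         # 03:00 to 05:00
]

conflicts = find_conflicts(backups, jobs)
for backup_name, job_name in conflicts:
    print(f"{backup_name} overlaps {job_name}")
```

With this made-up schedule, only the file server backup gets flagged, because it starts while the nightly report is still running. The mail backup starts exactly when the reindex ends, so it is left alone.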
Workstation Disaster Recovery Image

I debated with myself for quite a while about how I was going to accomplish this particular task. We have so few computers at our company (<100) that I had not yet created a workstation image. I needed an image that would work on all the different types of HALs, have all of the required software, and configure itself based on the computer’s hardware with little to no user interaction.
I ended up choosing Windows 7 simply because it was the easiest OS with which to achieve hardware independence; however, software compatibility was a concern. I played tug of war with the operating system for several days, but I was able to get all of our required software functional on the image. Surprisingly, all it took was running the installer as Administrator (even though I was already an Administrator), running it in Windows XP compatibility mode, or both.
The last bit, auto-configuring the OS, was probably the most fun. Thanks to a wonderful tool, the Windows Automated Installation Kit (AIK), I was able to create an answer file that provides information such as the product key, registration information and the domain to join. The only thing the user has to do is log in and activate Windows once the image is deployed.
Backup Server and Virtual Machine Restoration

Who/what backs up the backup server? This is a valid question that is easy to overlook, especially for those who are doing a DR test for the first time. I have not seen all backup software under the sun, but I believe the backup server can be backed up just like the client servers. Local drives and system state can be snapshotted just like everything else.
In my case, however, things have to be done a little differently. Symantec’s licensing model segregates every single little feature in NetBackup, and we just so happen to not have the Bare Metal Restore license installed. Instead of trying an OS build from scratch and then restoring the system state over the top, I elected to do everything from scratch. Everything except the catalog!
NetBackup stores all information about backups and policies in its catalog. All network administrators should be taking a catalog backup along with all other backups. After I get the new backup server installed, I will simply restore the catalog and be ready to begin restoring my virtual machines.
Since we are running a VMware environment, we take advantage of VMware Consolidated Backup (VCB) to do our backups and restores of virtual machines. VCB integrates pretty nicely into NetBackup and allows you to do the entire VM restore procedure right from the Restore utility. NetBackup with VCB simply restores the virtual machine VMDK files to a mount disk and then fires up VMware Converter to move the VM into the Virtual Infrastructure environment.
The Weekend From Hell

This weekend is our disaster recovery test. I lovingly refer to it as the weekend from hell, of course! My supervisor and I will be working from 4 PM Saturday until 4 PM Monday to bring up our entire environment. I will update next week with the results!
I feel no pressure at all