BCDR – Business Continuity and Disaster Recovery – Part 3: Your Disaster Recovery Test Results Checklist
We recently wrote about disaster recovery testing and provided three tips for your next test. Before you read any further, we strongly recommend you check out that article to determine how prepared you are for your next round of testing.
Even if you say, “We’re ready for anything,” that may only be half-true. Why? Because many companies put significant time and resources into preparing for testing but come up short on generating disaster recovery test results. And without the ability to gather meaningful results, you’ve really only done half the job.
Seven Questions to Ask After Running a DR Test
Ensuring that you can generate useful disaster recovery test results is just as important as planning your DR test in the first place—maybe even more so. We’re not saying it’s a fun task. In fact, it’s a little bit like eating your vegetables—nobody really enjoys it, but we all know we’ll be better off in the long run if we choke them down.
We’re here to make it as easy as possible. Consider these seven questions to be your checklist for assessing the effectiveness of your next DR test.
- Were your goals clearly defined when you started the tests? We’ve all been there: you thought you had clear goals heading into a test, but once you got going, you encountered confusion about what to test and how to distinguish success from failure. It’s possible to get lured in by goals that look good on paper but don’t really address the most important priorities for your DR testing.
- Did your entire team understand their roles and expectations? In other words, did any participants encounter knowledge gaps along the way?
Look for something deeper than a surface-level answer here. Sure, maybe the person in charge of restoring the system felt fully qualified to do so. But they also had to work with two people from the network team, and neither of these employees understood the nature of the DR testing or their role in making it happen. A simple miscommunication or misunderstanding can cause critical delays that force you to miss your recovery time objective (RTO). Another common problem is that while everyone is capable of performing the duties assigned to them, some team members are unclear about when to complete their tasks. In this scenario, your business can lose precious minutes as a team member waits for “someone else” to do what’s actually their own job.
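- Did you meet your RTOs and RPOs? Your RTO (how quickly a system must be back in service) and your recovery point objective, or RPO (how much recent data, measured in time, you can afford to lose), are by far the most important disaster recovery test results. It’s fair to say that as long as your DR program meets your RTO and RPO, it’s a success.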
But this is another spot where you must dig deeper than a surface-level answer. It’s possible to meet RTO and RPO by doing something off-script. If your team used a funky workaround to deliver on time, good for them—but can they repeat this performance next time?
Consider also whether any failures were partial or complete. If some element of the plan failed completely, it’s good to know that you’ll need to replace it as you update your DR plan. If an element failed only partially, try to determine how these problems forced your team to adjust the plan. By taking this information into consideration, you’ll be able to build a much stronger plan for next time.
- What unexpected things happened? It wouldn’t be a DR test without something weird happening. But don’t use that as an excuse to settle for less-than-optimal disaster recovery test results. Some unexpected things happen due to our own oversight. Perhaps your DR plan was missing a key element that made you kick yourself when you discovered it during testing. Nobody’s perfect, but you could have prevented that, right?
Maybe you neglected to read the technical specifications for your equipment thoroughly. Again, that’s only human, but it shouldn’t happen more than once.
An even more common pitfall arises when companies plan to restore to the same hardware they’ve been using, only to discover that the vendor no longer makes that exact model. We at Storix run into this all the time, which is why we designed our Adaptable System Recovery to let companies restore their backups onto practically any hardware.
Other unexpected occurrences are beyond our control. A normally reliable piece of equipment fails at the worst possible moment. Someone makes a mistake they’ve never made before. These are occurrences to document so that you can develop a reasonable Plan B against them.
- Did you document carefully? OK, so you ran your DR test and evaluated the results. Did you keep detailed records and produce a report? After each DR test, you should generate a report that you store away along with the results of the test. If you don’t keep records of these things, it’s like going to school with no report cards. How will you measure your progress?
- Did you schedule a time to correct problems and make adjustments? Don’t stop at documentation. Before you lose your momentum, take those disaster recovery test results and schedule a time to discuss what changes you need to make to improve your performance next time. It’s the easiest thing in the world to generate a long list of results and then file them away. The only real improvement comes when you make changes to your DR plan based on those results.
- Did you train your team on the changes and update your DR plan procedures? We can’t stress it enough: communication across the DR team is paramount. Even the smartest changes to your DR plan will be useless if you haven’t informed your entire DR team and provided them with the training they need to do things the new way. Make sure you put everything in writing, too. Otherwise, team members will simply revert to how they did things—less than effectively—in your last test.
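If it helps to make the RTO/RPO and documentation questions above concrete, here is a minimal Python sketch of how a team might record one system's test outcome and turn it into a line for the test report. Everything in it is an assumption for illustration: the `SystemResult` class, its field names, the `summarize` helper, the "orders-db" system, and the timestamps are hypothetical and not part of any Storix product. It flags a passing result that relied on a workaround, since a pass that can't be repeated deserves a closer look.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SystemResult:
    """One system's outcome in a DR test (all names here are illustrative)."""
    name: str
    rto: timedelta              # agreed recovery time objective
    rpo: timedelta              # agreed recovery point objective
    outage_start: datetime      # when the simulated disaster began
    restored_at: datetime       # when the system was usable again
    last_good_backup: datetime  # timestamp of the backup that was restored
    off_script: bool = False    # True if the team needed a workaround

    @property
    def recovery_time(self) -> timedelta:
        # Elapsed time from the start of the outage to restoration.
        return self.restored_at - self.outage_start

    @property
    def data_loss_window(self) -> timedelta:
        # Work done after the last good backup is what would be lost.
        return self.outage_start - self.last_good_backup

    @property
    def met_rto(self) -> bool:
        return self.recovery_time <= self.rto

    @property
    def met_rpo(self) -> bool:
        return self.data_loss_window <= self.rpo


def summarize(result: SystemResult) -> str:
    """Build a one-line report entry for a single system."""
    status = "PASS" if result.met_rto and result.met_rpo else "FAIL"
    if status == "PASS" and result.off_script:
        status = "PASS (workaround used, verify repeatability)"
    return (
        f"{result.name}: {status} | "
        f"recovery {result.recovery_time} vs RTO {result.rto} | "
        f"data loss {result.data_loss_window} vs RPO {result.rpo}"
    )


# Example with made-up numbers: a 4-hour RTO and a 1-hour RPO.
start = datetime(2024, 1, 15, 9, 0)
db = SystemResult(
    name="orders-db",
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
    outage_start=start,
    restored_at=start + timedelta(hours=3, minutes=30),
    last_good_backup=start - timedelta(minutes=45),
    off_script=True,
)
print(summarize(db))
```

Keeping one such record per system and per test gives you the "report card" history described above, so you can compare recovery times from one test to the next instead of relying on memory.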
Get Help Gathering Disaster Recovery Test Results
Not sure you’re generating the right kind of data from your DR tests? Or want some guidance in creating and optimizing your DR test plan? We’re here to help. When you’re ready to talk, give us a call at (877) 786-7491.