Plan, practice, and then keep on practicing! It sounds so simple, but those are the keys for IT teams who have to respond to a disaster delivered by Mother Nature—whether it’s a flood, fire, earthquake, or a major storm such as a hurricane or tornado.
First, you need to develop a plan, and then you need to practice your response. Practice helps you identify potential holes in the plan so you can consider revisions. Practice also helps you act calmly, if and when a real natural disaster strikes.
Run your disaster recovery practice drills at least once per year, though twice a year or even once per quarter is better. As the technical and logistical aspects of running your IT infrastructure change, so too will your disaster recovery plan. When changes occur, consider their implications for your recovery plan. If you practice regularly, you will also have a fallback mechanism to catch anything you forgot to consider.
When documenting the plan for your on-premises IT infrastructure and the role your technical resources will play in the event of a disaster, consider your technology service providers as well. This includes cloud platforms that host parts of your IT infrastructure as well as your broadband and telecommunications providers. If your infrastructure integrates and exchanges data with business partners, they need to be considered too.
It’s critical to discuss and understand the disaster recovery plan for each of your service providers and partners, where your plans and their plans overlap, and how you depend on each other to restore services. Consider what needs to take place if a disaster hits both you and a provider or partner at the same time, as well as what happens if a catastrophe takes out just one of your facilities.
A Checklist to Formulate Your Recovery Plan
To help you formulate a recovery plan for natural disasters, a good place to start is the NIST Contingency Planning Guide. While designed for agencies that manage federal information systems, the guide provides an extensive framework that gives you a complete checklist of issues to consider.
You can check out the guide for all the details, but here’s a high-level overview of nine major components for formulating a plan to respond to natural disasters:
1. Policy Statement
To ensure everyone understands your organization’s contingency planning requirements, the plan must begin with a clearly defined policy. The policy should define your overall contingency objectives and establish the organizational framework and responsibilities for system contingency planning.
2. Business Impact Analysis
Business impact analysis helps you to characterize system components, supported business processes, and their interdependencies. You can also correlate systems with critical business processes, and based on that information, characterize the consequences of a disruption. The results of the analysis will help you determine contingency planning requirements and priorities.
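One way to make these interdependencies concrete is to record them as data and ask which business processes are affected when a given system goes down. The sketch below is only an illustration: the system names, the process names, and the two mapping tables are hypothetical and do not come from the NIST guide.

```python
# Hypothetical map: each system and the systems it depends on.
SYSTEM_DEPENDENCIES = {
    "order-processing": ["database", "payment-gateway"],
    "customer-portal": ["order-processing", "auth"],
    "auth": ["database"],
    "database": [],
    "payment-gateway": [],
}

# Hypothetical map: each business process and the systems it uses directly.
PROCESS_SYSTEMS = {
    "online sales": ["customer-portal"],
    "invoicing": ["order-processing"],
    "employee login": ["auth"],
}


def systems_affected_by(failed, dependencies):
    """Return every system that is down, directly or transitively, when `failed` is down."""
    affected = {failed}
    changed = True
    while changed:
        changed = False
        for system, deps in dependencies.items():
            if system not in affected and any(dep in affected for dep in deps):
                affected.add(system)
                changed = True
    return affected


def processes_affected_by(failed, dependencies, process_systems):
    """Return the sorted names of business processes disrupted by one system failure."""
    down = systems_affected_by(failed, dependencies)
    return sorted(
        process
        for process, systems in process_systems.items()
        if any(system in down for system in systems)
    )


if __name__ == "__main__":
    print(processes_affected_by("database", SYSTEM_DEPENDENCIES, PROCESS_SYSTEMS))
```

In this made-up example, losing the database disrupts all three processes, while losing the payment gateway leaves employee login untouched. That kind of difference is what the analysis is meant to surface.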
3. Business Processes and Recovery Criticality
Information systems often support multiple business processes, resulting in different perspectives on the importance of system services. To understand the impacts of a system outage, identify and validate processes that depend on or support the system. Then analyze the identified impacts of the processes in terms of availability, integrity, and confidentiality. To determine the recovery criticality, discuss the supported business processes with the process owners and agree on the acceptable downtime if a given process or its system data is unavailable.
4. Resource Requirements
Natural disaster recovery efforts require a thorough evaluation of the personnel resources that are necessary to resume business processes as quickly as possible. Ensure that information system resources are identified along with their primary and backup contact information. It’s also key to know when any of the required personnel are on vacation, out sick, or otherwise unavailable. For these situations, identify the backup personnel who will step in.
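A contact roster of this kind can be kept as simple structured data. The following is a minimal sketch, assuming each role has an ordered list of contacts (primary first) and a list of dates when each person is unavailable. The class, the function, the names, and the phone numbers are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Contact:
    name: str
    phone: str
    # Inclusive (start, end) date ranges when the person cannot respond.
    unavailable: list = field(default_factory=list)

    def is_available(self, day: date) -> bool:
        return not any(start <= day <= end for start, end in self.unavailable)


def responder_for(roster, day):
    """Return the first available contact in an ordered roster, or None if nobody is free."""
    for contact in roster:
        if contact.is_available(day):
            return contact
    return None


# Hypothetical roster for one role: primary first, then backups.
DATABASE_ADMINS = [
    Contact("Primary DBA", "555-0100", [(date(2024, 7, 1), date(2024, 7, 14))]),
    Contact("Backup DBA", "555-0101"),
]

if __name__ == "__main__":
    on_call = responder_for(DATABASE_ADMINS, date(2024, 7, 4))
    print(on_call.name if on_call else "No one available: escalate")
```

A roster that returns nobody for a given date is a gap worth finding during a practice run, not during a real event.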
5. System Resource Recovery Priorities
Recovery priorities can be effectively established by considering business process criticality, outage impacts, tolerable downtime, and system resources. The result is an information system recovery priority hierarchy.
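As a rough illustration of how these factors might be combined, the sketch below sorts systems into a recovery order. The 1-to-5 criticality scale, the sample systems and figures, and the choice to rank by criticality first, then tolerable downtime, then hourly outage cost are all assumptions made for this example. Your own analysis may weigh them differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    name: str
    criticality: int                 # hypothetical scale: 5 = most critical
    tolerable_downtime_hours: float  # longest outage the business can accept
    outage_cost_per_hour: float      # estimated impact of an outage


def recovery_order(systems):
    """Sort systems so the most urgent to restore comes first."""
    return sorted(
        systems,
        key=lambda s: (
            -s.criticality,              # higher criticality first
            s.tolerable_downtime_hours,  # then the shortest tolerable downtime
            -s.outage_cost_per_hour,     # then the costliest outage
        ),
    )


# Hypothetical sample data.
SYSTEMS = [
    SystemProfile("email", 3, 24.0, 500.0),
    SystemProfile("order-processing", 5, 2.0, 10000.0),
    SystemProfile("payroll", 4, 72.0, 200.0),
    SystemProfile("customer-portal", 5, 4.0, 4000.0),
]

if __name__ == "__main__":
    for rank, system in enumerate(recovery_order(SYSTEMS), start=1):
        print(rank, system.name)
```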
6. Backup and Recovery
Backup and recovery strategies should address disruption impacts and allowable downtimes. A wide variety of recovery approaches may be considered, with the appropriate choice depending upon the incident, the type of system, impact level, and the system’s operational requirements. Specific recovery methods should be considered and may include contracts with offsite storage and equipment vendors. In addition, technologies such as redundant storage arrays, automatic failover, and mirrored systems should be considered.
7. Equipment Replacement
If an information system is destroyed or unavailable, replacement hardware and software will need to be activated or procured quickly. Your organization can prepare for equipment replacement by establishing emergency-replacement contracts with vendors, purchasing and storing replacement gear at an offsite facility, or devising a plan to utilize hardware and software located at other office locations.
8. Roles and Responsibilities
This step involves designating appropriate teams to implement the recovery strategy. Each team should be trained and ready to respond in the event of a natural disaster requiring plan activation. Recovery personnel should be assigned to specific teams that will respond to a disaster event, handle the recovery, and return systems to normal operations. Recovery team members need to clearly understand their team’s recovery goals, the individual procedures the team will execute, and how interdependencies between recovery teams may affect overall recovery strategies.
9. Training and Testing Exercises
Your recovery plan should be maintained in a state of readiness, which includes training personnel to fulfill their roles and responsibilities within the plan, validating plan content, and testing systems to ensure their operability in the environment. The effectiveness of the information system controls should also be assessed, and testing events should be conducted periodically.
Consider the Cost of Doing Nothing
As you work your way through the steps above, check, as the NIST guide recommends, that your strategy can be implemented effectively with the available financial resources. The cost of each type of alternate site, equipment replacement, and storage option under consideration should be weighed against budget limitations.
Also determine known recovery planning expenses, such as alternate site contract fees, and those that are less obvious, such as the cost of implementing a recovery awareness program and contractor support. The budget must also be sufficient to encompass software, hardware, travel, shipping, testing, training programs, labor hours, and contracted services.
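The weighing itself is simple arithmetic: total the line items for each option and compare the result with the budget. Here is a minimal sketch, in which the option names, the cost categories, and every dollar figure are invented purely for illustration.

```python
# Hypothetical annual cost estimates per recovery option, in dollars.
OPTIONS = {
    "cold-site": {
        "alternate site contract": 20_000,
        "hardware and software": 15_000,
        "testing and training": 8_000,
        "labor and contractors": 12_000,
    },
    "warm-site": {
        "alternate site contract": 60_000,
        "hardware and software": 25_000,
        "testing and training": 10_000,
        "labor and contractors": 15_000,
    },
    "hot-site": {
        "alternate site contract": 150_000,
        "hardware and software": 40_000,
        "testing and training": 12_000,
        "labor and contractors": 20_000,
    },
}


def total_cost(items):
    """Sum every line item for one option."""
    return sum(items.values())


def affordable_options(options, budget):
    """Return the names of options that fit within the budget, cheapest first."""
    totals = {name: total_cost(items) for name, items in options.items()}
    within_budget = [name for name, total in totals.items() if total <= budget]
    return sorted(within_budget, key=totals.get)


if __name__ == "__main__":
    print(affordable_options(OPTIONS, budget=120_000))
```

Listing the less obvious items, such as training and contractor support, as explicit line items makes them harder to overlook when totals are compared.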
Yes, creating a plan to recover from a natural disaster will require a significant investment of dollars and effort. And it can be onerous to keep planning and practicing for a natural disaster that may never come.
But failure to develop a recovery plan can be a disaster in itself. If your business is not able to recover from a natural disaster quickly, it could mean the end of your business. That’s a far bigger cost, and one that no one can afford!