Creating a Business Continuity Plan
Tips to Create a Business Continuity Plan
What if a major component of your company's computer systems failed today? Do you have a plan in place to allow business to continue, or would you be out of luck until it gets repaired? A Business Continuity/Disaster Recovery (BCDR) plan is an important part of any IT security management program because it allows operations to continue in the event of a failure, whether it be a failure of an individual component, a whole system, or your entire data center.
Create a BCDR Committee and Assign Responsibilities
It's very difficult for any one person to create an effective business continuity plan on their own, so form a committee in which every area of the business is represented, from IT to HR to Operations and upper management. Assign responsibility for each business unit to its committee representative and task them with creating plans and processes for their part of the organization.
Appoint a leader to oversee the plan and guide the team; this is usually the security administrator or a member of the executive team. Schedule regular status meetings for the committee to discuss each member's individual progress.
Identify Failure Points in the Organization
Once a committee is formed, a key responsibility of each member is to identify Single Points of Failure (SPOFs) in their area of the business.
A SPOF is an individual system, component, or process that, if it fails, either stops operations in that department or halts production for the entire company. For example, the IT representative may determine that the company's web server could crash and result in lost revenue as all e-commerce transactions fail to process.
An Accounting member may find that their accounting application could fail and cause all payroll and AP/AR functions to cease, resulting in delays of payments sent or received. An Operations representative might discover that a failure of the air conditioning system in the data center room would cause all computer equipment to overheat and shut down, bringing business to a grinding halt. This step usually takes the longest as each delegate has to comb through their entire area of responsibility and examine the smallest details.
If possible, have committee members examine areas outside their normal responsibility to review findings from a different perspective and uncover items that may have been missed. Ensure that members are documenting all of their findings as they go along.
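One way to keep those findings consistent across departments is a shared inventory that each member fills in. The following is a minimal Python sketch, not a prescribed format: the `Component` fields, the `find_spofs` helper, and the "office printer" entry are all hypothetical names chosen for illustration, and it treats anything with a single instance as a SPOF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One documented system, component, or process (hypothetical fields)."""
    name: str
    owner: str       # business unit responsible, e.g. "IT"
    instances: int   # how many interchangeable copies exist today
    impact: str      # what stops if every copy fails


def find_spofs(inventory):
    """Return the components that have no redundancy (one instance or fewer)."""
    return [c for c in inventory if c.instances <= 1]


# Example entries based on the scenarios described above; the printer row
# is an invented non-SPOF included only for contrast.
inventory = [
    Component("web server", "IT", 1, "e-commerce transactions fail"),
    Component("accounting application", "Accounting", 1, "payroll and AP/AR stop"),
    Component("data center air conditioning", "Operations", 1, "equipment overheats"),
    Component("office printer", "Operations", 3, "printing is delayed"),
]

for c in find_spofs(inventory):
    print(f"{c.owner}: {c.name} -> {c.impact}")
```

A spreadsheet with the same four columns would serve equally well; the point is that every finding records an owner, a redundancy count, and a business impact.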
Create a Contingency Plan for Each Failure Point
Now that every possible failure point has been identified, reviewed, and documented, it's time to create a backup plan for when those items fail - after all, it's not a matter of if they will fail, but when. Expect contingency plans to cost money, especially for IT systems. While it may seem like an unnecessary expense, having backup equipment is critical to ensuring ongoing operations of the business in the event of a system failure.
For example, if your website is running entirely off of one individual web server, you should invest in a second server that could be brought online quickly if the primary one goes down. (Ideally, both servers would be active at the same time to distribute requests evenly across them, which reduces the likelihood of a failure in the first place.) IT systems are usually easier to plan for since most of the time a BCDR plan just involves having spare equipment to swap in. Operating procedures can often be temporarily converted to paper records if a computer system goes down, allowing business to proceed in the meantime.
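The idea of spreading requests across two web servers and carrying on when one fails can be sketched in a few lines of Python. This is an illustrative round-robin model only, not a production load balancer: the `ServerPool` class and the server names `web-primary` and `web-secondary` are assumptions made for the example, and a real deployment would normally rely on a dedicated load balancer with its own health checks.

```python
from itertools import cycle


class ServerPool:
    """Round-robin over servers, skipping any that are marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._rotation = cycle(self.servers)

    def mark_down(self, server):
        # Record a failure so the server stops receiving requests.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Return a repaired server to the rotation.
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call before giving up.
        for _ in range(len(self.servers)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


pool = ServerPool(["web-primary", "web-secondary"])
first_two = [pool.next_server(), pool.next_server()]      # requests alternate
pool.mark_down("web-primary")                             # simulated failure
after_failure = [pool.next_server(), pool.next_server()]  # secondary takes all
print(first_two, after_failure)
```

With both servers healthy the requests alternate; once the primary is marked down, every request goes to the secondary instead of failing.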
Facilities can be more difficult to plan for since there is usually not enough budget or justification to invest in a separate building or warehouse that just sits there until it's needed. If the company cannot afford to purchase another building, you may choose to simply accept the risk of a facility going down for a period of time.
Test the BCDR Plan
After identifying single points of failure and creating contingency plans for each one, it's time to test those plans. Realize that it's one thing to have some paperwork stating what your plan is, but it's another thing to have actually tested your plan and made sure that it works. Set aside some time outside of normal operations and simulate a failure to see whether your process goes according to plan. For IT systems, simulate an unexpected shutdown and see if you're able to bring the spare equipment online and have it take over - and if it worked, record how long services were completely down before it came online.
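Recording how long services were down is easier with a small timing helper started at the moment the failure is simulated. The sketch below is one possible approach under stated assumptions: `measure_downtime`, its parameters, and the `FakeClock` used in the demonstration are hypothetical names, and `is_service_up` stands in for whatever health check your environment provides (for example, a request to the site's home page).

```python
import time


def measure_downtime(is_service_up, poll_interval=1.0, timeout=600.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until the service responds again and return the seconds elapsed.

    Call this immediately after triggering the simulated failure.
    Returns None if the service is still down when the timeout expires.
    """
    started = clock()
    while clock() - started < timeout:
        if is_service_up():
            return clock() - started
        sleep(poll_interval)
    return None


class FakeClock:
    """Stand-in clock so the demonstration runs instantly (illustration only)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Demonstration: the spare server answers on the third health check.
fake = FakeClock()
checks = iter([False, False, True])
downtime = measure_downtime(lambda: next(checks), poll_interval=5.0,
                            clock=fake, sleep=fake.sleep)
print(f"Service was down for {downtime} seconds")
```

In a real test, leave `clock` and `sleep` at their defaults and pass your own health-check function. The result is only as precise as the polling interval, so choose an interval that is small relative to the recovery time you are aiming for.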