Business Continuity Planning (BCP) is the deliberate assessment of the risks to the organisation’s people and information – and then taking steps to mitigate those risks should they occur.
Business leaders may be tempted to think ‘it would never happen to us’ or allow survivorship bias to cloud their risk appetite. But, without a Business Continuity Plan, you will have to ‘make it up as you go along’ if the worst should happen. Recovery will take longer because you will have to invent the process and then follow it in the midst of a disaster. According to FEMA, 90% of businesses fail if they don’t re-open quickly enough following a disaster. Time is of the essence.
Information Security is concerned with the preservation of the Confidentiality, Integrity and Availability of the business’s data. Without a BCP the Availability is at risk, and probably the Integrity as well. For this reason, Information Security Managers often have a leading role in developing the continuity plans for many organisations.
This week, four of the data centres of Europe’s largest hosting provider went offline following a massive fire in one of their computer rooms. Their CEO Tweeted to his clients: It’s time to invoke your continuity plans. Disasters, like global pandemics, really do happen to people like you.
Disaster Recovery Planning is closely related to BCP, and the line between the two varies from organisation to organisation. However, in principle BCP is the high-level strategic approach and Disaster Recovery is the detailed technical implementation and procedures to follow on the day of a disaster.
In this article we will summarise how to develop a Business Continuity Plan.
Scope & Planning
Just like any project in your business, a Business Continuity project needs a clear scope and sponsorship from a senior leader who will agree to allocate the resources and budget. The most significant cost of Business Continuity Planning is almost always time from the people involved – and without senior management sponsorship the project is unlikely to succeed.
Understand the Business: Business Organisation Analysis
The first step in the Scoping phase is to review the Business Organisation in order to identify all the departments, teams and people who should be involved in the creation of the Continuity Plan. This should include:
- Operations teams who run the core business services
- Support teams such as finance, IT and facilities
- Security teams who guard the premises and may be the first on site at a physical disaster
- Senior Managers and any key people essential for the survival of the business
The key output from this exercise is a list of all the teams or people who really need to be involved in the creation of the Business Continuity Plan.
Who’s on the bus: BCP Team Selection
Buy-in from the teams that will have to create the continuity plan, and then follow it, is essential for the success of the project. Their local expert knowledge is necessary to ensure the plan is correct.
When picking the core team members, also include:
- Representatives for each operational team who provide the core business services
- Technical subject matter experts from IT and Cyber Security teams
- Representatives from other business units identified and essential during the initial Organisational Analysis exercise
- Compliance and legal teams who can ensure any regulatory obligations can be met through the continuity plan
- Public Relations / Marketing teams who will need to communicate to the media, customers and suppliers during a disaster
- Senior Management who can set vision, allocate resources and set priorities
How Much? Planning the Resource Requirements
There are three phases in the creation of the Business Continuity Plan and the resource costs for each need to be estimated and approved:
Development
This is the obvious first focus, as it covers the effort to create and document the Continuity Plan. This will predominantly consume time from the BCP team members and other experts in the business who need to contribute.
Testing, Training & Tweaking
Once the BCP has been created it needs to be tested, so you can be confident that it works. Staff also need training, so you can be confident that the plan will be followed. These two activities will, no doubt, identify parts of the plan that need adjustment and tweaking.
Implementation
When disaster strikes and the BCP team decides to invoke the Continuity Plan, it is likely to consume significant time and resources – possibly the focus of the whole organisation while it is being implemented.
I am the Law: Legal & Regulatory needs
Compliance and legal goalposts are shifting all the time. It is likely that the efforts to develop the BCP will help meet some compliance obligations with little or no incremental cost. Legal obligations to clients through contracts and Service Level Agreements also need to be considered, as they may impose targets for the resumption of service in the event of a disaster. It is important that the legal and compliance teams are involved throughout the development of the BCP.
Business Impact Analysis
The BIA is the heart of the project and where most of the hard work takes place. It starts with:
First things first: Identify Priorities
There are certain activities that are essential to the survival of the business, and others that could be turned off for three months and few people would even notice. Step one of the BIA is to identify all the priority activities in the business – and this is why it is helpful to have representatives from across the business in the team.
All those priority activities should then be placed into a prioritised list so their relative importance can be seen.
Now for each activity, identify the assets that are needed to deliver that activity – this will include IT systems, data, people. Those assets are the focus of the continuity plan.
Assign each asset a monetary value – the asset value. This value is not the revenue the asset generates, but the cost to replace it.
Next, for each asset, define the maximum time you can survive without it – known as the MTD (Maximum Tolerable Downtime). This may be several hours, days or weeks depending on the nature of the asset and the business activities it contributes to.
Finally, for each business function, decide the Recovery Time Objective (RTO) – that is, the target time to recover the function by following the continuity plan. The choice of RTO will affect the cost of the solution: the shorter the recovery time, the more money usually needs to be spent on the recovery solution. For example, the MTD of the business’s accounting system may be three business days, after which cashflow is affected because invoices are not being sent out for payment. The Business Continuity Team may agree an RTO of two business days, meaning the plan will be designed to get the finance team back up and running within two business days. The RTO should always be less than the MTD.
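The rule that the RTO should be less than the MTD is easy to check mechanically once the figures are written down. The sketch below is only an illustration: the `RecoveryTarget` class and its field names are assumptions made for this example, and the second entry is a hypothetical function invented to show a failing case.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTarget:
    """Recovery figures for one business function (illustrative structure)."""
    name: str
    mtd_days: float  # Maximum Tolerable Downtime, in business days
    rto_days: float  # Recovery Time Objective, in business days

    def is_consistent(self) -> bool:
        # The plan only makes sense if recovery is targeted inside the MTD.
        return self.rto_days < self.mtd_days


targets = [
    # Figures from the accounting example in the text.
    RecoveryTarget("Accounting system", mtd_days=3, rto_days=2),
    # Hypothetical entry, included only to show what a failure looks like.
    RecoveryTarget("Hypothetical customer portal", mtd_days=1, rto_days=2),
]

for target in targets:
    status = "OK" if target.is_consistent() else "RTO is not inside the MTD"
    print(f"{target.name}: MTD {target.mtd_days}d, RTO {target.rto_days}d -> {status}")
```

Any function that fails the check needs either a faster (and usually more expensive) recovery solution or a fresh look at whether its MTD was set realistically.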
What could possibly go Wrong? Risk Identification
Identifying the different risks to the organisation’s assets is the next step in the Business Impact Analysis. For each asset, brainstorm the risks that could impact it.
This will include both man-made events:
- Service provider outages (Cloud providers, SaaS systems etc)
- Key supplier failures
- Transport failures
- Building problems
- Power outages
- Fires
- Civil unrest, terrorism
And Natural events:
- Storms, blizzards
- Pandemic, health emergency
- Flood
- Lightning strikes
How often could it happen? Likelihood Assessment
The number of times any given risk is expected to happen in a year (known as the Annualised Rate of Occurrence, or ARO) is now needed for each risk. For some risks, like the chance of a flood, there may be government data available for the land where the office is built. For other risks the members of the BCP team may have to make a professional judgement based on their own experience or company history. An ARO of 1 means the risk is expected to happen once a year; 0.1 means it is expected to happen once every ten years.
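Turning a statement like "once every ten years" into an ARO is simple arithmetic, shown in the sketch below. Both helper functions are hypothetical names invented for this example, and the incident history in the last line is made-up data used only to show the calculation.

```python
def aro_from_interval(years_between_events: float) -> float:
    """ARO for a risk expected once every `years_between_events` years."""
    if years_between_events <= 0:
        raise ValueError("years_between_events must be positive")
    return 1.0 / years_between_events


def aro_from_history(event_count: int, years_observed: float) -> float:
    """ARO estimated from how often the event occurred in company records."""
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    return event_count / years_observed


print(aro_from_interval(1))     # expected once a year
print(aro_from_interval(10))    # expected once every ten years
print(aro_from_history(3, 12))  # hypothetical: 3 incidents in 12 years of records
```

An estimate based on a short company history is rough at best, so treat it as a starting point for the team's professional judgement rather than a precise figure.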
So what? Impact Assessment
The Impact Assessment is the heart of the Business Impact Analysis and it produces helpful quantitative figures which can inform business investment decisions.
There are two formulae that drive the Impact Assessment. The first is the expected loss from a single occurrence of a risk. This Single Loss Expectancy (SLE) is the product of the asset value and the Exposure Factor. The Exposure Factor is the percentage of the asset value you expect to lose as a result of the risk happening. Not every risk would destroy 100% of the value of an asset. For example, a flood may affect 30% of a building’s value, so the Single Loss Expectancy of a flood for a building worth £100k is 30% of £100k, i.e. £30k.
The second key formula uses the Single Loss Expectancy and the Annualised Rate of Occurrence to produce a final figure: the Annualised Loss Expectancy:
ALE = SLE * ARO
If the annualised rate of occurrence for a flood of the building is 0.1 (i.e. once every ten years) then the Annualised Loss Expectancy of a flood = £30k * 0.1 = £3,000.
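The flood example can be reproduced in a few lines of code, which is handy once there are dozens of asset and risk pairs to work through. This is a minimal sketch: the function and variable names are assumptions made for illustration, while the figures are the ones used in the example above.

```python
def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """SLE = asset value * exposure factor (exposure factor as a fraction, 0 to 1)."""
    if not 0 <= exposure_factor <= 1:
        raise ValueError("exposure_factor must be between 0 and 1")
    return asset_value * exposure_factor


def annualised_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE * ARO."""
    return sle * aro


building_value = 100_000  # replacement cost of the building, in pounds
flood_exposure = 0.30     # a flood is expected to destroy 30% of that value
flood_aro = 0.1           # a flood is expected once every ten years

flood_sle = single_loss_expectancy(building_value, flood_exposure)
flood_ale = annualised_loss_expectancy(flood_sle, flood_aro)

print(f"SLE: £{flood_sle:,.0f}")  # SLE: £30,000
print(f"ALE: £{flood_ale:,.0f}")  # ALE: £3,000
```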
This is helpful because it provides a means of measuring the economic sense of mitigating the risk. If installing flood defences would cost £15,000 per year then it does not make sense to do it as the annual risk to the business is only £3,000. If the cost of the control is higher than the cost of the risk then it probably is not sensible to pay for the control.
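The comparison between the cost of a control and the cost of the risk can be sketched as a single function. Note the assumptions: the function name is hypothetical, and the optional `ale_after` parameter (the ALE that remains once the control is in place) is an addition for illustration. The flood defence example treats the control as removing the risk entirely, which corresponds to the default of zero.

```python
def control_is_worthwhile(
    ale_before: float,
    annual_control_cost: float,
    ale_after: float = 0.0,
) -> bool:
    """Return True if the control costs less per year than the loss it avoids.

    `ale_after` is an illustrative extension: the default of 0.0 assumes the
    control eliminates the risk completely, as in the flood defence example.
    """
    annual_saving = ale_before - ale_after
    return annual_control_cost < annual_saving


# Flood defences at £15,000 per year against an ALE of £3,000.
print(control_is_worthwhile(ale_before=3_000, annual_control_cost=15_000))

# Hypothetical cheaper control at £1,000 per year against the same ALE.
print(control_is_worthwhile(ale_before=3_000, annual_control_cost=1_000))
```

The first call returns `False` and the second `True`. This covers only the quantitative side of the decision.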
However, there may be qualitative factors that need considering as well, and here the senior management members of the BCP team will weigh in. If, for example, the business provides flood defences to its customers, the management may decide the cost to reputation in the event of a flood in the company headquarters means it makes sense to install the flood defences anyway.
Let’s do this: Resource Prioritisation
The BIA phase ends with the process of allocating resources to a prioritised list of risks. It makes sense to address the biggest risk first, and the smallest risk last. So, the Annualised Loss Expectancy (ALE) previously calculated for each risk provides a straightforward means to prioritise the risks that need addressing – from the highest ALE first down to the lowest ALE last.
But before you dive right in, pause to consider any qualitative considerations – such as the reputational risk of a flood discussed above, and use these to adjust the prioritised list so it truly reflects what is important to the business.
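One way to picture this two-step ordering is to sort by ALE and let a management flag move a risk to the top. The sketch below is an assumption about how you might model it, not a prescribed method: the `Risk` class, the `escalated` flag and the two risks marked hypothetical (with their ALE figures) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    ale: float               # Annualised Loss Expectancy, in pounds
    escalated: bool = False  # hypothetical flag set for qualitative reasons


def prioritise(risks: list) -> list:
    """Escalated risks first, then highest ALE down to lowest ALE."""
    return sorted(risks, key=lambda r: (not r.escalated, -r.ale))


risks = [
    # Flood ALE from the worked example, escalated for reputational reasons.
    Risk("Flood at headquarters", ale=3_000, escalated=True),
    # Hypothetical risks and figures, included only to show the ordering.
    Risk("Hypothetical key supplier failure", ale=8_000),
    Risk("Hypothetical power outage", ale=12_000),
]

ranked = prioritise(risks)
for position, risk in enumerate(ranked, start=1):
    note = " (escalated by management)" if risk.escalated else ""
    print(f"{position}. {risk.name}: £{risk.ale:,.0f}{note}")
```

A simple flag is a blunt instrument. In practice the team may prefer to reorder the list by hand and record the reasoning next to each change.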
Continuity Planning
During the continuity planning phase, the BCP team develops the plans to mitigate each risk identified in the BIA phase.
However, not all risks are equal, so the BCP team first needs to define how it will decide which risks will be addressed and which risks will be allowed to remain. This could be as straightforward as defining a cut-off for the Annualised Loss Expectancy: ‘we won’t address any risk with an ALE of less than £10,000’.
Once the list of risks has been pared down, the work can begin to mitigate the various risks and protect the assets of the business.
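A cut-off like this amounts to a simple filter over the list of risks. The sketch below assumes the risks are held as a mapping of name to ALE. The function name and the two entries marked hypothetical are invented for illustration, while the £10,000 threshold and the flood ALE come from the examples in this article.

```python
ALE_CUTOFF = 10_000  # pounds; risks with an ALE below this are accepted


def split_by_cutoff(risk_ales: dict, cutoff: float) -> tuple:
    """Split risks into (to_address, accepted) using the ALE cut-off."""
    to_address = {name: ale for name, ale in risk_ales.items() if ale >= cutoff}
    accepted = {name: ale for name, ale in risk_ales.items() if ale < cutoff}
    return to_address, accepted


risk_ales = {
    "Flood at headquarters": 3_000,          # from the worked example
    "Hypothetical power outage": 12_000,     # hypothetical figure
    "Hypothetical data centre fire": 10_000, # hypothetical, exactly on the cut-off
}

to_address, accepted = split_by_cutoff(risk_ales, ALE_CUTOFF)
print("Address:", sorted(to_address))
print("Accept:", sorted(accepted))
```

A risk that falls below the cut-off on ALE alone may still be kept in scope for qualitative reasons, so the output is best treated as a first pass for the team to review.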
When devising the risk mitigations, don’t forget that the most valuable asset of the business is the people who work in and around it. Save the people first, then save the servers.
The BCP should address the following areas:
- Keeping People safe
- Protecting buildings and facilities
- Hardening Infrastructure against failure
- Using alternatives (such as back-up sites) where this is simpler or cheaper
Approval and Implementation
The final phase brings the Business Continuity Plan to life, first by documenting it and then by getting endorsement from senior management to ensure it happens and everyone buys in to the process.
Documenting the Plan
The Business Continuity Plan needs to be documented and stored in a way that can be found and used when all other IT systems and communications links are not functional – as may happen in a disaster. This almost always means putting it on paper – in multiple locations.
Training
Education is vital for the success of a BCP. During the chaos of an emergency, everyone needs to know where to easily find the instructions that they need to follow.
Testing
The plans and procedures developed in the BCP need to work, and you can only be sure of that if they are tested. For some procedures it may be possible to build the BCP activities into normal operating procedures so that they are used in daily operations. For example, processing may be routinely switched to the backup data centre every month as part of the routine patching procedures. Thus, if an emergency failover is ever needed, all the staff are well versed in the procedures that need to be followed.
Tweaking
The Business Continuity Plan is a living document (or series of documents) and needs individuals to take personal responsibility for keeping their sections up to date and correct. If the BCP gets stale and fails to reflect changes in the organisation, it may fail to provide the expected mitigations in the event of a disaster, and even a minor risk may cause much greater damage than expected.