Don’t have a negative attitude: not everything is bad in continuity management
Let’s admit it: whenever we talk about continuity management, we think about bad things such as disasters, serious interruptions in service, and widespread outages that cause major downtime. If you are one of those people who think like this, let me offer you a very different (and more positive) view of continuity management...
Of course, one very important part of IT service continuity management (ITSCM) is drawing up continuity plans, contingency plans, recovery plans, or whatever you prefer to call them. The DRPs (which is what I call them) are one of the most eye-catching deliverables of the process, but they are also its gloomiest part, because they are the negative part: what to do when things have gone wrong.
The problem is that ITSCM does not only handle what to do when things have gone wrong. It also has a proactive and preventive part (largely forgotten), which analyzes the risks and threats to which our services are exposed and adopts measures (supported by other processes) so that the much-feared downtime does not occur. These preventive tasks, which help improve the availability of the services, are also within the remit of the continuity management process.
Before diving into the preparation of recovery plans, we should conduct a detailed business impact analysis (BIA) of a service outage and, based on this assessment, determine whether it is necessary to implement preventive risk management, which will take care of:
- Determining which risks could affect each service
- Determining, for each risk, the real impact on each service (not all risks will have an equal impact on all services)
- Discovering which threats could cause the risk in question to occur
- For each threat, analyzing the real probability that the threat occurs (and that the risk that would have an impact on the service becomes a reality)
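To make these four steps a little more tangible, here is a minimal sketch of how such an analysis could be recorded and ranked. Everything in it is an assumption made for illustration, not part of any ITSCM standard or tool: the names `Threat`, `Risk`, `likelihood`, `exposure`, and `rank_risks`, the 1 to 5 impact scale, the sample figures, the idea that threats are independent of each other, and the choice to score each risk as impact multiplied by likelihood.

```python
from dataclasses import dataclass, field


@dataclass
class Threat:
    """Something that could cause a risk to occur (hypothetical model)."""
    name: str
    probability: float  # assumed yearly probability, between 0.0 and 1.0


@dataclass
class Risk:
    """A risk, its impact on each service, and the threats that could trigger it."""
    name: str
    impact_by_service: dict  # service name -> impact on an assumed 1-5 scale
    threats: list = field(default_factory=list)


def likelihood(risk):
    """Probability that at least one threat occurs, assuming independent threats."""
    p_none = 1.0
    for threat in risk.threats:
        p_none *= 1.0 - threat.probability
    return 1.0 - p_none


def exposure(risk, service):
    """Impact on the service multiplied by the likelihood of the risk."""
    return risk.impact_by_service.get(service, 0) * likelihood(risk)


def rank_risks(risks, service):
    """Risks that affect the service, highest exposure first."""
    scored = [(risk.name, exposure(risk, service)) for risk in risks]
    affecting = [item for item in scored if item[1] > 0]
    return sorted(affecting, key=lambda item: item[1], reverse=True)


# Invented sample data: not all risks have an equal impact on all services.
RISKS = [
    Risk(
        name="Data center power loss",
        impact_by_service={"email": 5, "payroll": 3},
        threats=[Threat("Storm", 0.10), Threat("UPS failure", 0.05)],
    ),
    Risk(
        name="Database corruption",
        impact_by_service={"payroll": 5},
        threats=[Threat("Faulty patch", 0.20)],
    ),
]

if __name__ == "__main__":
    for service in ("email", "payroll"):
        print(service)
        for name, score in rank_risks(RISKS, service):
            print(f"  {name}: {score:.3f}")
```

With a ranking like this in hand, the risks at the top of the list for each service are the natural candidates for preventive measures, long before anyone needs to open a recovery plan.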
All this prevention is also within the remit of service continuity management, which deals with the “before” scenario (so that the service does not crash) in order to avoid reaching the “after” scenario (when the service has already crashed).
In addition, all these preventive activities are closely related to other processes, since if we consider the matter carefully...
- ...we are performing tasks for continual service improvement (CSI)
- ...we are detecting hidden difficulties in the infrastructure (Proactive Problem Management? Detection of Known Errors?)
- ...we are adopting mitigating actions to reduce the impact of risks and/or the probability of threats (Change Management?)
An oft-used comparison is that continuity management is like an insurance policy that we take out just in case things go wrong or we get sick. I think this simile is misguided, or at least incomplete. Continuity management is really like health insurance that helps us prevent sickness through regular preventive checkups and that, if we get sick in the end anyway, helps us get better. We should not think only of this second part of the insurance policy, since the first part is really much more important.
I hope you enjoy it.
Jandro Castro