Keeping Your Airline IT Resilient

A number of major carriers have suffered high-impact IT events in the past several months. Estimates of losses in these cases have exceeded £100m. This is on top of (no doubt significant) remedial costs, reductions in share price and reputational damage.

Such high-impact events are, in theory, unlikely to occur: each is the result of a series of unlikely events which, taken together, have a catastrophic impact. Unfortunately for corporates, the probability of a high-impact IT event is increasing. This is partly due to the increasingly interconnected and complex nature of IT infrastructures, but also due to heightened cybersecurity risks. Failures tend not to be localised to a particular geography or business but have global reach.

We advise airlines to consider and revisit their current business continuity and disaster recovery (BCDR) arrangements. In our experience, the reality of BCDR arrangements often falls below the stated requirements or capabilities of such solutions, whether provided by third-party IT providers or in-house.

Even if a BCDR arrangement is expressed as “hot” or “active/active” (which implies efficient and rapid fail-over in the event of a disaster), these arrangements are frequently implemented on a narrowly defined basis. For example, while secondary IT infrastructure might be available and functioning in the event of a disaster, the airline’s complex business applications may not function in practice on this secondary infrastructure.

Why is this? The investment required to establish a true, close to fail-safe BCDR arrangement is high in terms of effort and cost, and frequently requires the cooperation of application teams. Quite simply, some organisations take a chance that such an event will not occur, a risk perhaps not accurately understood by anyone other than those individuals intimately familiar with the airline’s BCDR arrangements.

A detailed review of BCDR arrangements would, amongst other things, address the following questions:

  1. Beyond confirming that BCDR infrastructure is available and operational, can testing determine whether the airline’s business-critical applications will operate to acceptable service levels on the secondary infrastructure?
  2. Does the BCDR solution allow the airline’s applications to interface with each other and, critically, interface with off-host systems such as those provided by alliance or code share partners and key third parties such as logistics providers?
  3. Are recovery time objectives and recovery point objectives sufficiently defined and “fit for purpose”? In particular, does the BCDR solution allow cutover with minimal impact on data currency and accuracy? If the BCDR solution does not result in access to up-to-date data (data synchronicity) which, for example, matches passengers to planes and baggage, then the operation of the applications may be largely irrelevant or significantly impaired.
  4. Does the airline, as part of its BCDR testing, regularly seek to cut over to its BCDR systems and operate the business or parts of the business from them? If not, why not? Does the IT/CIO team have faith that the BCDR arrangements can deliver when required?
  5. Does the airline sufficiently enforce its contracts with third-party suppliers, to ensure that BCDR obligations are being implemented in practice, with an attendant transfer of appropriate risk, or at least an understanding of risk transfer and residual risk?
  6. Finally, does the airline have in place a major incident team, an “A-Team” from across the business and key external providers, that can be mobilised at short notice to support the event? These individuals could make a critical difference to external suppliers’ posture and approach to resolving an issue, encouraging a “fix first” culture and avoiding the finger-pointing politics that are often associated with service failures.

An honest assessment of the above issues will determine the robustness of an airline’s BCDR arrangements; a proactive approach, harmonised across the business, and with the support of relevant third-party suppliers, is key.

If deficiencies are identified, then there is little doubt that investment will be required. Even if an airline believes it has outsourced this risk to a third-party hosting supplier, if the customer signed-off solution (as is typically the case) does not deliver a true business-enabling cut-over, then the airline can expect to have to spend to upgrade.

However, investment in, and regular testing of, appropriate BCDR solutions is critical in mitigating potentially catastrophic events.
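
For IT teams who want to make the recovery objectives in question 3 concrete, the following is a minimal sketch of how the timings from a cut-over test might be compared against a recovery time objective (RTO) and a recovery point objective (RPO). Everything in it is an assumption made for illustration: the class and function names are hypothetical, and the objectives and timestamps are invented values rather than recommended targets or real test data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # longest tolerable time to restore service (assumed target)
    rpo: timedelta  # longest tolerable window of lost data (assumed target)


@dataclass
class FailoverTestResult:
    outage_start: datetime      # when the primary site was taken offline
    service_restored: datetime  # when applications were usable on the secondary site
    last_replicated: datetime   # newest record present on the secondary site


def evaluate(test: FailoverTestResult, objectives: RecoveryObjectives) -> dict:
    """Compare measured recovery time and data-loss window with the objectives."""
    recovery_time = test.service_restored - test.outage_start
    # Data written after the last replication and before the outage is lost.
    data_loss_window = max(test.outage_start - test.last_replicated, timedelta(0))
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# Illustrative figures only: a one-hour RTO and a five-minute RPO.
objectives = RecoveryObjectives(rto=timedelta(hours=1), rpo=timedelta(minutes=5))
result = evaluate(
    FailoverTestResult(
        outage_start=datetime(2024, 1, 1, 10, 0),
        service_restored=datetime(2024, 1, 1, 10, 45),
        last_replicated=datetime(2024, 1, 1, 9, 48),
    ),
    objectives,
)
print(result["rto_met"], result["rpo_met"])  # prints "True False"
```

In this invented example, service is restored within the hour, but the secondary site is twelve minutes behind the primary, so the recovered applications would be running on stale data. That is the scenario question 3 warns about: the infrastructure comes back, but the data matching passengers to planes and baggage is out of date.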