Incident White Paper

When Transformation Becomes the Incident

The TSB 2018 Migration Failure and What It Reveals About Governance

November 12, 2024 — Frank Kahle

Executive Summary

In April 2018, TSB Bank initiated what should have been a routine technology migration: moving customer accounts from Lloyds Banking Group's systems to Sabadell's Proteo4UK platform. Instead, it became one of the UK's most costly banking failures. Within hours of go-live on April 20th, 1.9 million customers lost access to their accounts. Some discovered money had disappeared. Others could see strangers' account details. The incident persisted for weeks. The CEO resigned. Regulators imposed a £48.65 million fine. Fraud spiked 70 times above normal. And the total cost exceeded £330 million. Most critical: no external attacker caused this. TSB did it to itself—through governance failures, timeline pressure, and the illusion that internal IT programmes are inherently safe.

What Failed

TSB's migration from Lloyds Banking Group systems to Proteo4UK was not a failure of technology alone. It was a failure of judgment, governance, and risk management at every level.

The platform was not ready. According to the independent review conducted by law firm Slaughter and May, the Proteo4UK platform was simply not capable of handling TSB's full customer base at the moment of launch. Testing had identified over 2,000 known defects. The board had been told of around 800 of them.

The supplier was not ready. Sabis, the company responsible for operating Proteo4UK, is part of TSB's parent company Sabadell. Because they were internal, TSB applied a lower standard of scrutiny than it would have applied to an independent third party. The Slaughter and May review concluded plainly: "Sabis was not ready to operate the Proteo4UK." Staff were undertrained. Processes were incomplete. The architecture had never been tested at scale.

Testing was insufficient. The new data centre was never adequately tested before go-live. Critical infrastructure components were not validated. What testing did occur revealed systemic problems that should have triggered delays, but instead the timeline held.

The "big bang" approach created concentration of risk. Rather than a phased migration of customer segments, TSB moved all 5.2 million customers in a single operation. One failure point. No rollback contingency. One catastrophic fall.

The Decision Timeline: How Pressure Overrode Warnings

Early 2018: Warnings Escalate

Technical teams, operations staff, and third-party advisors raise concerns about readiness. Testing uncovers thousands of defects. The CIO and operational teams recommend delay.

Board Pressure: April 20th Must Hold

The board, focused on strategic timelines and milestone delivery, decides the migration date will not slip. The decision prioritises schedule over readiness. Warnings are noted but overridden.

April 20, 2018: Go-Live

The migration begins. Within hours, systems begin to fail. Customers report account lockouts. Some customer records are corrupted. A small subset of users can see other customers' full account details.

April 20–24: The Crisis Deepens

1.9 million customers cannot access online banking, mobile banking, or telephone banking. Branch networks are overwhelmed. Customers miss bill payments. Standing orders fail. Credit scores are damaged. Disputes break out in customer service queues. IBM is brought in to help stabilise the platform.

April–July 2018: Intermittent Service, Rising Fraud

Service gradually returns, but instability persists. Fraud attacks surge to 70 times normal levels as fraudsters exploit the chaos and the compromised security posture. 2,200 customers experience fraudulent account access attempts; 1,300 suffer financial loss.

September 2018: CEO Resignation

CEO Paul Pester, who led the board's decision to maintain the April 20th date, resigns. The reputational damage and customer attrition are severe.

April 2019 (1 Year Later): 225,492 Complaints Filed

By the one-year anniversary, approximately 4.3% of TSB's customer base had lodged formal complaints. The bank had by then paid out substantial customer compensation, part of a total cost that ultimately exceeded £330 million.

December 2022: Regulatory Fine

The FCA and PRA announce combined fines of £48.65 million. TSB qualifies for a 30% discount for cooperation; without it, the fine would have been £69.5 million.

When the Customer Becomes the Victim

The human impact of TSB's migration failure is often sanitised in post-mortems. The numbers are clean: 1.9 million locked out. 225,492 complaints. But numbers erase the lived experience.

People woke up on April 21st unable to check their balance or pay bills. Small businesses could not process payroll. Single parents could not access their benefit payments. Someone's mortgage payment bounced. Someone else's child benefit failed to clear. Customers who had banked with TSB for decades, who had transferred their salary, their rent, their entire financial life to TSB's platform, suddenly had no visibility into their own money.

Then came the security nightmare. Some TSB customers discovered they could see other people's accounts—full balance, transaction history, personal details. Others received notifications of login attempts from locations they had never visited. The compromised data centre, the overwhelmed security infrastructure, and the desperate scramble to restore service had destroyed the basic promise of banking: privacy and control over your own money.

The fraud surge. As TSB's systems struggled and customers were locked out of their accounts, fraudsters moved in. At their peak, fraudulent access attempts reached 70 times the normal level. These were not sophisticated attacks; this was opportunistic abuse of chaos. Criminals knew that TSB's security posture was compromised, that customer data had been exposed, and that the bank was too overwhelmed to respond quickly. Over two weeks, 2,200 customers experienced fraudulent attempts on their accounts. 1,300 of those customers suffered actual financial loss.

These were not wealthy customers covered by insurance. These were ordinary people discovering that their bank's failure had exposed them to theft, and that the bank's incompetence had made them targets.

What the Incident Exposed

Governance failures at the board level

The board received warnings from operations, from the CIO, from third-party advisors. The Slaughter and May review found that the board was explicitly told about 800 known defects but not about the full 2,000. Whether by design or neglect, the board's risk view was incomplete. When recommendations to delay the migration reached the board, the board chose speed over readiness. That is a governance choice. That choice has a cost.

Underestimation of outsourcing risk

Sabadell owns TSB, and Sabadell also owns Sabis, which operates Proteo4UK. This nested ownership created a false sense of control. TSB did not apply the same rigorous vendor assessment to Sabis that it would have applied to an independent supplier. The result was that TSB discovered, only after go-live, that Sabis lacked the capacity, training, and processes to operate the platform. The lesson is that common ownership can create risk blindness. Just because a supplier is part of your parent company does not mean it is ready to do the work.

Concentration of risk through "big bang" migration

There were alternatives. TSB could have migrated customers in waves—10% first, then 25%, then 50%, then 100%. Each wave would have validated the platform and the operations capability. A failure at 10% is a contained incident. A failure at 100% is a catastrophe. TSB chose the big bang. The board knew the risks. The choice was made anyway.
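To make the contrast concrete, the sketch below expresses a phased migration as an explicit gate in code. It is an illustration only: the wave sizes, check names, and functions are assumptions for this example, not anything TSB built. The point is that "fail small" becomes an enforceable control when each wave must pass validation before the next, larger wave is allowed to begin.

```python
# A minimal sketch of a wave-gated migration: each wave moves only a slice of
# the customer base and must pass its validation checks before the next wave
# starts. Names, wave sizes, and checks are hypothetical.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Wave:
    name: str
    fraction: float                      # share of customers in this wave
    validate: Callable[[], bool]         # reconciliation, defect, and ops checks

def run_migration(waves: List[Wave], migrate: Callable[[float], None]) -> bool:
    migrated = 0.0
    for wave in waves:
        migrate(wave.fraction)           # move only this slice of customers
        migrated += wave.fraction
        if not wave.validate():
            # A failure here is a contained incident, not a catastrophe.
            print(f"Wave '{wave.name}' failed validation at {migrated:.0%} migrated; halting.")
            return False
    print(f"All waves passed; {migrated:.0%} of customers migrated.")
    return True

# Example: cumulative waves of 10%, 25%, 50%, 100% with placeholder checks.
waves = [
    Wave("pilot", 0.10, lambda: True),
    Wave("early adopters", 0.15, lambda: True),
    Wave("half the base", 0.25, lambda: True),
    Wave("remainder", 0.50, lambda: True),
]
run_migration(waves, migrate=lambda fraction: None)
```

The design choice that matters is the early return: a failed wave halts the programme automatically, rather than leaving the decision to a board under schedule pressure.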

The fragility of critical financial infrastructure

When TSB's systems failed, there was no graceful fallback. Customers could not be switched back to the old platform. Data integrity had been compromised. Branch staff lacked tools to help customers manually. The entire financial institution—its ability to do its core job—collapsed because one technology programme failed. There was no operational resilience. There was no contingency.

Security failures during crisis

In the chaos of the outage, basic security controls were bypassed or broke. Customers saw other customers' accounts. The data centre, under stress, became porous. This is not a failing of the new platform alone; it is a failing of how TSB managed the crisis itself. Security was subordinated to service restoration. The result was that customers became victims not only of an outage but of exposure and fraud.

The Resilience Lens

Most organisations rehearse crisis scenarios as if they are initiated by external threats: cyberattacks, natural disasters, geopolitical shocks. The implicit assumption is that internal operations—especially IT migrations, transformations, and infrastructure upgrades—are inherently stable and controllable.

TSB is a vivid counter-argument. No attacker infiltrated TSB's systems. No natural disaster struck. No geopolitical event disrupted supply chains. TSB did this to itself. And the cost—£330 million, 80,000 customers lost, a CEO forced to resign, a regulatory fine in the tens of millions, reputational damage that persisted for years—demonstrates that the greatest threat to some organisations is not external. It is internal. It is the transformation programme you approved last quarter.

Resilience is not just about defending against attackers. It is about building redundancy into programmes that are inherently risky. It is about giving yourself the option to pause, to roll back, to fail small instead of failing big. It is about creating governance that is not captured by schedule pressure. It is about asking: what is the actual failure mode here, and have we eliminated it? Not managed it. Eliminated it.

TSB had none of that. The governance was overridden by timeline. The redundancy was eliminated by the big bang approach. The option to fail small was gone. What remained was a single point of failure and a board that chose to activate it.

What Boards Should Be Asking Now

What is the readiness standard for our critical IT migrations?

Does your board know what "ready" actually means? Is it a go/no-go checklist signed off by operations, or is it a set of quantifiable criteria that must be met? TSB's board received partial information (800 known defects, not 2,000). How do you ensure the board's risk picture is complete?
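One way to answer that question is to define "ready" as quantifiable criteria computed from the complete evidence base, so the board's picture cannot quietly shrink from 2,000 defects to 800. The sketch below is illustrative only; the thresholds and field names are assumptions, not TSB's or any regulator's criteria.

```python
# Illustrative go/no-go gate: every criterion must hold, and the decision is
# derived from the full defect and test record rather than a filtered summary.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ReadinessEvidence:
    severity1_defects_open: int
    severity2_defects_open: int
    total_known_defects: int          # the board sees this figure, not a subset
    load_tested_vs_peak: float        # 1.2 means tested at 120% of expected peak
    rollback_rehearsed: bool
    supplier_ops_signoff: bool

def go_no_go(e: ReadinessEvidence) -> Tuple[bool, List[str]]:
    failures = []
    if e.severity1_defects_open > 0:
        failures.append("severity-1 defects still open")
    if e.severity2_defects_open > 20:              # illustrative threshold
        failures.append("severity-2 defect count above agreed threshold")
    if e.load_tested_vs_peak < 1.2:                # illustrative threshold
        failures.append("platform not load-tested beyond expected peak")
    if not e.rollback_rehearsed:
        failures.append("rollback has never been exercised")
    if not e.supplier_ops_signoff:
        failures.append("operating supplier has not signed off readiness")
    return len(failures) == 0, failures

ready, reasons = go_no_go(ReadinessEvidence(0, 35, 2000, 1.0, False, False))
print(ready, reasons)   # False, plus the reasons the board needs to see
```

The specific thresholds matter less than the fact that they are agreed before schedule pressure builds, and that the same complete evidence reaches the board.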

Who has the authority to say no to a migration date?

If your CIO, your COO, and your third-party advisors all recommend delay, can the board override them? TSB's board did. What would it take in your organisation to force a date slip? How senior does the dissent need to be?

Is our critical infrastructure dependent on a supplier we have not recently assessed?

If that supplier is part of your parent company or is a long-standing partner, the assessment gap is likely to be even larger. TSB never subjected Sabis to the scrutiny it would have applied to an external supplier. For your own critical suppliers, when was the last proper capability review?

What is our rollback plan, and has it been tested?

TSB could not roll back. The old platform was decommissioned. The customer data had been migrated and corrupted in the new system. There was no going back. For your next major IT migration, what is the actual rollback procedure? How long would it take? Has it been exercised?
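As a sketch of what "tested" could mean in practice, a rollback plan can be recorded as concrete steps with owners and rehearsed timings, and treated as non-existent until every step has actually been exercised within the agreed window. The steps, owners, and timings below are hypothetical.

```python
# Hypothetical rollback runbook: a plan counts as executable only if every step
# has an owner and a rehearsed timing, and the total fits the agreed window.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RollbackStep:
    action: str
    owner: str
    rehearsed_minutes: Optional[int] = None   # None means never exercised

def rollback_is_executable(steps: List[RollbackStep], window_hours: float) -> bool:
    if any(s.rehearsed_minutes is None for s in steps):
        return False                           # an untested plan is not a plan
    return sum(s.rehearsed_minutes for s in steps) <= window_hours * 60

runbook = [
    RollbackStep("Freeze writes on the new platform", "Migration lead", 15),
    RollbackStep("Re-point customer channels to the legacy platform", "Network ops", 45),
    RollbackStep("Reconcile transactions captured since cutover", "Data team", None),
]
print(rollback_is_executable(runbook, window_hours=4))   # False: one step never rehearsed
```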

How would we operate if our primary IT systems failed?

This is the question TSB could not answer. When the migration platform failed, the bank had no contingency. Branch staff had no manual processes. Customers had no alternative channel. Operational resilience means building the capability to serve customers even if your primary systems are down. How would you do that?

Conclusion

The TSB 2018 migration failure happened because a board chose to optimise for schedule over readiness. The decision was not made by reckless people. It was made by reasonable people operating within a governance structure that allowed schedule pressure to override technical and operational warnings. That governance structure, the pressure it created, and the choice it enabled—that is what makes this incident worth studying.

For boards and leadership teams with major IT transformations on the roadmap, the question is not whether you can learn from TSB. It is whether you will. A failed migration is no longer a technical problem to be managed by the CIO and the operations team. It is a business crisis that can cost hundreds of millions, trigger regulatory action, force executive resignations, and expose customers to fraud. It is an incident. It belongs in your board-level risk register alongside cyberattacks, supply chain disruption, and geopolitical shocks.

The painful clarity of TSB's failure is that the risk was entirely visible and entirely avoidable. The warnings were given. The defects were known. The readiness was questioned. The choice was made anyway. Do not let that choice be made in your organisation. Build governance that forces the hard trade-offs to the surface. Demand quantifiable readiness criteria. Give your CIO and COO the authority to say no. Create rollback plans that are actually executable. Build operational resilience into the design rather than adding it as an afterthought. And ask yourself: if this transformation failed at go-live, how would we actually operate? If you cannot answer that question clearly, you are not ready to migrate.

Rehearse This Scenario

The TSB incident reveals how transformation programmes can become crises when governance breaks down. CrisisLoop helps boards and leadership teams rehearse these scenarios—not to assign blame, but to build resilience.

From pre-migration readiness assessments to crisis response simulations, our approach ensures that your organisation can identify risks before they become incidents.

Talk to Us About Resilience Rehearsal
