
The recent Amazon Web Services (AWS) outage reminded businesses across industries of an uncomfortable truth: even the largest and most advanced providers aren’t immune to failure.
From data center malfunctions to targeted cyberattacks, outages have become part of the modern operating landscape. The real question is no longer if an incident will happen – it’s how ready your organization is when it does.
Together with Tautvydas Jašinskas, Chief Information Security Officer (CISO) at ConnectPay, let’s look at what often turns a minor outage into a major crisis – and what companies can do to stay resilient.
Business Continuity Plans: written vs. practiced
Many organizations have a Business Continuity Plan (BCP) stored in a shared drive, but few regularly test it.
Why this matters:
- Plans created once and never updated quickly become outdated.
- Staff turnover means that new employees may not be familiar with their role in a crisis.
- Technology and vendor changes often leave recovery steps incomplete.
“The difference between a plan that exists and a plan that works is testing,” explains Jašinskas. “Companies must simulate real-world scenarios – from cloud outages to data corruption – to see how fast they can restore normal operations.”
Running live exercises helps identify weak points, such as unclear responsibilities, missing contacts, or technical dependencies that haven’t been documented and addressed.
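One way to make "how fast can we restore operations" concrete is to record each exercise and compare the measured recovery time with a target agreed in advance. The sketch below is only an illustration: the `DrillResult` structure, the scenarios, the timestamps, and the targets are all hypothetical and are not taken from ConnectPay's practice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    """Outcome of one continuity exercise (hypothetical structure)."""
    scenario: str
    started: datetime      # when the simulated outage began
    restored: datetime     # when normal operations were restored
    target: timedelta      # recovery target agreed with the business (assumed)

    @property
    def recovery_time(self) -> timedelta:
        return self.restored - self.started

    @property
    def met_target(self) -> bool:
        return self.recovery_time <= self.target


# Illustrative values only, not real measurements.
drills = [
    DrillResult("cloud region outage",
                datetime(2025, 1, 15, 9, 0), datetime(2025, 1, 15, 9, 45),
                timedelta(hours=1)),
    DrillResult("data corruption",
                datetime(2025, 1, 15, 13, 0), datetime(2025, 1, 15, 16, 30),
                timedelta(hours=2)),
]

# Scenarios where recovery took longer than the agreed target.
missed = [d.scenario for d in drills if not d.met_target]
for d in drills:
    status = "ok" if d.met_target else "MISSED"
    print(f"{d.scenario}: {d.recovery_time} (target {d.target}) {status}")
```

Tracking these numbers from one exercise to the next shows whether fixes to unclear responsibilities or undocumented dependencies are actually shortening recovery.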
The human factor: training turns policy into action
Technology alone doesn’t guarantee resilience – people do. They’re the strongest link in any organization. During an outage, every minute counts, and confusion spreads quickly if teams haven’t practiced their roles.
Why this remains a challenge:
- IT teams focus heavily on prevention, not recovery drills.
- Decision-making authority during incidents is often unclear.
- Employees assume “someone else” will take action.
“When systems go down, you don’t have time to read instructions,” says Jašinskas. “The instinct to react is built through training – the quiet preparation that turns uncertainty into action.”
Regular tabletop exercises or live simulations help ensure that when an outage occurs, teams respond instinctively rather than scrambling to improvise.
Third-party dependencies: the hidden weak link
Cloud service providers, payment gateways, and data processors form the backbone of digital operations – but they also create shared risk.
Why it’s critical to manage:
- Many outages originate outside an organization’s direct control.
- Vendors may not disclose downtime quickly or transparently.
- Recovery timelines differ, leaving businesses in limbo.
“Almost every company depends on a chain of providers,” Jašinskas points out. “That means your resilience is only as strong as the weakest link in that chain.”
Firms should map their dependencies, assess vendor resilience, and ensure that service-level agreements (SLAs) clearly define expectations for uptime, response times, and communication during incidents. If one provider fails, there must be a tested backup route – whether that means a secondary cloud region, payment processor, or communications channel.
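A dependency map does not need special tooling to be useful. As a rough sketch, the Python below lists a few providers, flags any that lack a backup route or whose backup has never been exercised, and picks out the provider with the lowest committed uptime. The provider names, SLA figures, and field names are hypothetical placeholders, not real vendors or contract terms.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Dependency:
    """One external provider the business relies on (hypothetical fields)."""
    name: str
    sla_uptime_pct: float            # uptime committed in the SLA (assumed figure)
    fallback: Optional[str] = None   # backup route, if one exists
    fallback_tested: bool = False    # has failover been exercised in a drill?


def find_gaps(deps: List[Dependency]) -> List[Tuple[str, str]]:
    """Return (provider, problem) pairs for missing or untested backups."""
    gaps = []
    for d in deps:
        if d.fallback is None:
            gaps.append((d.name, "no backup route"))
        elif not d.fallback_tested:
            gaps.append((d.name, "backup route never tested"))
    return gaps


def weakest_link(deps: List[Dependency]) -> Dependency:
    """Provider with the lowest committed uptime."""
    return min(deps, key=lambda d: d.sla_uptime_pct)


# Illustrative entries only.
deps = [
    Dependency("primary-cloud-region", 99.9,
               fallback="secondary-cloud-region", fallback_tested=True),
    Dependency("payment-processor", 99.5, fallback="backup-processor"),
    Dependency("sms-gateway", 99.0),
]

for name, problem in find_gaps(deps):
    print(f"{name}: {problem}")
print("Lowest SLA commitment:", weakest_link(deps).name)
```

Even a short list like this makes it clear where a single provider failure would leave the business without a rehearsed alternative.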
Testing for reality, not for compliance
Compliance frameworks emphasize operational resilience, but checklists alone don’t prepare firms for real-world disruptions. The goal isn’t just passing an audit – it’s maintaining continuity and business trust when your systems go offline.
“Many organizations approach resilience and business continuity in a bureaucratic way – writing about it more than they actually do it. They create policies, procedures, and documents, but resilience isn’t built on paperwork. It’s built on practice,” Jašinskas notes.
True resilience comes from repetition – from realistic, sometimes uncomfortable testing that exposes weak points and teaches teams how to respond. Ironically, no real incident ever unfolds exactly like the scenario you’ve rehearsed. The real value of testing lies in learning how to solve problems – recognizing patterns, finding causes, and restoring order when plans no longer apply. Companies that regularly test and adjust recover faster, communicate more effectively, and experience less business disruption when an incident strikes.
Building a culture of resilience
Actual readiness extends beyond IT. Finance, operations, customer support, and communications all play a role in minimizing impact.
What defines a resilient organization?
- Prepared teams who know their responsibilities and have trained deputies assigned.
- Tested procedures that reflect the current tech environment.
- Transparent communication with partners and clients during incidents.
- Continuous improvement after every drill or real disruption.
“In practice, resilience has to become a habit,” Jašinskas emphasizes. “If you only think about continuity once a year, you’re already behind.”
Embedding resilience thinking into everyday operations – rather than treating it as a once-a-year compliance task – is what separates proactive organizations from those caught off guard.
As a saying often attributed to the Greek poet Archilochus goes, “We don’t rise to the level of our expectations; we fall to the level of our training.” Outages and cyber incidents aren’t going away, but their impact can be mitigated. Companies that recover fastest have one thing in common: they train, test, and adapt continuously.