The Amazon Web Services (AWS) cloud outage was disastrous.
The December 7th, 2021 outage lasted a whopping 7 hours, bringing applications that run on the AWS platform to a grinding halt.
One would expect that Amazon would implement solid measures to prevent a cloud outage of such magnitude from happening in the future.
But on December 15th, barely a week after the incident, AWS was already dealing with a second outage, this one lasting about forty-five minutes. This time the outage affected the company’s US-West-1 and US-West-2 regions.
On October 4th, 2021, Facebook and its subsidiaries, including WhatsApp and Instagram, disappeared from the internet. It would be more than five hours before users could regain access to the services.
According to Facebook, the outage was primarily due to a misconfiguration of the company’s backbone routers.
Together, the AWS and Facebook outages brought to light the challenges of keeping interlinked systems up and running.
For businesses that depend on cloud services, these outages can only mean one thing: it is crucial to anchor your infrastructure in carrier-neutral colocation data centers. Why? Because that’s the safest way to shield your enterprise data from the effects of a possible outage.
What Caused the Facebook Outage?
Just before 5 pm UTC, users started noticing that they couldn’t access Facebook, Messenger, WhatsApp, and Instagram.
According to a statement issued by Facebook, a configuration change to the backbone routers that coordinate traffic between the company’s data centers triggered a cascading effect, causing the outage.
The incident took down not only Facebook itself but also everything that runs on Facebook’s infrastructure.
More specifically, the outage involved two things: Domain Name System (DNS) and Border Gateway Protocol (BGP).
DNS is the internet’s address book: it translates a website’s name into the IP address where that site can be found. BGP, on the other hand, is the roadmap that determines the most efficient route to that IP address.
Facebook, through a series of updates, told BGP that specific paths to the platform didn’t exist. Essentially, this meant that users trying to reach Facebook couldn’t access the path to reach the platform.
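The division of labor between DNS and BGP can be illustrated with a small toy model. This is only a sketch: the hostname, the IP address (taken from a range reserved for documentation), and the routing table are made up for illustration and do not describe Facebook’s actual network.

```python
import ipaddress

# Toy DNS table: hostname -> IP address (hypothetical values).
DNS_RECORDS = {"example-social.test": "203.0.113.10"}

# Toy routing table: announced prefixes mapped to a next-hop label.
routes = {ipaddress.ip_network("203.0.113.0/24"): "via backbone-1"}


def resolve(name):
    """DNS step: translate a name into an IP address, or None if unknown."""
    return DNS_RECORDS.get(name)


def find_route(ip, table):
    """Routing step: return the next hop of the most specific matching prefix."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


def reach(name, table):
    """Try to reach a site: first look up its address, then look for a path to it."""
    ip = resolve(name)
    if ip is None:
        return "DNS failure: no address for " + name
    hop = find_route(ip, table)
    if hop is None:
        return "unreachable: no route to " + ip
    return "reachable " + hop


print(reach("example-social.test", routes))  # a route exists
routes.clear()                               # simulate the routes being withdrawn
print(reach("example-social.test", routes))  # the address is known, but no path leads to it
```

In this model the address lookup still succeeds after the routes are withdrawn, yet the site cannot be reached because no path to that address is advertised any longer.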
What Caused the AWS Cloud Outage?
According to AWS, the outage was triggered by a problem in the company’s internal network, which hosts crucial services such as application and service monitoring, AWS’s internal DNS, authorization, and parts of Elastic Compute Cloud (EC2).
DNS was a particularly significant contributor, since it is the system that translates human-readable names into numeric internet addresses, in this case IP addresses.
At 7:30 am PST, an automated operation to increase the capacity of AWS’s services hosted in the company’s main network triggered unexpected behavior from multiple clients in the internal network.
A large-scale surge of connection activity ensued, overwhelming the networking devices between the internal network and AWS’s main network. Also, the surge caused communication delays between these networks.
These delays triggered latency and errors for foundation services communicating between the networks, spurring even more failing connection attempts that eventually led to “nonstop congestion and performance problems” on the network devices.
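The feedback loop described above, where failed connections come back as retries and keep the devices saturated, can be illustrated with a toy simulation. This is only a sketch: the traffic numbers, the fixed-capacity device, and both retry policies are assumptions made for illustration, not details from AWS’s report.

```python
from collections import defaultdict


def offered_load(ticks, steady, surge, capacity, delay):
    """Return the number of connection attempts hitting the device on each tick.

    steady:   new attempts arriving every tick
    surge:    extra one-off attempts arriving on tick 0
    capacity: attempts the device can complete per tick
    delay:    function mapping a retry count (1, 2, ...) to ticks to wait
    """
    scheduled = defaultdict(list)  # tick -> retry counts of the attempts due then
    history = []
    for t in range(ticks):
        fresh = steady + (surge if t == 0 else 0)
        attempts = scheduled.pop(t, []) + [0] * fresh  # due retries first, then new traffic
        history.append(len(attempts))
        for retries in attempts[capacity:]:  # everything beyond capacity fails
            scheduled[t + delay(retries + 1)].append(retries + 1)
    return history


# Hypothetical clients that retry on the very next tick.
immediate = offered_load(10, steady=8, surge=50, capacity=10, delay=lambda n: 1)

# Hypothetical clients that wait 2, 4, 8, ... ticks between retries.
backoff = offered_load(10, steady=8, surge=50, capacity=10, delay=lambda n: 2 ** n)

print("immediate retries:", immediate)
print("exponential backoff:", backoff)
```

In this model, immediate retries keep the device above capacity on every tick after a single surge, while spacing retries out lets normal traffic through between waves and sharply reduces the number of failed attempts. Real clients typically also add random jitter so that retries do not arrive in synchronized bursts.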
Because AWS controls 33 percent of global cloud infrastructure, research universities, local school districts, small businesses, and large enterprises were affected by the outage.
Why Did It Take So Long to Fix These Cloud Outages?
Facebook runs its internal systems from the same place, making it hard for employees to diagnose and resolve the problem.
Put differently, Facebook runs EVERYTHING through Facebook, so the usual way the company’s IT and security would fix a problem like this wasn’t working.
Facebook’s team was reportedly unable to access the company’s internal communications tools, including Workplace. What’s more, the team couldn’t get into their offices because the security pass system was caught up in the outage.
Facebook stated that the severity of the outage meant the teams had to bring the systems to full capacity slowly.
For AWS, the loss of connection between the two networks meant that the internal operations team lost visibility into the company’s real-time monitoring service and had to depend on logs of past events to diagnose the problem.
On top of that, AWS’s internal deployment teams were held up as well, slowing recovery. What’s more, there was concern that fixing internal-to-main network communications would interrupt other customer-facing AWS services that weren’t affected.
What Can Such Cloud Outages Mean for Your Business?
Outages like those at Amazon and Facebook can be fatal for small and large businesses alike.
In fact, in all probability, your IT and security teams dread outages of such magnitude. That’s a nightmare they’d rather not experience.
Yet, it is crucial for your team to make sure that your organization’s IT infrastructure is secure at all times, even when you have a solid cybersecurity strategy in place.
And because you want your business to run like a well-oiled machine, a major system failure can quickly overwhelm operations, causing significant losses.
While you can leverage hybrid cloud, multi-cloud and multi-region strategies to tame the impact of a big outage, such measures are hardly sufficient – and that’s where Volico carrier-neutral colocation comes in.
How Can Volico’s Carrier Neutrality Help Your Business Deal with a Cloud Outage?
Volico understands how outages like the ones encountered by AWS and Facebook can damage your business.
We also know that such an outage can strain your relationships with your customers and stress out your IT and security teams.
However, you can protect your business from the adverse effects of an infrastructure outage by leveraging Volico neutral colocation services.
Choosing Volico carrier-neutral colocation services will benefit your business in the following ways:
Enhanced Reliability and Redundancy
A single outage can cost your business hundreds of thousands of dollars. For instance, one hour of downtime on Prime Day in 2018 cost Amazon an estimated $100 million in sales!
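To put figures like these in perspective, the arithmetic can be sketched in a few lines. The function name and the second example are hypothetical, and the calculation assumes revenue is spread evenly over time, which real sales rarely are.

```python
def downtime_cost(revenue_per_hour, minutes_down):
    """Estimate lost sales, assuming revenue is spread evenly across each hour."""
    return revenue_per_hour * minutes_down / 60


# The Prime Day figure above: $100 million for one hour of downtime.
per_minute = downtime_cost(100_000_000, 1)
print(f"Prime Day example: ${per_minute:,.0f} per minute")

# A hypothetical business earning $50,000 per hour, offline for seven hours.
seven_hours = downtime_cost(50_000, 7 * 60)
print(f"Hypothetical business: ${seven_hours:,.0f} for a seven-hour outage")
```

Under these assumptions, the Prime Day figure works out to roughly $1.67 million per minute, and even a business earning $50,000 per hour would lose $350,000 in sales during a seven-hour outage.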
Volico neutral colocation enables you to connect your most essential systems to multiple data centers. This way, you can enjoy redundancy that can’t be achieved in a single-carrier environment.
Volico colocation helps ensure business continuity if one carrier experiences a system failure.
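The value of a second carrier can be illustrated with a rough availability calculation. This is a simplified sketch: the 99% figure is a made-up example rather than a Volico or carrier guarantee, and the formula assumes carriers fail independently of one another, which shared infrastructure can undermine in practice.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(per_carrier, carriers):
    """Chance that at least one of several independent carriers is up."""
    return 1 - (1 - per_carrier) ** carriers


def downtime_hours_per_year(availability):
    """Expected hours per year with no working connection."""
    return (1 - availability) * HOURS_PER_YEAR


for count in (1, 2, 3):
    availability = combined_availability(0.99, count)  # hypothetical 99% per carrier
    hours = downtime_hours_per_year(availability)
    print(f"{count} carrier(s): {availability:.6f} availability, {hours:.3f} hours down per year")
```

Under these assumptions, a single 99% carrier implies about 87.6 hours without connectivity per year, while two such carriers bring the expected figure below one hour.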
Cost Efficiency
The costs of working with a single-carrier data center go beyond the price itself. Partnering with a single carrier exposes your business to unwarranted price hikes and reduced bandwidth.
Plus, relocating to a new data center if the current one can’t handle your needs means digging deeper into your pocket.
Volico Data Centers has a strong incentive to offer competitive pricing and better services by offering multiple connection options. Further, you can experience greater savings by incorporating Volico Smart Hand Services to accommodate your changing business needs.
With our smart hand services, you’ll have round-the-clock access to experienced technicians, so you don’t have to send your IT staff to the data center facility, which means even more savings for your business.
Scalability
Any serious business should consider how its operations are likely to shift and grow in the future, so it is best to choose a solution that grows with your business.
Volico neutral colocation services can scale to meet your specific business needs as they change. In addition, you’ll have all the options offered by various cloud providers and ISPs at your disposal.
We provide tailored features that you can take advantage of based on your business goals, whether you’re talking about physical infrastructure or service.
More Data Protection Options
Working with a data center that offers carrier-neutral services gives you even greater protection against the loss of business-critical data.
Why is it crucial to minimize the probability of data loss?
Well, because the costs of an outage transcend direct expenses. Indirect costs include:
- Loss of future sales.
- Poor relationships with your trading partners.
- The inability to fulfill your obligations.
A reliable carrier-neutral colocation data center like Volico can offer guaranteed uptime to protect your business-critical data while meeting your business requirements.
Volico Services – Protecting your Business Against Cloud Outages
Volico is a carrier-neutral colocation provider, offering your business multiple connectivity options to many carriers.
By leveraging Volico carrier-neutral services, you can experience optimal uptime, redundancy, and cost-efficiency while protecting your business-critical data 24/7.
If you’d like to learn more about our carrier-neutral solutions for your business, we’d be happy to talk to you.