Four Lessons from the AWS Outage Last Week
The Amazon Web Services (AWS) Simple Storage Service (S3) suffered an outage on Tuesday of last week and was down for several hours. According to SimilarTech, S3 provides object storage for around 150,000 websites and other services. For IT professionals, here are four takeaways from this outage.
#1 – It Happens
No infrastructure is immune to outages. No matter how big the provider, outages happen and downtime occurs. Whether you host infrastructure yourself or rely on a third party, outages will happen eventually. Putting your eggs in someone else’s basket does not necessarily buy you any more peace of mind. In this case, S3 was brought down by a simple typo from a single individual. That is all it takes to cause so much disruption. The premiums you pay to be hosted on a massive infrastructure like AWS will never prevent the inevitable failures, no matter how massive the platform becomes.
#2 – The Bigger They Are, the Harder They Fall
When a service is as massive as AWS, problems affect millions of users, including customers trying to do business with companies that use S3. Yes, outages do happen, but do they have to take down so much of the internet with them when they do? As with the DDoS attack I blogged about last fall, companies leave themselves open to these massive outages when they rely heavily on public cloud services. How much more confidence in your business would your customers have if they heard about a massive outage on the news but knew that your systems were unaffected?
#3 – It’s No Use Being an Armchair Quarterback
When an outage occurs with your third-party provider, you call, you monitor, and you wait. You hear about what is happening, and all you can do is shake your fist in the air, knowing that you probably could have done better at either preventing the issue or resolving it more quickly if you were in control. But you aren’t in a position to do anything because you are reliant on the hosting provider. You have no option but to accept the outage and try to make up for the loss to your business. You gave up your ability to fix the problem when you gave that responsibility to someone else.
Just two weeks ago, I blogged about private cloud and why some organizations feel they can’t rely on hosted solutions because of any number of failures they would have no control over. If you need control of your solution to mitigate risk, you can’t also give that control to a third party.
#4 – Have a Plan
Cloud services are a part of IT these days, and most companies are already doing some form of hybrid cloud, with some services hosted locally and some hosted in the cloud. Cloud-based applications like Salesforce, Office 365, and Google Docs have millions of users. It is inevitable that some of your services will be cloud-based, but they don’t all have to be. There are plenty of solutions, like hyperconverged infrastructure, that let you host many services locally with the simplicity of cloud infrastructure. When outages occur at cloud providers, make sure you have sufficient infrastructure in place locally so that you can do more than just be an armchair quarterback.
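To make the idea of a local fallback concrete, here is a minimal sketch in Python. It tries a cloud source first and falls back to a locally hosted copy if the cloud call fails. Everything named here (fetch_with_fallback, cloud_store, local_store, LOCAL_COPY) is a hypothetical stand-in invented for illustration, not a real AWS or vendor API, and a production plan would also need replication, monitoring, and failover testing.

```python
from typing import Callable, Optional, Sequence


def fetch_with_fallback(key: str, sources: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each source in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for source in sources:
        try:
            return source(key)
        except Exception as exc:  # an outage can surface as many error types
            last_error = exc
    # Every source failed: surface the last underlying error to the caller.
    raise RuntimeError(f"all sources failed for {key!r}") from last_error


# Hypothetical stand-in for a cloud object store that is currently down.
def cloud_store(key: str) -> bytes:
    raise ConnectionError("cloud object storage unavailable")


# Hypothetical stand-in for a copy of the same data hosted locally.
LOCAL_COPY = {"index.html": b"<h1>still up</h1>"}


def local_store(key: str) -> bytes:
    return LOCAL_COPY[key]


if __name__ == "__main__":
    # The cloud source fails, so the locally hosted copy is served instead.
    print(fetch_with_fallback("index.html", [cloud_store, local_store]))
```

The order of the list is the plan: put the preferred source first and the one you control last, so an outage upstream degrades the service instead of taking it down.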
Public cloud services may be part of your playbook, but they don’t have to be your endgame. Take control of your data center and have the ability to navigate your business through outages without being at the mercy of third-party providers. Have a plan, have an infrastructure, and be ready for the next time the internet breaks.