AWS #Fail: How Clients Could Have Avoided It

21 Apr

It has not been a good day for anyone running an application on Amazon Web Services' ultra-popular EC2. Several data centers experienced a major outage and have been down for 12 hours; as a result, hundreds of sites, big and small, are down too. Services like Quora, Reddit, Heroku and Foursquare have been out all day, due to the failure of multiple "availability zones" in Amazon's east coast data centers. For a detailed explanation of the failure, check out this post. For a (mostly) comprehensive list of the sites affected by the downtime, click here.

Obviously, users all across the Web are royally ticked off. Businesses and solution providers are having trouble explaining the situation to clients, developers are feeling like they've been caught with their pants down, and the Internet snark machine is having a field day.

One EC2 client said around 2pm CT that "we're told it will be back up in an hour. But they've been saying that for awhile. One way we could have prevented against this is if we had spread out across multiple data centers around the world."

It's important to remember that just because one cloud provider is having problems, it doesn't mean that the cloud as a whole is bad. It just means that businesses, developers and end users, like the EC2 client, need to plan better. That means being able to fail over to a whole different cloud, instead of just to a different cluster of the same cloud.
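For the developers in the audience, here is a rough sketch of what "fail over to a different cloud" can look like at its simplest: keep an ordered list of deployments on different providers and route to the first one that passes a health check. Everything named here is hypothetical and for illustration only (the `Deployment` type, the provider labels, the example URLs and the `is_healthy` callback); a real setup would also need data replication and DNS or load-balancer changes, which this sketch does not cover.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Deployment:
    """One copy of the application running on one provider (hypothetical type)."""
    provider: str
    url: str


def choose_deployment(
    deployments: Sequence[Deployment],
    is_healthy: Callable[[Deployment], bool],
) -> Optional[Deployment]:
    """Return the first deployment that passes the health check, in priority order.

    A health check that raises is treated the same as an unhealthy result,
    so one broken provider cannot stop us from trying the next one.
    """
    for deployment in deployments:
        try:
            if is_healthy(deployment):
                return deployment
        except Exception:
            continue
    return None


if __name__ == "__main__":
    # Assumed example: primary on one cloud, backup on a different one.
    candidates = [
        Deployment("cloud-a", "https://app.cloud-a.example.com"),
        Deployment("cloud-b", "https://app.cloud-b.example.com"),
    ]

    # Stand-in health check that simulates an outage at the primary provider.
    def simulated_check(d: Deployment) -> bool:
        return d.provider != "cloud-a"

    active = choose_deployment(candidates, simulated_check)
    print(active.provider if active else "no healthy deployment")
```

In practice the health check would be a real probe (an HTTP request with a short timeout, for example), and the point of the exercise is that the backup in the list lives on a different provider, not just in a different zone of the same one.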

That's where Standing Cloud comes in. We enable our users to "cloud-hop": if the cloud your site or application is on experiences an outage, you can spin up a backup on a different cloud in the blink of an eye. So instead of flipping out about downtime, you can relax and get back to work.
