Last week and into the weekend, Amazon Web Services suffered an enormous outage that took down major websites and services such as Quora, Reddit, Heroku and Foursquare. Users are still (understandably) angry, both at the failure of Amazon's much-vaunted Availability Zone structure and at the lack of communication from Amazon itself. Even so, a flood of posts and articles has insisted that this debacle is conclusive proof that "the cloud isn't ready for prime time" or that "the cloud will never work." Our CEO Dave Jilk took to Brad Feld's blog to explain why these are the wrong lessons to take from the outage, and what we should be learning instead.
Dave writes: "Those who say [that the cloud is not ready for prime time] simply do not understand what the infrastructure cloud is. At bottom, it is just a way to provision virtual servers in a data center without human involvement. It is not news to anyone who uses them that virtual servers are individually less reliable than physical servers; furthermore, those virtual servers run on physical servers inside a physical data center. All physical data centers have glitches and downtime, and this is not the first time Amazon has had an outage, although it is the most severe.
"What is true is that the infrastructure cloud is not and never will be ready to be used exactly like a traditional physical data center that is under your control. But that is obvious after a moment’s reflection. So when you see someone claiming that the Amazon outage shows that the cloud is not ready, they are just waving an ignorance flag."
Dave goes on to explain that "Amazon is not infallible, and the cloud is not magic," and notes that "cloud portability is one of the things Standing Cloud enables for the applications it manages. If you build/deploy/manage an application using our system, it will be able to run on many different cloud providers, and you can move it easily and quickly.
"We built this capability, though, because we believed that it was important for risk mitigation. As I have already pointed out, no data center is infallible and outages are inevitable. Further, it is not enough to have access to multiple data centers – the Amazon outage, though focused on one data center, created cascading effects (due to volume) in its other data centers. This, too, was predictable."