What does the Amazon EC2 downtime mean?

By Mark Skilton, Capgemini

The recent announcement of the Amazon EC2 outage in April this year triggers some thoughts about this very high-profile topic in Cloud Computing. How secure and available is your data in the Cloud?

While the outage had more to do with the availability of data and services that your Cloud provider commits to in its service level agreement (SLA), the recent and potentially more concerning theft of e-mail data from Epsilon, and the Sony data theft that is breaking news as I write this, further highlight this big topic in Cloud Computing.

My initial reaction on hearing about the outage was that it was caused by over-allocation under high demand in the US East region, which led to a cascading system failure. I subsequently read that Amazon said it was a network glitch, which triggered storage backups to be created automatically in greater numbers than needed, consuming the available Elastic Block Store (EBS) capacity. This in turn, I theorize, created the supply unavailability problem.

From a business perspective, this puts the focus on the issues of relying on a single primary Cloud provider. The affected businesses, such as Quora.com and foursquare.com, “live in the Cloud,” yet backup and secondary Cloud support are clearly important to them. Some of these are economic decisions: trade-offs between the cost of business continuity and the cost of lost business. The outage highlights the vulnerability of these enterprises, even though a highly successful organization like Amazon makes this a rare event. Consumers of Cloud services need to consider mitigating actions such as disruption insurance and secondary backups, and they need to consider how far SLAs can be assured, which is largely out of the hands of small and medium-sized business (SMB) users. One result of outages at Cloud providers has been the emergence of a new market called “Cloud Backup,” which is starting to gain favor with customers and providers as an added level of protection through service failover.
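The trade-off between loss of business and business continuity can be made concrete with some simple arithmetic. The sketch below is illustrative only: it assumes that the primary and secondary providers fail independently and that failover is instantaneous, neither of which is guaranteed in practice, and every function name and figure in it is hypothetical rather than taken from Amazon or any real SLA.

```python
# Illustrative sketch only: all names and figures are hypothetical.
HOURS_PER_YEAR = 24 * 365


def expected_downtime_hours(availability):
    """Hours per year a service is expected to be unavailable."""
    return (1.0 - availability) * HOURS_PER_YEAR


def combined_availability(primary, secondary):
    """Availability when either of two providers can serve traffic.

    Assumes the two providers fail independently and that failover
    is instantaneous, which is an optimistic simplification.
    """
    return 1.0 - (1.0 - primary) * (1.0 - secondary)


def annual_outage_cost(availability, revenue_per_hour):
    """Expected revenue lost per year to downtime."""
    return expected_downtime_hours(availability) * revenue_per_hour


def secondary_is_worthwhile(primary, secondary, revenue_per_hour,
                            secondary_annual_cost):
    """True if the expected savings exceed the cost of the secondary."""
    cost_alone = annual_outage_cost(primary, revenue_per_hour)
    cost_with_backup = annual_outage_cost(
        combined_availability(primary, secondary), revenue_per_hour)
    return (cost_alone - cost_with_backup) > secondary_annual_cost
```

For example, at a hypothetical 99.9% availability a service expects roughly 8.76 hours of downtime a year. Whether a secondary provider pays for itself then depends on how much revenue each of those hours represents compared with what the backup costs to run.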

While these are concerning issues, I believe most outage issues can be addressed by exercising due diligence in the procurement and use of any service that involves a third party. I’ve expanded the definition of due diligence in Cloud Computing to include at least six key processes that any prospective Cloud buyer should be aware of and plan contingencies for, as you would with any purchase of a business-critical service:

  • Security management
  • Compliance management
  • Service management (ITSM and license controls)
  • Performance management
  • Account management
  • Ecosystem standards management

I don’t think publishing a bill of rights for consumers is enough to insure against failure. One thing that Cloud Computing design has taught me is that part of the architectural shift brought about by Cloud is the emergence of automation as an implicit part of the operating model design to enable elasticity. This automation may have been a factor, ironically, in the Amazon situation, but overall the benefits of Cloud far outweigh the downsides, which can be re-engineered and resolved.

A useful guide to addressing some of the business impact can be found in a new book from The Open Group, Cloud Computing for Business, which we plan to publish this quarter. The book addresses many of these challenges in understanding and driving the value of Cloud Computing in the language of business, and it includes chapters on the business use of Cloud and on risk management in the Cloud. Check The Open Group website for more information on The Open Group Cloud Computing Work Group and the Cloud publications in the bookstore at http://www.opengroup.org.

Cloud Computing is a key topic of discussion at The Open Group Conference, London, May 9-13, which is currently underway.

Mark Skilton, Director, Capgemini, is the Co-Chair of The Open Group Cloud Computing Work Group. He has been involved in advising clients and developing strategic portfolio services in Cloud Computing and business transformation. His recent contributions include Return on Investment models for Cloud Computing, which were widely syndicated, achieved 50,000 hits on CIO.com and appeared in the British Computer Society 2010 Annual Review. His current activities include the development of new Cloud Computing Model standards and best practices on the impact of Cloud Computing on Outsourcing and Off-shoring models. He also contributed to the second edition of the Handbook of Global Outsourcing and Off-shoring, published through his involvement with the Warwick Business School UK Specialist Masters Degree Program in Information Systems Management.

One comment

  1. Cloud computing enables users to store files and software remotely, rather than on a hard drive or server at their office. The fact is many people may already be using cloud computing without realizing it, whether through work or personal use.
