
February 15th, 2008

What happens when the cloud doesn't work?

Posted by Larry Dignan @ 7:10 am

Categories: Amazon, Cloud computing, General, Software Infrastructure, Web Technology

Tags: Amazon.com Inc., Manufacturing, Backups, Data Centers, Service Level Management, Open Source, Storage, Hardware, Data Management, It Operations

Update below: Cloud services sound great. A company can host its infrastructure with a large player like Amazon or Google, spend little and grow the business. Data center investment? Why would you do something like that?

Those theories are being tested today as Michael Krigsman is on the case of a major Amazon Web Services outage. Amazon recently introduced an SLA promising 99 percent uptime, so any financial hit will be determined later. For now, customers are getting a lesson in backup options (Techmeme discussion).
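To put an uptime percentage in perspective, here is a minimal sketch of the arithmetic. It assumes a 365-day year and treats the SLA figure as a simple fraction of total hours; the function name is illustrative and not part of any Amazon tooling.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime an uptime percentage permits over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


# Compare a few common uptime targets over one year.
for pct in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% uptime allows about {hours:.1f} hours of downtime per year")
```

At 99 percent, the budget works out to roughly 87.6 hours a year, or more than three and a half days.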

As Michael notes, this outage could have big implications since Amazon is increasingly hosting enterprise-class software such as Red Hat Enterprise Linux. Amazon is even rumored to be in the sweepstakes to host SAP’s BusinessByDesign.

For now your best bet is to monitor Amazon’s message board to see what happens when the cloud goes awry. A few choice excerpts:

  • Hi, what is the deadline to fix this inssue, because i have many clients using the S3 service.
  • And this is why you have to setup a fail-safe. My new sites hosts over 25,000 images on Amazon and I wake up to notice major issues this morning. I switched over to using my local server and everything is back up…I really need to set something up so it does this automatically. The s3 service is great but this just proves you can’t rely on it, this is a major issue especially since it’s been down for so long. Way to go Amazon.
  • This is really a severe blow to confidence in trusting AWS services.
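The automatic switch to a local server that the second customer wishes for could look something like the sketch below. It is only an illustration under stated assumptions: `fetch_from_cloud` and `fetch_from_local` are hypothetical stand-ins (the first simulates an outage), not real S3 client calls.

```python
from typing import Callable, Sequence


def fetch_with_fallback(key: str,
                        sources: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each source in order and return the first successful result."""
    errors = []
    for source in sources:
        try:
            return source(key)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(sources)} sources failed for {key!r}: {errors}")


# Hypothetical stand-in for a cloud storage request during an outage.
def fetch_from_cloud(key: str) -> bytes:
    raise ConnectionError("simulated cloud outage")


# Hypothetical stand-in for a copy kept on a local server.
def fetch_from_local(key: str) -> bytes:
    return b"local copy of " + key.encode()


image = fetch_with_fallback("logo.png", [fetch_from_cloud, fetch_from_local])
print(image)
```

The ordering of `sources` is the whole policy here: the cloud is tried first and the local copy only serves requests when the cloud call fails. Keeping the local copy in sync is a separate problem this sketch does not address.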

Update: Amazon has resolved the issue, saying in a post:

We’ve resolved this issue, and performance is returning to normal levels for all Amazon Web Services that were impacted. We apologize for the inconvenience. Please stay tuned to this thread for more information about this issue.

The question now is whether folks view this spell as mere growing pains or something larger to worry about.

Update 2: Suggestion of the day from an Amazon customer:

A health monitor would be useful — something to show what amazon thinks the status of the services are and to post official information. Maybe even proactive alerts or something I could tie our other infrastructure notifications into so I could be proactive in alerting our downstream affected users.

That idea isn’t original, but it is pretty handy. After a series of outages, Salesforce.com created a similar dashboard.
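As a rough sketch of the customer's suggestion, a health monitor boils down to a record of each service's status plus a way to push changes into other notification systems. The `StatusBoard` class and the service name below are hypothetical and do not describe any dashboard Amazon or Salesforce.com actually runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StatusBoard:
    """Tracks a status string per service and notifies subscribers on changes."""
    statuses: Dict[str, str] = field(default_factory=dict)
    subscribers: List[Callable[[str, str, str], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[str, str, str], None]) -> None:
        """Register a callback invoked as callback(service, old, new)."""
        self.subscribers.append(callback)

    def report(self, service: str, status: str) -> bool:
        """Record a status; return True and notify only if it changed."""
        previous = self.statuses.get(service, "unknown")
        if previous == status:
            return False
        self.statuses[service] = status
        for notify in self.subscribers:
            notify(service, previous, status)
        return True


board = StatusBoard()
alerts: List[str] = []
# A downstream system would hook in here, e.g. to email affected users.
board.subscribe(lambda service, old, new: alerts.append(f"{service}: {old} -> {new}"))

board.report("storage", "ok")
board.report("storage", "degraded")
board.report("storage", "degraded")  # unchanged, so no alert is sent
print(alerts)
```

Notifying only on a change of status keeps downstream users from being flooded with repeat alerts during a long outage.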

Larry Dignan is Editor in Chief of ZDNet and Editorial Director of ZDNet sister site TechRepublic.


  • Talkback
  • Most Recent of 15 Talkback(s)
Still, small to mid-size business would have a very hard time providing 99% uptime. The cost for them would be prohibitive.
Posted by: DonnieBoy Posted on: 02/19/08
What's the saying  Glados | 02/15/08
Duh,,,,  croberts | 02/15/08
Electricity does not work 100% of the time either, and, with no electricity  DonnieBoy | 02/15/08
Electricity does not work 100%  misceng | 02/16/08
You have to remember we rely 100% on electricity as well. Electricity also  DonnieBoy | 02/15/08
Interesting point. what are the expectations  Larry DignanZDNet Moderator | 02/15/08
How about reliability as good as electricity, and also offline support  DonnieBoy | 02/15/08
Cloud computing:  Userama | 02/15/08
The same couldbe said for those depending on electricity. Remember the  DonnieBoy | 02/15/08
RE: It's a friction(less) problem: Expect more like it  BobWarfield | 02/15/08
RE: What happens when the cloud doesn't work?  seanjohn23@... | 02/15/08
99% SLA == 88 hours downtime a year  georgeou | 02/15/08
RE: 99% SLA == 88 hours downtime  bfilipiak@... | 02/18/08
But, for retailers, they have the same reliability problems if they try to  DonnieBoy | 02/19/08
Still, small to mid-size business would have a very hard time providing 99%  DonnieBoy | 02/19/08
