Power outage hits Craigslist

Filed under: Random Thoughts — barmijo — July 25, 2007 @ 5:59 pm

A power outage around 1 PM yesterday in the San Francisco area hit the data center at 365 Main, home to Craigslist among others. The outage lasted some 45 minutes before the backup generators kicked in, but when power was restored the Craigslist staff found they had corrupted databases to rebuild. That work lasted until at least dawn. In a post responding to user inquiries about the outage and the lack of backup power, a staffer writes:

Our colo charges us a serious amount of money to provide continuous uninterrupted power 24/7/365, even in the event of a blackout. Heck, they even say that their building is earthquake proof. They have huge backup generators, with two levels of failsafes, which they test monthly. Of course, during the power outage in downtown SF yesterday, these highly touted super backup generators failed to kick in.

Lots of big sites that share the facility with us were also down during this time, including LiveJournal, Alexa, Sun and CNET. Unfortunately for us, our DBs did not like having the power suddenly cut mid-write (mentally multiply these writes by the 1000s of people who are simultaneously posting). Once power was restored, and the DBs were brought back up, they were all corrupted, and had to be rebuilt.

Our system administrators spent over 12 hours working continuously on this problem, some of them working past dawn. They deserve some serious kudos for getting the site back with almost no data loss, in a relatively small (considering the circumstances) amount of time.

Our “guaranteed power” colo has some serious explaining to do. To us, and plenty of other companies who pay them so that incidents like yesterday don’t ever happen.

All of which to answer your original question: We do try to have our website available continuously, and actually shell out the money to host our machines at one of San Francisco’s top facilities. Obviously, that’s not good enough, and we already have plans in place to improve this situation.

Stories like this keep me excited about utility computing. Building fault tolerance into systems is incredibly difficult and expensive. The Craigslist team spent extra money and manpower doing everything by the book to avoid downtime, but still ended up with a significant outage. Ultimately, only having redundant operating copies of an app is sufficient to ensure availability, but that’s been horribly expensive and difficult. Utility computing changes that.
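To put rough numbers on the redundancy argument, here is a minimal sketch of the standard availability arithmetic. It assumes each copy of the app fails independently of the others, which only holds if the copies do not share a facility or a power feed. The 99% single-copy figure and the function names are illustrative assumptions, not measurements from 365 Main or Craigslist.

```python
# Rough availability arithmetic for redundant copies of an application.
# Assumption: copies fail independently (e.g. they sit in different data
# centers). Copies sharing one building's power do NOT meet this assumption.

HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, copies: int) -> float:
    """Probability that at least one of `copies` independent copies is up."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # All copies must be down at once for the app to be unavailable.
    return 1.0 - (1.0 - single) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single_copy = 0.99  # hypothetical figure, chosen only for illustration
    for copies in (1, 2, 3):
        a = combined_availability(single_copy, copies)
        print(f"{copies} copies: availability {a:.6f}, "
              f"about {downtime_hours_per_year(a):.2f} hours down per year")
```

The point of the sketch is the shape of the curve: each added independent copy multiplies the chance of a total outage by the single-copy failure rate, which is why a second running copy in another facility helps more than a better generator in the same one.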

Greed and competition

Filed under: Random Thoughts — barmijo — July 20, 2007 @ 7:56 pm

Achieving mobility of data and applications will be necessary for the next major thrust of the internet. Roadblocks to mobility are sprouting everywhere as one after another the internet giants create APIs of their own. How we meld these APIs going forward I won’t venture a guess at, but Simon Wardley has a post on the merits of open source and standards in this effort titled Competition, not greed, is good and it’s worth a read. You may see in the comments that he and I have a slight difference of view on this, since I’m an open source skeptic, but that doesn’t make his view any less informative.

Has the Businessweek cover curse hit Digg?

Filed under: Random Thoughts — barmijo — @ 6:27 pm

Businessweek has a bit of a reputation for a cover curse, but I doubt that was on Kevin Rose’s mind when they scheduled the photo shoot for their August 14, 2006 cover. I know I wasn’t thinking about it when I saw the magazine either. Though I’ve never been a big Digg user, I’ve met Kevin at conferences a couple times and he’s a great guy.

However, the cover and the curse did come to mind today when I was running some typical queries on Alexa and happened to look up Digg’s traffic statistics. According to Alexa, Digg peaked just a few months after the cover and has slipped significantly ever since.

To be sure, Digg is still ranked the 121st most popular site as of today, so this isn’t to suggest they’re in any trouble. Still, they used to be ranked #60 and the trend over the past few months is clear and appears to be accelerating.

Secure Services in the Cloud

Filed under: Random Thoughts — barmijo — July 16, 2007 @ 2:52 pm

Christofer Hoff has a posting on delivering services using utility computing and new development tools like GME that’s well worth a read. Yes, it includes 3tera, but it isn’t all about us ;-)

Can SaaS be delivered as an appliance?

Filed under: Random Thoughts — barmijo — July 13, 2007 @ 11:50 pm

Last month Phil Wainewright at ZDNet wrote a post about SaaS as an appliance. I didn’t think much about it until I read a related post from Ian Thomas that got me searching and I found this from Ross Mayfield.

After some noodling I’ve come to the conclusion that SaaS as an appliance is a bit of a misnomer. Why? Well, at its heart the movement to SaaS is about the shift to offering services and not code, but the appliance model breaks that shift. I’ll cite just a few examples of what I mean.

First, the model only works for relatively small pieces of software and not apps like ERP or CRM. I think this is why Ross’s efforts at Socialtext have been successful. Second, the recent posts are all about virtual appliances, but that implies the IT shop has to build and maintain the infrastructure, in this case a Xen or VMware farm and SAN and possibly a database. Plus, IT will have to be responsible for backups. Last, as Ross found out, once the app is inside the firewall, the ISV isn’t likely to have carte blanche for upgrades, so they’ll be forced to work out upgrade windows.

There are other implications for the ISV. IMHO support costs will be higher because in addition to supporting many different releases simultaneously you’ll also have installation, upgrade and compatibility issues to contend with.

Since here at 3tera we build utility computing software I have to admit I’m biased, but SaaS as an appliance really sounds like marketing hype to me.

IBM’s second life in the data center

Filed under: Random Thoughts — barmijo — July 12, 2007 @ 10:57 am

IBM has built a data center in Second Life. Being a geek, I found this a fascinating development, but unfortunately, based on the video, it looks to be just a sales ploy with no real technology backing it up.

Last year we played around with the idea of putting a data center in Second Life that users could actually control – after all, that’s what we do. Users would be able to build infrastructure in Second Life and it would be backed up in real-time by resources at one of our partners. Alas, though it’s an interesting idea, there just haven’t been cycles to implement it.

Personal downtime

Filed under: Random Thoughts — barmijo — @ 10:50 am

Downtime for systems – a bad thing. Downtime for people – priceless.

OK, it’s an overused line, but true nonetheless. After well over a year with no serious vacation, I spent the past week and a half with family in the Midwest. A week on the shore of Lake Erie boating with the kids and setting off a few hundred dollars worth of fireworks (something we can’t do in California). With the exception of a couple dozen mosquito souvenirs it was just the battery recharge I needed.

This blog is powered by WordPress running on AppLogic standard LAMP cluster.