Why has true Utility Computing taken so long to achieve?

Filed under: Random Thoughts — barmijo — December 1, 2005 @ 12:00 am

I’ve spoken with many people who, when I tell them I’m working on utility computing, scoff that true utility computing is either a pipe dream or a decade away. Unfortunately, these folks aren’t alone. InfoWorld declared utility computing “a dream deferred,” and you can find many other articles expressing dissatisfaction with vendor offerings so far.

Trust me, I understand and share the frustration. After all, the first time I heard the term utility computing was at an InfiniBand Trade Association conference more than five years ago, yet in my humble opinion I still haven’t seen a product or service that delivers on the promise. Most utility computing offers are still little more than a wrapper of service around hosting and outsourcing, with a contract designed to lock in the customer. Hardly a utility. So, what’s caused the delay? I believe that until recently we were missing a basic capability necessary to make utility computing a reality.


A fundamental missing building block

Just over five years ago I had the good fortune to co-found Topspin Communications with a great bunch of compatriots. After the dust settled on the dotcom implosion we set out to build a switching system that would be the core of a new breed of data centers. Our system would provide a single fabric combining traditional network connections with storage and interprocess connections. On top of the fabric ran a connection system called V-Frame that dynamically applied the policies and logical connections defined by the operator to establish physical connections. If successful, we hoped to make hardware resources interchangeable through software. We didn’t envision it as utility computing, but more simply as a way to greatly simplify the networking of data centers.

While defining Topspin’s architecture, we drew thousands of pictures to describe how customers would use the system. A great many of those pictures, though, had to be scrapped because we didn’t know how to build the software to implement them. Such situations are common for startups, or at least for mine, but this time there was a definite pattern. The pictures in question all involved migrating live connections. For instance, if a server running Apache has a fan failure we can assume that within a short time the server will fail thanks to the very efficient heaters produced by Intel. With our flexible network fabric we hoped we could migrate the connections from that server to another instance of Apache. It’s a simple matter to stop an application, move the connection and restart all the software associated with it. However, we wanted to migrate connections live, and although we could move a connection through our fabric, the software installed on the end systems wasn’t capable of dealing with dynamic migration.

Hence, the root of our problem was that PC server software stacks, whether Linux or Windows, were built on the fundamental assumption that they had physical hardware beneath them and that they had complete control of that hardware. As a result, without a new layer in the operating software stack, connections and software remained bound to hardware. Of course, not all software systems share this assumption; mainframe software has had a built-in assumption of virtualization for many years. Looking back now, I can see that I was afflicted with a disease common to networking professionals - we avoid touching the software on the end systems. Because of this we’re always trying to infer what a packet or connection is and what to do with it based solely upon what we can snoop off the wire. I helped build many systems that successfully did this, such as load balancers and firewalls, but ultimately it’s a limited approach. And getting more limited all the time.

Enter VMware

Virtual machine technology has convinced me that snooping the wire is a dead end. After all, when each wire can have dozens of virtual servers at the end, the real network isn’t on the wire any longer. Once a packet hits the wire it’s old news. VMware was in its infancy when we started Topspin so we didn’t have access to virtual machine technology. If we had, all our scrapped pictures might have made more sense.

In fact, that’s precisely what happened in late 2004 when I met with the founders of 3TERA. They were investigating building a shared memory system to allow scaling a system beyond one physical server; what’s commonly known as a single system image. Xen 2.0 had just been released and they were testing it, so it wasn’t long before our discussions turned to what virtual machines could enable. In fairly short order we were drawing pictures I recognized - they were the same pictures I’d drawn almost half a decade earlier.

Virtual machines as implemented by VMware and Xen provide an excellent abstraction layer between the operating system and the hardware. All the hardware. In fact, the abstraction is so good that a virtual machine can be suspended and restarted or migrated to different physical resources with full state integrity. If you’re paying close attention you’ll note the foregoing is only true if the network connections remain valid, but as I’ve noted, I’m already comfortable migrating connections live. Therefore, virtual machines were the last building block missing before someone could truly begin building a commodity utility computing service.

Of course, virtual machines by themselves don’t enable utility computing, despite what some vendors would have you believe. As a developer you can’t arbitrarily drop a set of virtual machines on a grid and have an application up and running.

Moving forward again

The next step is a system that understands the definition of the infrastructure your application needs and can create that infrastructure dynamically on a grid before starting virtual machines. Such systems are coming sooner than the pundits would have you believe, and utility computing will finally fulfill its promise.
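To make that idea a little more concrete, here is a minimal sketch in Python of what "understanding the definition of the infrastructure" might look like. Everything in it is hypothetical and invented for illustration: the `InfrastructureSpec` class, the `plan_deployment` function and the resource names are not taken from any real product. It only shows the ordering described above, where networks and storage are created on the grid first and virtual machines are started last.

```python
from dataclasses import dataclass, field


@dataclass
class InfrastructureSpec:
    """Hypothetical description of what an application needs from the grid."""
    networks: list[str] = field(default_factory=list)
    volumes: list[str] = field(default_factory=list)
    # Maps a virtual machine name to the networks and volumes it attaches to.
    machines: dict[str, list[str]] = field(default_factory=dict)


def plan_deployment(spec: InfrastructureSpec) -> list[str]:
    """Return provisioning steps with all infrastructure ahead of any VM start."""
    known = set(spec.networks) | set(spec.volumes)
    steps = [f"create network {n}" for n in spec.networks]
    steps += [f"create volume {v}" for v in spec.volumes]
    for name, attachments in spec.machines.items():
        # Refuse to start a VM whose connections were never defined.
        missing = [a for a in attachments if a not in known]
        if missing:
            raise ValueError(f"{name} needs undefined resources: {missing}")
        steps.append(f"start vm {name} attached to {', '.join(attachments)}")
    return steps


# Illustrative definition: one Apache VM on one network with one volume.
spec = InfrastructureSpec(
    networks=["frontend"],
    volumes=["site-data"],
    machines={"apache-1": ["frontend", "site-data"]},
)
for step in plan_deployment(spec):
    print(step)
```

The point of the sketch is only that the application's definition, not the operator, drives the order of operations. A real system would of course have to talk to an actual grid rather than print strings.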

