Science without theories?

Filed under: Science — barmijo — June 29, 2008 @ 1:08 pm

Kevin Kelly has a post on The Technium about whether Google-sized data sets will lead to a new way of doing science:

“There’s a dawning sense that extremely large databases of information, starting in the petabyte level, could change how we learn things.”

Kevin’s post was pointed out to me by Jonah Stein, and it caught my attention because a couple of AppLogic users are starting to build very large data sets, and as a result we’re occasionally asked how to access and distribute them. However, the post is really less about the technology of dealing with these data sets and more about whether they’ll change the way science is conducted, which is another subject I’m interested in, so I read on. Kevin’s post was inspired by a Wired cover story, “The End of Theory,” by Chris Anderson, who writes:

“There is now a better way. Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.”

So, Chris’ theory is that if we have sufficiently large data sets we’ll be able to act merely on the finding of correlation rather than waiting to understand the actual mechanism that relates the data - the cause and effect. However, IMHO data without understanding isn’t knowledge. In fact, it can be dangerous because the conclusions reached through statistics are very susceptible to influence by what data is collected.
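As a toy illustration of that last point, here is a minimal Python sketch on purely synthetic data (nothing here comes from the studies or articles mentioned in this post). The names `z`, `x`, `y`, and all the probabilities are made up for the example: a hidden factor `z` drives both `x` and `y`, while `x` and `y` never influence each other. If `z` is not collected, the two still look correlated.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def simulate(n=20000, seed=0):
    """Generate (z, x, y) rows where the hidden factor z influences x and y.

    x and y are drawn independently of each other once z is fixed.
    All probabilities below are arbitrary, chosen only for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0
        y = 1 if rng.random() < (0.3 if z else 0.05) else 0
        rows.append((z, x, y))
    return rows


def corr_xy(rows):
    """Correlation between the x and y columns of the given rows."""
    return pearson([x for _, x, _ in rows], [y for _, _, y in rows])


rows = simulate()

# What you see if z was never collected: x and y appear related.
overall = corr_xy(rows)

# What you see once z is recorded and the rows are split by it.
within = {z: corr_xy([r for r in rows if r[0] == z]) for z in (0, 1)}

if __name__ == "__main__":
    print("correlation ignoring z:", round(overall, 3))
    for z, c in within.items():
        print("correlation within z =", z, ":", round(c, 3))
```

With `z` left out of the data set, `x` and `y` show a clear positive correlation. Once the rows are split by `z`, the correlation within each group is close to zero, which is the sense in which the conclusion depends on what data was collected.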

In the ’70s, Harvard Medical School researchers had reams of data correlating breast feeding infants with juvenile cancer. Simply accepting the statistics could have led to some horrible decisions. Fortunately, they didn’t accept the statistics, and further research showed that breast feeding doesn’t cause cancer, but that carcinogens in the mother’s environment and diet were being passed through to the baby.

The eugenics movement a century ago was based on statistics correlating skull shape of different races with intelligence test scores. This pseudoscience was in part responsible for Nazi atrocities. Of course, we understand now that the intelligence tests were geared towards white Europeans, but without that knowledge, the statistics seemed compelling to people of that era.

Even the low birthrate issues in Europe today could provide miscues if you just accept the statistics. Is the cause that people are less religious than in previous generations? Is it that government subsidies are higher than in other countries, or that they’re lower than 50 years ago? Is it pollution or economics? There are statistics to prove each of these.

In pure mathematical sciences, huge data sets will provide scientists with interesting insights on where to focus their research. Used properly, they will provide a sort of shortcut to new theories to test, and those theories will provide feedback into the system on what data to collect in the future.

Off to LT Pact

Filed under: Events — barmijo — June 25, 2008 @ 7:00 am

I’m headed to LT Pact, Layered Tech’s customer gathering in Vegas, for the next couple of days. Layered has been a close partner and puts on a great event, with folks from all over the world interested in delivering web applications. Most, of course, host with Layered, but not all. It’s an open event, registration is free, and you don’t have to be a customer to attend.

The only downside is this means I’m in an airport AGAIN, for the third time in as many days, enduring that infinitely un-American intrusion into my life that is TSA. This morning’s line extended more than 100 yards outside the terminal itself. If either presidential candidate would commit to ending this assault on our civil liberties they’d have my instant support. Any takers???

Toward a cloud computing standard

Filed under: Cloud Computing — barmijo — @ 6:49 am

Last year in a flurry of blog posts, there were several folks pushing to start work on a cloud computing standard. At the time I responded that I thought it was too early, because the realm of possibilities needed to be explored further. Plus, IMHO the best standards don’t document what is, but provide a framework for future work.

Well, I’m happy to say that I think the time has come: we now have enough companies in the space working on creative products and services that a standard can progress productively. We’ve begun to share our vision for what that standard can achieve. It’s called Cloudware, and it covers not only AppLogic but a whole new way to approach infrastructure.

Over the next couple of months I’ll be reaching out to companies that may be interested in participating in a standards effort, as well as looking into what the right venue is for the work. I’ve worked on a few standards in my career (SCSI, QIC, 1000BASE-X, 10GBASE-X, InfiniBand) and have found that truly open bodies that foster broad participation create the best standards, though they require the most work. For this effort I’d love to see not only the folks who label themselves cloud computing today, but also data center operators, networking vendors, server vendors, ISVs and folks like APC who deal in power infrastructure. If you’re interested, please email and I’ll keep you up to date.

Velocity Conference in SF

Filed under: Cloud Computing, Events — barmijo — June 23, 2008 @ 12:11 pm

A few of us from 3tera will be at O’Reilly’s Velocity Conference in San Francisco for the next couple of days. In fact, this will be a pretty busy week. In addition to Velocity, we’ll be presenting at Cloud Camp and LT Pact, and will be attending the Structure 08 conference as well.

Dennis Barker talks cloud computing with 3tera

Filed under: Cloud Computing — barmijo — @ 11:14 am

I talked with Dennis Barker last week about Cloudware and he did a nice write-up in GRIDtoday covering the work 3tera’s doing to open cloud computing up. It’s well worth a read.

Agathon Group becomes 3tera partner

Filed under: AppLogic — barmijo — June 20, 2008 @ 10:50 am

Our latest partner, Agathon Group, has begun offering AppLogic-based services, and they’ve put an innovative portal in front of the system so customers can sign up online and get down to business right away. A few partners have discussed creating such a portal, but Agathon is the first to put one in production. Plus, not only can you select the service, but using sliders you can specify the amount of CPU, memory and storage. It’s quite well done and worth a look.

Also, in addition to grid-based virtual servers and virtual data centers, Agathon is offering complete scalable LAMP and Ruby stacks, something quite a few users have requested, so I’m sure they’ll do well with it.

John Willis kicks aaS!!!

Filed under: Cloud Computing — barmijo — @ 10:50 am

John Willis posted a satirical kick in the aaS at the myriad of folks trying to reduce cloud computing to X-as-a-service. It’s a must-read for anyone following the cloud computing space.

New 3tera team members

Filed under: 3tera — barmijo — June 19, 2008 @ 11:24 pm

It’s been WAY too long since I’ve found time to write. If you ever start a company, be prepared - a packed calendar is a requirement for success.

Since the last time I posted we’ve added three new members to the 3tera team:

Sean Mulvaney came on board last month as an account manager. Sean’s spent years in the software business and is quickly getting up to speed bringing new users into the cloud.

Shubham Gupta joined the engineering team as a summer intern and got dropped right in the middle of the Windows build-out.

Joseph Dempsey is now an integral part of our support operations. He’s got many years of Linux and Windows operations experience and if you use AppLogic you’re sure to encounter Joseph posting on the forums.

We’re growing quickly and have several more openings in support, engineering, operations and sales. Care to take a walk in the cloud?

What kind of cloud are you using?

Filed under: Cloud Computing — barmijo — @ 11:03 pm

Alistair Croll has an interesting post on GigaOM’s Refresh the Net about understanding the various types of cloud computing that’s worth a read. He tries to break down cloud computing along two axes: whether you or the service provider gets to decide what software you run, and where the resources are located. He ends up with two classifications: development clouds, where the provider selects the stack, and operations clouds, where you select your software.

IMHO, though, Alistair has fallen into a trap laid for him by dozens of other bloggers and vendors: accepting the idea that anything run outside your data center is cloud computing. This notion, which started with folks relabeling SaaS as cloud computing, eventually led to the explosion of XaaS acronyms.

I have to admit, it’s been easy to fall into this pattern, and I’ve even caught myself doing it. At the root of this confusion, I believe, is that many new cloud computing services cropping up today are really built from old infrastructure. On one hand, several companies have tried to copy EC2 by offering virtual machines provisioned through an API. On the other hand, many services have cropped up offering hosted platforms, essentially shared software stacks deployed as a cluster.

With the proper technology, this type of tradeoff between raw virtual machines and hosted platforms isn’t needed. Agathon Group, a new data center partner of 3tera, is using AppLogic to offer not only virtual machines and prebuilt software stacks, but also full virtual data centers. And, of course, they’re doing all this from a single physical infrastructure.

For cloud computing to truly succeed, it requires real technology innovation. As new services come out the difference will become clear and the terminology confusion we’re experiencing today will subside.

This blog is powered by WordPress running on AppLogic standard LAMP cluster.