Jason’s interviewed in ACM Queue

I’m in the January/February 2008 edition of ACM Queue. It’s a conversation between Bryan Cantrill and me about a number of things, including virtualization, Facebook applications, Ruby on Rails, and what data store backends should look like by the time we hit the year 2037.

The PDF of the print copy is here.

Thank you to everyone involved, and to the rest of the Fishworks guys and gals for sharing their office that morning.

Fermions, Bosons and the 6 Utilities

When I used to teach university chemistry, I’d always start with the statement:

The universe (at one level) is made of two things and two things only: fermions and bosons.

Fermions are the things that have “stuff”: they have mass and can be charged (or not). Bosons, in this simplified picture, are the things that have no “stuff”: the photon, for example, has neither mass nor charge. Bosons in many ways are the things that move fermions. This comes from quantum mechanics, where fermions have half-integer spin (1/2 for electrons and quarks) and bosons have integer spin (1 for the photon). This is the baseline, and the binary division is given to us by the Standard Model.

We also already had an understanding of this division: Fermions are matter, and Bosons are energy. Matter is the stuff of the universe, and energy moves matter.

A simple, appealing, mutually exclusive, yin-and-yang description of things. I don’t mind things that end up being in powers of 2 or 10, or form a nice little tree.

I like to think we have a similar division in compute utilities: things that take up space (Fermions/Matter) and things that move or are the movement of stuff (Bosons/Energy).

Conceptually I group them as

The fermions

1) CPU space
2) Memory space
3) Disc space

The bosons

4) Memory bus IO
5) Disc IO
6) Network IO

These, in my mind, form the 6 Utilities on which we must have fine-grained, differential controls and metrics in a “cloud computer” that fairly serves many people. We have to understand the possible minimum and maximum values of each, and we have to figure out how to balance them all with real workloads. These are the things we watch, measure and learn from so we can ask and answer questions such as “How do I pair one customer that’s CPU-intensive with another that’s disc IO-intensive and have the sum appear just like a single, well-performing CPU- and disc IO-intensive application?”
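To make the pairing question concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Utilities` class, its field names, the idea of expressing each utility as a fraction of one host's capacity, and the sample numbers are all hypothetical, and simply adding fractions together ignores the contention effects a real scheduler would have to measure.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Utilities:
    """Hypothetical record: fraction of one host's capacity (0.0 to 1.0)
    that a workload uses on each of the 6 Utilities."""
    # The "fermions": things that take up space
    cpu_space: float
    memory_space: float
    disc_space: float
    # The "bosons": things that move stuff
    memory_bus_io: float
    disc_io: float
    network_io: float

    def combined(self, other: "Utilities") -> "Utilities":
        """Naive sum of two workloads sharing one host (ignores contention)."""
        return Utilities(
            *(getattr(self, f.name) + getattr(other, f.name) for f in fields(self))
        )

    def fits(self, headroom: float = 0.0) -> bool:
        """True if every utility stays within capacity minus the headroom."""
        return all(getattr(self, f.name) <= 1.0 - headroom for f in fields(self))

    def bottleneck(self) -> str:
        """Name of the utility closest to (or furthest past) its limit."""
        return max(fields(self), key=lambda f: getattr(self, f.name)).name


# Made-up sample workloads, not measurements.
cpu_heavy = Utilities(0.70, 0.30, 0.10, 0.40, 0.05, 0.10)
io_heavy = Utilities(0.15, 0.20, 0.50, 0.10, 0.75, 0.20)

paired = cpu_heavy.combined(io_heavy)
print("CPU-heavy + IO-heavy fits:", paired.fits(headroom=0.1))
print("Tightest utility:", paired.bottleneck())
print("Two CPU-heavy fit:", cpu_heavy.combined(cpu_heavy).fits())
```

With these sample numbers the complementary pair stays inside the host on all six utilities, while two CPU-heavy customers on the same host do not. The point of the sketch is only that the question needs a number for every one of the six, which is why the metrics have to exist before the balancing can.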

The reality is that most operating systems still don’t have a complete set of tools around the 6 Utilities in terms of resource management, QoS (quality of service), virtualization and teasing these apart in a way that serves a number of people sharing physical resources. Operating systems are still basically built for a single person using a single “computer” at a time, and there are real challenges around saying that we should just use BIG servers and divvy them up. There are even challenges around many cores and lots of RAM.

I wonder if we can in fact have a single purpose operating system that serves both the single user and the “cloud”, and based on the work we’ve been doing with OpenSolaris, I’d say “No”.