HTTP FTW

I was reading “The Web Is Dead. Long Live the Internet” today, along with a good response on GigaOM.

To a number of people, “the web” = HTTP. And HTTP as a protocol on the Internets has clearly won.

The fun thing to notice is that “the web” to Chris Anderson and Michael Wolff is just content delivered on a web site. Video and peer-to-peer traffic are still, on average, carried over what protocol?

That’s right!

HTTP.

So while content is diversifying on their web, HTTP is clearly the winner.

The “machine” needs to die

I was reading this.

The “computer machine” as our base unit of work is a shitty unit.

What I typically want is:

  1. Agility and flexibility
  2. Performance and scale
  3. Business continuity, with a resource-pricing point of view for dev, test, staging and DR
  4. Business and security best practices baked into infrastructure

You can do agility and flexibility with virtual machines. But that’s it.

Virtual “machines” suffer from the same fundamental problems as “physical machines”.

1) VMs still take up space just like PMs, and the space they take up is additive. A machine is a machine, whether logical or physical. You cannot do business continuity, dev and test for the cost of production. It’s normal to figure out what a piece of software needs for production and then buy 20x that all at once to account for everything one might need, even for an application that will never scale beyond the initial deployment. Run production on 1/20th of what you bought and you are at 5% average utilization. VMs are not in line with the idea of having software fully utilize a server.

2) Performance and scale cannot, will not, and are not a unique feature of a pure VM approach. They can’t be, any more than a square peg can fit into a round hole. The same wall that you hit with physical machines, you will hit with virtual machines. Except. You. Will. Hit. It. Sooner. If you’re not hitting it, you’re not big enough, so maybe don’t worry about it: you’re likely just concerned with agility and flexibility.

You don’t buy CDN services by the “VM”. We need to move compute to a utility model built around a transaction, around the usage of a specific resource. Everything else needs to be minimized.

To be clear about the problem, and to leave you with some food for thought: I can take two racks of servers with two 48-port non-blocking 10 Gbps switches at the top of each, and then write a piece of software that will saturate the lines between those racks.
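To put rough numbers on that (a back-of-the-envelope upper bound, assuming the full port count could be driven between the racks):

  2 switches × 48 ports × 10 Gbps = 960 Gbps per rack

That is two racks of commodity gear pushing close to a terabit per second between them, which makes the next two questions interesting.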

Can someone name a web property in the world that does more than a Tbps?

Can someone name one that gets close and only uses 20 servers for its application tier?

We have massive inefficiencies in our software stacks and hardware has really really really really outpaced what software can deliver. And the solution is what? More layers of software abstractions? More black boxes from an instrumentation point of view? Odd.

But familiar.

Oracle and OpenSolaris: A Kernel of Truth

@nevali on Twitter asked a question that we’ve heard from many customers, so I’m writing a response to everyone, though none of you need to worry. His question is, “As a long time Open Solaris stalwart, I do wonder what @Joyent’s perspective on the post-Oracle-takeover world is.”

In many ways, we’re happy to have seen Oracle and Sun combine. Sun was a great company for technologists and Oracle is tremendously good at operating a business. Oracle may prove to be the management team that can turn around Sun’s fortunes. And I think they’re completely committed to the Solaris kernel.

A lot of people think of OpenSolaris™ when they think of Joyent, and that’s reasonable, since it’s the most well known open source distribution of the Solaris kernel. But in truth, Joyent has never used OpenSolaris™. OpenSolaris™ is a full operating system, a “distribution” containing both a kernel and a userland (along with packaging tools); the name is a marketing term for that full distribution. There are a number of features in there that we’ve simply never cared about: for instance, we have no need to allow laptops to sleep. Since 2005, Joyent has been taking the open source Solaris 11 kernel, plus a couple of binary bits, and combining it with a Solaris build (that we maintain) of NetBSD’s pkgsrc. Combining a BSD set of tools with the rock-solid Solaris kernel gave us a foundation with the best of both worlds: a functional userland, plus access to DTrace, ZFS and Zones.

So given Oracle’s commitment to the Solaris kernel, and the way we’re using it in SmartOS, we’re actually very well aligned with Oracle. Also, we’ve been working and will continue to work to make our base operating system a completely open operating system, and we are aligned with and believe in the vision behind the Illumos project.

If you have any particular questions, comments or concerns in this area, please feel free to let me know directly at jason@joyent.com and I’ll make sure they get addressed.

Yahoo Post: “Multi-Core HTTP Server with NodeJS”

The Yahoo! Developer Blog has a nice post about how they’re running node.js.

A good comment on Hacker News:

Node.js lets you write server applications, in a server container that can handle tens of thousands of concurrent connections, in a loosely typed language like JavaScript, which lets you code faster. It uses the same event-driven design as Nginx, which is why it can handle so many connections without a huge amount of memory or CPU usage.

If you were to do this on Nginx, you’d have to write the module in C.

You can’t do it on Apache because of Apache’s multi-process/thread model.

The fact that you can write a web server in a few lines of easy-to-understand, easy-to-maintain JavaScript that can handle over 10,000 concurrent connections without breaking a sweat is a breakthrough.

Node.js may do for server applications what Perl did for the Web in the ’90s.
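For a sense of what “a few lines” means, here is the canonical hello-world HTTP server, essentially node.js’s own synopsis example (the port number is arbitrary):

  // A complete HTTP server: every connection is handled by callbacks
  // on a single event loop, with no thread or process per connection.
  var http = require('http');
  http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
  }).listen(8124);
  console.log('Server running at http://127.0.0.1:8124/');

That’s the whole server. The event loop is what lets it hold thousands of open connections at a cost of roughly a callback and a little state per connection.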

EMC bought Greenplum

EMC said today that it will acquire privately held data warehousing company Greenplum in an all-cash transaction, though the terms of the deal were not released. It said that Greenplum will “form the foundation of a new data computing product division within EMC’s Information Infrastructure business.”

It’s no secret that digital data is on the rise, both on business and consumer levels. EMC called Greenplum a visionary leader that utilizes a built-from-the-ground architecture for analytical processing. In a statement, Pat Gelsinger, President and Chief Operating Officer of EMC’s Information Infrastructure Products, said:

The data warehousing world is about to change. Greenplum’s massively-parallel, scale-out architecture, along with its self-service consumption model, has enabled it to separate itself from the incumbent players and emerge as the leader in this industry shift toward ‘big data’ analytics. Greenplum’s market-leading technology combined with EMC’s virtualized Private Cloud infrastructure provides customers, today, with a best-of-breed solution for tomorrow’s ‘big-data’ challenges.

The company said it expects the deal to be completed in the third quarter, following regulatory approval. It is not expected to have a material impact on EMC’s fiscal 2010 GAAP and non-GAAP earnings.

From this ZDNet article.

I actually think that in 7-10 years, this acquisition could be as important to EMC as its VMware acquisition. Remember: the past was “cloud networking”, the present is “cloud computing” and the future is “cloud data”. Virtualization is not the end-all-be-all of “cloud computing”, but it is a component. Think of these types of data stores as an important component in the future of distributed, pervasive data.

On Solaris

Over the years that I’ve been developing on unix platforms I’ve come into contact with quite a few: Linux (from 2.2 upwards), FreeBSD (version 4 upwards), Solaris (8, 9 and 10, on both SPARC and x86), OS X, AIX, HP-UX and even VMS. But apart from Linux, FreeBSD and OS X, I’ve not really gotten to spend real quality time with any of them.

Since starting at Joyent it was obvious that I was going to be spending quite some time with OpenSolaris (Nevada) on x86, and there are several things I’ve come to love:

1) SMF

SMF is the Solaris Service Management Facility, and it maps to the functionality of things like init, launchd and the like. SMF does automatic service restarting, loads services in dependency order, provides built-in logging and hooks into monitoring and notifications. You can add new services easily by creating your own XML manifest, and have your own user daemons managed by a fantastic tool. To find out more about SMF, visit Solaris’ Managing Services Overview.
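As a rough sketch, a minimal manifest for a hypothetical daemon might look something like this (the service name, binary path and timeouts here are made up for illustration):

  <?xml version="1.0"?>
  <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
  <!-- Hypothetical service: name and paths are illustrative only -->
  <service_bundle type='manifest' name='mydaemon'>
    <service name='site/mydaemon' type='service' version='1'>
      <create_default_instance enabled='false'/>
      <single_instance/>
      <!-- don't start until the network milestone is reached -->
      <dependency name='network' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/milestone/network:default'/>
      </dependency>
      <exec_method type='method' name='start' exec='/opt/local/bin/mydaemon' timeout_seconds='60'/>
      <exec_method type='method' name='stop' exec=':kill' timeout_seconds='60'/>
    </service>
  </service_bundle>

Import it with svccfg import mydaemon.xml, enable it with svcadm enable site/mydaemon, and from then on SMF handles restarts, dependencies and logging for you.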
