The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, May 22, 2003

The Google Cluster

Few Web services require as much computation per request as search engines. On average, a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles. Supporting a peak request stream of thousands of queries per second requires an infrastructure comparable in size to that of the largest supercomputer installations. Combining more than 15,000 commodity-class PCs with fault-tolerant software creates a solution that is more cost-effective than a comparable system built out of a smaller number of high-end servers.

Via the Google Weblog, Web Search for a Planet: The Google Cluster Architecture

This is a good introduction to the Google Cluster, the 15,000-plus machines (as of the writing of the paper, I'm sure) that make up the Google website and give it its incredible performance.

One of the ways they do this is to have a series of clusters (of a few thousand machines each) located around the world to handle queries more or less locally; I did a DNS query from the Facility in the Middle of Nowhere and got one address, and a DNS query from a machine in Boston gave a different one.

Some other interesting aspects: they forgo hardware reliability in favor of software reliability, they don't use the fastest hardware available but the hardware that gives the best price/performance ratio, and they use lots of commodity hardware.
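For anyone who wants to repeat the DNS comparison mentioned above, here is a minimal Python sketch. It is not what was used for the queries in this entry; the function name `resolve_ipv4` and its `resolver` parameter are made up for illustration, and it relies only on the standard library's `socket.getaddrinfo`.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses reported for hostname.

    `resolver` is a hypothetical hook (defaulting to socket.getaddrinfo)
    so the function can be exercised without touching the network.
    """
    # Each result is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (address, port) pair.
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_ipv4("www.google.com")` (the hostname is an assumption) from machines in two different places and comparing the lists should show whether each location is being steered to a different cluster. The answer depends on which DNS server each machine asks, so results will vary.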

The paper doesn't go into deep technical details, but it does give a nice overview of how their system is set up.

Copyright © 1999-2019 by Sean Conner. All Rights Reserved.