The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, August 31, 2000

Greece in three lines or less

Last month (July 21st to be precise) I received an email from my cousin from Michigan wanting to know if he got the right Sean Conner. He did and I immediately wrote back to him.

Today he wrote back with a three-line reply informing me of his three-week vacation in Greece.

Not that I expected him to respond to email while on vacation, but I would expect someone to check their email at least once in the week before leaving. Especially since he's been using computers about as long as I have.

Over fifteen million pages right here …

The publicly indexable web contains an estimated 800 million pages as of February 1999, encompassing about 15 terabytes of information or about 6 terabytes of text after removing HTML tags, comments, and extra whitespace.

Accessibility and Distribution of Information on the Web [Steve Lawrence, Lee Giles, NEC Research Institute]

I've been thinking recently about the definition of a webpage (only because the work I've done may redefine what people consider a webpage. Maybe. We'll see). A quick scan of Conman Laboratories revealed 234 files that constitute what is commonly called a webpage. 234 pages is something like 0.00003% of the indexed web (as of February 1999). Not a significant portion.

But that's only the part you see under It took a while to calculate, but has 15,620,753 pages. Yup. A lowly 486SX-33 is serving up over fifteen million pages, which works out to be almost 2% of the indexed web.

That is, if it was indexed.

But still, fifteen million pages isn't anything to sneeze at. Even more amazing is that these fifteen million pages only consume something like 5M of disk space. Uncompressed. Not bad for a bunch of two bit pages, eh? (That's a joke. A rather bad joke based upon simple math but anyway … )
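For anyone who wants to check the arithmetic behind those percentages and the “two bit” joke, a few lines of Python will do. This is only a sketch: it takes the 800 million figure from the NEC estimate quoted above and assumes “5M” means 5 × 2^20 bytes; the variable and function names are made up for illustration.

```python
# Figures quoted in the entry; DISK_BYTES is an assumption about what "5M" means.
INDEXED_WEB = 800_000_000    # estimated indexable pages, February 1999
VISIBLE_PAGES = 234          # ordinary files on the site
BIBLE_PAGES = 15_620_753     # views of the King James Bible
DISK_BYTES = 5 * 2**20       # assumption: "5M" = 5 MiB, uncompressed


def percent_of_web(pages):
    """Share of the indexed web, as a percentage."""
    return 100.0 * pages / INDEXED_WEB


visible_share = percent_of_web(VISIBLE_PAGES)
bible_share = percent_of_web(BIBLE_PAGES)
bits_per_page = DISK_BYTES * 8 / BIBLE_PAGES

print(f"{visible_share:.5f}%")               # 0.00003%
print(f"{bible_share:.2f}%")                 # 1.95%
print(f"{bits_per_page:.1f} bits per page")  # 2.7 bits per page
```

So each “page” costs between two and three bits of disk, which is where the joke comes from.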

Basically, those 15,620,753 pages are nothing more than 15,620,753 partial ways of viewing one single work, the King James Bible. There isn't anything else comparable to it on the web.

Sure, there are online bibles where you can pull out a verse, chapter or book, but none that I know of allow you to arbitrarily select which portions to read [1], which starts to stretch the definition of what a webpage actually is.

And for the record, one of the “pages” is a file telling the various search engine indexers not to index these pages.

But it could be more …

Technically, I don't allow any arbitrary portion of the King James Bible, otherwise I would be serving up 483,682,754 pages (which, if it was completely indexed, would constitute over 50% of the indexed web). There are reasons, mostly pragmatic reasons (it is a 486SX-33 after all) why I disallow purely arbitrary sections.
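A number of that size is roughly what you get by counting every contiguous run of verses. The sketch below rests on two assumptions not stated above: the commonly cited count of 31,102 verses in the King James Bible, and that an “arbitrary portion” means any starting verse through any later (or the same) ending verse. Under those assumptions the count lands within one of the 483,682,754 quoted; what accounts for the remaining page is left open here.

```python
KJV_VERSES = 31_102          # assumption: commonly cited verse count
INDEXED_WEB = 800_000_000    # estimate quoted earlier in the entry


def contiguous_ranges(n):
    """Number of (start, end) pairs with 1 <= start <= end <= n."""
    return n * (n + 1) // 2


ranges = contiguous_ranges(KJV_VERSES)
share = 100.0 * ranges / INDEXED_WEB

print(ranges)                                # 483682753
print(f"{share:.1f}% of the indexed web")    # 60.5% of the indexed web
```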

Just for old times' sake

Mark, Kelly and I, along with JeffC (a client of mine) and John the paper millionaire of a dotcom, ended up seeing Crazy Fingers for the first time since John quit the band.

It wasn't nearly as crowded as the last few times I've seen him play. And the band just sounded different, even though the only lineup difference was the keyboardist.

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site:, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.
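As a rough illustration of the day and month forms described above, here is a small Python sketch. The base address is a placeholder (https://example.com/ is not this site's real base link), and the function names are hypothetical; the format for an arbitrary span of time isn't shown because it isn't spelled out here.

```python
from datetime import date

# Placeholder only: substitute the real base link for the site.
BASE = "https://example.com/"


def day_link(d):
    """Link to all entries for one day: base + YYYY/MM/DD."""
    return f"{BASE}{d.year:04d}/{d.month:02d}/{d.day:02d}"


def month_link(year, month):
    """Link to a whole month: same as a day link with the day left off."""
    return f"{BASE}{year:04d}/{month:02d}"


print(day_link(date(2000, 8, 1)))   # https://example.com/2000/08/01
print(month_link(2000, 8))          # https://example.com/2000/08
```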

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.