The Boston Diaries

Monday, August 13, 2007

“You are in a twisty maze of passages, all alike … ”

Accordingly, this paper analyzes previously unpublished files recovered from a backup of Woods's student account at Stanford, and documents an excursion to the real Colossal Cave in Kentucky in 2005. In addition, new interviews with Crowther, Woods, and their associates (particularly members of Crowther's family) provide new insights on the precise nature of Woods's significant contributions. Real locations in the cave and several artifacts (such as an iron rod and an axe head) correspond to their representation in Crowther's version …

Via Flutterby, Somewhere Nearby is Colossal Cave: Examining Will Crowther's Original “Adventure” in Code and in Kentucky

I remember first “playing” this game in 5th grade (where “playing” involved the teacher reading the descriptions off the only Apple ][ in the school, asking us elementary students what to do next, and typing in the directions given). But it was later, in high school, that my friend Bill and I would play this game for hours (on his family's IBM PC), and I can still picture the map Bill made as we made our way through the game.

What's interesting about this report are the actual photographs of the locations in the cave system that show up in Adventure.

Fortunately (or unfortunately, depending upon your take), there are no pictures of a grue—I guess the flashbulb scared him off.

Ramblings about search engine optimizations and bandwidth utilization

For the past week or so, I've been playing around with search engine optimizations (that last link is so I know what not to do) and poring through the log files.

The last time I made a major search engine optimization to my site was four years ago, and the reason for that optimization was to get rid of the disturbing search requests that were plaguing the log files (and my mind) at the time. It also had the added benefit of reducing the amount of “duplicate content” on my site. A search engine like Google would skip indexing the monthly archives (as well as the front page) but would index the individual entries. The end result: no more disturbing search requests, and better results for people actually looking for stuff.

But it didn't reduce all the duplicate content. There was still the small problem of /2000/1/1.1 having the same content as /2000/01/01.1 (note the leading zeros). Technically, they are two separate pages, each with a unique URL, although internally, the leading zero is ignored by my blogging engine and it would happily serve up the page under either location.

Now, that particular duplicate content issue is something I've known about since I started writing mod_blog. I had code to distinguish between the two requests, but never wrote the code to do anything about it. Until last week. Now, go to /2000/1/1.1 and you'll get a permanent redirect to /2000/01/01.1. This change should further reduce the amount of “duplicate content” on my site, as well as reduce the number of hits from web spiders indexing my site. (There is one very specific condition under which the redirection doesn't happen. Fixing that pretty much requires a complete overhaul of some very old code, but it's such a seldom used bit of code that I'm not terribly worried about it.)
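For the curious, the idea behind the redirect can be sketched in a few lines. This is not the code in mod_blog, just a minimal Python illustration. The URL pattern, `canonical_entry_path()` and `respond()` are all names I made up for this example, and it assumes entry URLs always look like /YYYY/M/D.N.

```python
import re

# Assumed shape of an entry URL: /YYYY/M/D.N, where month and day may or may
# not carry a leading zero. This pattern is illustrative, not mod_blog's.
ENTRY_RE = re.compile(r"/(\d{4})/(\d{1,2})/(\d{1,2})\.(\d+)")

def canonical_entry_path(path):
    """Return the zero-padded form of an entry path, or None if it is not one."""
    m = ENTRY_RE.fullmatch(path)
    if m is None:
        return None
    year, month, day, part = m.groups()
    return f"/{year}/{int(month):02d}/{int(day):02d}.{part}"

def respond(path):
    """Return (status, location): 301 to the canonical URL, otherwise 200."""
    canonical = canonical_entry_path(path)
    if canonical is not None and canonical != path:
        return 301, canonical   # permanent redirect to the padded URL
    return 200, path            # already canonical, or not an entry URL

print(respond("/2000/1/1.1"))
print(respond("/2000/01/01.1"))
```

The point of using a permanent (301) redirect rather than quietly serving the same page is that a spider ends up with exactly one URL per entry.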

I'm a bit concerned about the spiders because of some other information I've pulled out from the log files. My archive of log files (at least, for this blog) goes back to October of 2001, and using some homegrown tools, I generated (with the help of GNUPlot) this graph of the growth of my site over the past six years:

[Graph of traffic growth at The Boston Diaries]

In red, you see the number of raw hits to this site (with the scale along the left hand side), with some explosive growth in early 2006 and again in just the last few months here. In green you see the actual bytes transferred (with its scale along the right hand side)—pretty steady up until January of 2006 when it goes vertical, and again it goes vertical in just the past few months.
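The numbers behind a graph like this boil down to counting hits and summing bytes per month. My homegrown tools aren't shown here, but a rough sketch of the same idea might look like the following. It assumes the logs are in Apache's common/combined log format, and `monthly_totals()` is a hypothetical name for this example only.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumes Apache common/combined log format:
#   host ident user [13/Aug/2007:10:00:00 -0400] "GET /path HTTP/1.1" 200 1234 ...
LOG_RE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<when>[^\]]+)\] "[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def monthly_totals(lines):
    """Map 'YYYY-MM' to (hits, bytes transferred) for the given log lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that do not look like log entries
        when = datetime.strptime(m.group("when"), "%d/%b/%Y:%H:%M:%S %z")
        size = m.group("size")
        entry = totals[when.strftime("%Y-%m")]
        entry[0] += 1                                # one more raw hit
        entry[1] += 0 if size == "-" else int(size)  # "-" means no body sent
    return {month: tuple(t) for month, t in sorted(totals.items())}
```

Writing each month out as a line of "month hits bytes" would give a file that a plotting tool can turn into a two-axis graph.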

And I'm at a loss to explain the sudden explosion of bandwidth usage on my site, unless it's a lot of people hot linking to images on this site (and yes, that does happen quite often), or a vast increase in the number of spiders indexing my site (for the past few months, Yahoo's Slurp has been generating about 40,000 hits a month).
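One way to tell those two suspects apart is to classify each hit by its user agent and referrer. The sketch below is only an illustration and rests on several assumptions: the logs are in Apache's combined format, the list of spider tokens is a short guess rather than anything complete, and `classify()`, `tally()` and the `site_host` parameter are names invented for this example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumes Apache combined log format (referrer and user agent at the end).
COMBINED_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>[^\s"]+)[^"]*" '
    r'\d{3} (?:\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

SPIDER_TOKENS = ("slurp", "googlebot", "msnbot")   # assumed, incomplete list
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".gif", ".png")

def classify(line, site_host):
    """Return 'spider', 'hotlink', 'other', or None for an unparseable line."""
    m = COMBINED_RE.match(line)
    if m is None:
        return None
    agent = m.group("agent").lower()
    if any(token in agent for token in SPIDER_TOKENS):
        return "spider"
    path = m.group("path").split("?", 1)[0].lower()
    referer_host = urlparse(m.group("referer")).hostname  # None for "-"
    if path.endswith(IMAGE_SUFFIXES) and referer_host and referer_host != site_host:
        return "hotlink"  # image requested from a page on some other site
    return "other"

def tally(lines, site_host):
    """Count hits per category, ignoring lines that fail to parse."""
    kinds = (classify(line, site_host) for line in lines)
    return Counter(kind for kind in kinds if kind is not None)
```

Summing bytes instead of counting hits per category would show which of the two is actually eating the bandwidth.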

I may no longer have disturbing search requests, but I now have a disturbing use of bandwidth.

