It was 20 years ago today

Wednesday, Debtember 04, 2019

It's amazing to think I've been doing this whole blog thing for a whole twenty years. When I started, I had been reading several “online journals” for several years and the idea of doing that myself was intriguing. As I have mentioned, the prospect of a temporary job in Boston was enough to get me started, both writing the blog, and the codebase for mod_blog, which was somewhat based on the work I did for The Electric King James Bible.

I recall spending way too much time writing the code, trying to get it perfect and worrying if I should use anchor points for intrablog links or how to automatically generate the archive page and how it should look. After nearly two years, I had had enough, did the simplest thing I could and finally released the first version of the code sometime in October of 2001. And for the record, that release of the code did not use anchor points for intrablog links (and I still don't; that was the correct call in retrospect), it didn't bother with automatically generating the archive page (and it still doesn't; I have a separate script that generates it) and this is what the archive looks like today (you can see that 2012 was the year I blogged the least).

I also don't think there's a single line of code in mod_blog that hasn't been changed in the twenty years I've been using it. I know I've done a few major rewrites of the code over the years. One was to merge the two separate programs I had into a single program (to better support the web interface I have, which I think I've used fewer than 10 times in total). I think another was to put in my own replacement for the Standard C I/O and memory allocation functions (I don't recall if my routines were in place from the start, or I later replaced the standard functions; the early history of the code has been lost in time, like bits in an EMP blast), but I did rip them out years later in another rewrite when I finally realized that was a bad idea. I switched to using Lua for the configuration file (an overall win in my book), and a rewrite of the parsing code meant that the last of the original code was gone.

But despite all the code changes, the actual storage format has not changed one bit in all twenty years. Yes, there is some additional data that didn't exist twenty years ago, but such data has been added in a way that the code from twenty years ago will safely ignore. I think that's pretty cool.
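To make the "old code safely ignores new data" idea concrete, here is a minimal sketch. The `Key: value` layout, the field names, and the sample entries are all assumptions made up for illustration; they are not mod_blog's actual storage format. The point is only that a reader which skips keys it does not recognize keeps working when newer entries carry extra fields.

```python
# Hypothetical "Key: value" metadata layout; NOT mod_blog's real on-disk format.
KNOWN_KEYS = {"title", "class"}  # the only fields this "old" reader understands


def read_metadata(text, known_keys=KNOWN_KEYS):
    """Parse 'Key: value' lines, silently skipping any key not in known_keys."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line, so ignore it
        key = key.strip().lower()
        if key in known_keys:
            fields[key] = value.strip()
    return fields


# Made-up sample entries: the second one carries a field the reader predates.
old_entry = "Title: First post\nClass: meta\n"
new_entry = "Title: Twenty years\nClass: blogging, history\nStatus: published\n"

print(read_metadata(old_entry))
print(read_metadata(new_entry))  # the unknown 'Status' field is dropped, not an error
```

Because everything is text, the same reader also sidesteps integer width and byte order entirely.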

A few things I've learned having written and maintained a blogging codebase, as well as blogging, for twenty years:

“Do the simplest thing that could possibly work” is sound advice. I was trying to figure out everything when I started writing the code and it turns out half the ideas I wanted would not have been a good idea long term. I was also taking way too long to write the code because I was trying to deal with issues that turned out to be non-issues.
The storage format is probably more important than the code. The program can change drastically (and the code today has nothing left in common with the code from twenty years ago) but I don't have to worry about the data. It also helps that everything is stored as text, so I don't have to worry about things like integer length and endianness.
All entries are stored in HTML, and always have been. Markdown didn't exist when I started blogging, and even if it had, I don't think I would have used it (I'm not a fan). By having all my entries in HTML, I don't have to worry about maintaining an ever-evolving markup language rendering previous entries unrenderable, or being stuck with a suboptimal markup format because of the thousand previous entries (or even 4,974 entries, as of the time of this entry). It does mean that rendering the blog for a non-HTML platform is a bit harder, but possible (and I'll be talking about this topic more in the near future).
My PageRank is still high enough to get requests from people trying to leech off it. Partly this is because my URLs don't change, and partly from longevity. But it's also possible because I'm not trying to game the PageRank system (which is a “tit-for-tat” arms race between Google and the SEO industry) and just keep on keeping on.
I gave up on dealing with link rot years ago. If I come across an old post with non-functioning links, I may just find a new resource, link to The Wayback Machine or (if I'm getting some spammer trying to get me to fix a broken link by linking to their black-hat-SEO laden spamfarm) remove the link outright. I don't think it's worth the time to fix old links, given the average lifespan of a website is 2½ years and trying to automate the detection of link rot is a fool's errand (a page that goes 404 or does not respond is easy; now handle the case where it's a new company running the site and all old links still go to a page, but not the page that you linked to). I'm also beginning to think it's not worth linking at all, but old habits die hard.
I maintain a list of tags for each entry. It's a manual process (I type the tags for each entry as I'm writing it) and it's pretty much free-form. So free-form that I currently have 9,597 unique tags, which means I have nearly two unique tags per entry. And despite that, I still have trouble finding entries I know I wrote. The tags are almost never what I want in the future, but I just don't know what tag I think I'll need in the future as I'm writing the entry.

For instance, it took me a ludicrously long time to locate this entry because I knew I used the phrase “belaboring the inanimate equus pleonastically” somewhere on the blog, but the tags were useless. The tags for that particular entry were “control panels,” “rants,” and “Unix administration.” The tags “inanimate equus” or “pleonastically” never appeared (although that is being rectified right now in this post). I don't think this issue is actually solvable.
Writing entries still takes longer than I always expect, and it's still not uncommon for me to visit around two dozen web sites to gather information and links per entry (which I'm not sure are worth it anymore, per the point above). I've tried to make this easier over the years, but I'm still not quite happy with how long it takes to write an entry.
I find the following amusing:

Popularity of the various feeds of my blog over the past month
Format	#requests last month
JSON	4,093
RSS	3,691
Gopher	3,305
Atom	1,458

Gopher is surprisingly popular.

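Going back to the link-rot point in the list above, the "easy" half of detection can be sketched roughly as follows. The function names and the verdict labels are hypothetical, chosen only for illustration. Note what the sketch cannot do: a 2xx response is reported as "unknown", not "fine", because a page coming back says nothing about whether it is still the page that was linked to.

```python
import urllib.error
import urllib.request


def classify(status):
    """Map an HTTP status (or None for no response) to a rough link-rot verdict."""
    if status is None:
        return "dead"      # nothing answered at all
    if status in (404, 410):
        return "dead"      # the server says the page is gone
    if 200 <= status < 300:
        return "unknown"   # some page came back, maybe not the one linked to
    return "suspect"       # other errors may be temporary


def fetch_status(url, timeout=10):
    """Return the HTTP status after redirects, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code    # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None        # DNS failure, refused connection, timeout, ...
```

A check would then be `classify(fetch_status(url))`. Telling a repurposed site apart from the original would need something like a stored copy of the old page to compare against, which is exactly the part that resists automation.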

So, twenty years of a blog. Not many blogs can say they've been around that long. A few (like Flutterby or Jason Kottke) but not many.

And here's to at least twenty more.

The Boston Diaries
