The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Saturday, April 30, 2022

Musings on processing malformed Gemini (and web) requests

I'm still bothered by Gemini requests like gemini://gemini.conman.org//boston/2015/10/17.2. I thought it might be a simple bug but now I'm not so sure. There's a client out there that has made 1,070 such requests, and if that was all, or even most, of the requests, then yes, that's probably a simple bug. But it's not. It turns out that only 4% of the requests from said client are malformed in that way, which to me indicates that something out there might be generating such links (and for this case, I checked and I don't think I'm the cause this time).

I decided to see what happens on the web. I poked a few web sites with similar “double slash” requests and I got mixed results. Most of the sites just accepted them as is and served up a page. The only site that seemed to have issues with it was Hacker News, and I'm not sure what status it returned since it's difficult to obtain the status codes from browsers.
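
Since browsers make the status code hard to see, a small script can fetch it directly. What follows is a minimal sketch, not what was actually used for the poking above; the function name `status_for` and its parameters are made up for illustration. It uses Python's standard `http.client`, which sends the path as given rather than tidying it up first.

```python
import http.client

def status_for(host, path, port=443, use_tls=True):
    """Send a GET for `path` exactly as written and return the numeric status.

    `status_for` is a hypothetical helper; the path is not normalized,
    so a leading "//" reaches the server untouched.
    """
    if use_tls:
        conn = http.client.HTTPSConnection(host, port, timeout=10)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path, headers={"User-Agent": "double-slash-probe"})
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `status_for("example.com", "//some/page")` (a placeholder host and path) would then return whatever status that server chooses to send for the double-slash form.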

So, I have a few options.

  1. I can keep the current code and always reject such requests. In my mind, such requests have no meaning and are malformed, so why shouldn't I just reject them?
  2. I can send a permanent redirection to the “proper” location. This has the upside of maintaining a canonical link to each page, but with the downside of forcing clients through an additional request, and me having to live with the redundant requests in the log files. But it's obvious what resource is being requested, and sending a permanent redirect informs the client of the proper location.
  3. I can just silently clean up the request and carry on. The upside: clean logs with only one request. The downside: two (or more) valid locations for content. On the one hand, this just feels wrong to me, as technically speaking, /foo and //foo should be different resources (as per Uniform Resource Identifier: Generic Syntax, /foo and /foo/ are technically different resources, so why not this case?). On the other hand, this issue is generally ignored by most web servers out there anyway, so there's that precedent. On the gripping hand, doing this just seems like a cop-out and blindly following what the web does.
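
The three options can be put side by side in a rough sketch. This is not the server's actual code; it is an illustration in Python, and the names `collapse_slashes`, `handle`, and the policy strings are all hypothetical. The status numbers are the Gemini ones: 20 for success, 31 for a permanent redirect, 59 for a bad request. The second value returned is just a stand-in for what the server would act on (a path to serve, a redirect target, or an error message).

```python
import re
from urllib.parse import urlsplit, urlunsplit

def collapse_slashes(path):
    """Collapse every run of two or more slashes into a single slash."""
    return re.sub(r"/{2,}", "/", path)

def handle(url, policy):
    """Return (status, detail) for a request URL under one of three policies.

    policy is "reject", "redirect", or "clean" (names invented for this sketch).
    """
    parts = urlsplit(url)
    clean = collapse_slashes(parts.path)
    if clean == parts.path:
        return 20, parts.path            # no repeated slashes, serve as usual
    if policy == "reject":
        return 59, "bad request"         # option 1: refuse the request
    if policy == "redirect":
        # option 2: point the client at the canonical location
        return 31, urlunsplit(parts._replace(path=clean))
    if policy == "clean":
        return 20, clean                 # option 3: quietly serve the cleaned path
    raise ValueError("unknown policy: " + policy)
```

Note that under "redirect" the log sees two requests per malformed link, while under "clean" it sees one, which is the trade-off described in options 2 and 3.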

Well, how do current Gemini servers deal with it? Pretty much like existing web servers: most just treat multiple slashes as a single slash. I think my server is the outlier here. Now the question is: how pedantic do I want to be? Is “good enough” better than “perfect”?

Perhaps a better question is—why am I worrying about this anyway?

Obligatory Picture

An abstract representation of where you're coming from

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.