Why adding crypto to gopher isn't that easy

Sunday, March 31, 2019

I'm talking about the fact that my hypothetical new protocol operated strictly over TLS connections - plaintext straight up wasn't an option. I am of the opinion that the widespread lack of encryption in gopherspace today is the protocol's biggest shortcoming, and I actually suspect that this point alone discourages some folks who would otherwise be on board from adopting it. …

We only need to add ubiquitous encryption to gopher to end up with the best of both worlds!

Now, let me be clear exactly what I mean by "adding encryption to gopher". I don't want to advocate anybody serving anything on port 70 which isn't backward compatible with standard gopher, because that would be a tragedy for the gopher community. And I also don't want plaintext gopher to disappear entirely, because it's great that something like gopher exists which can be utilised on 40 year old machines which are too slow to do effective crypto. What I would like is to see something new which is basically "gopher plus crypto, maybe a little more" appear alongside the existing options. Something which could be thought of as a "souped up gopher" or as a "stripped down web", depending on your perspective. Something which meant people weren't forced to choose between two non-overlapping sets of massive and obvious shortcomings but could just USE the internet for sharing static content in a non-awful way - whether that static content is "just" phlog posts, ASCII art or old zines, or whether it's serious political dissent, cypherpunk activism, sexually explicit writing or non-trivial free software development.

Why gopher needs crypto

As I mentioned before, adding TLS to gopher is relatively straightforward, as it can be slotted in between the TCP layer and the gopher layer on both clients and servers. That's the trivial part. The next step is to register the gophers: URI scheme and a default TCP port number for it. This, too, is trivial (it just needs to be done once by somebody).
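To show how little is involved on the client side, here is a minimal sketch in Python. It is only an illustration under assumptions: the port number 7070 is made up (no port is registered for gophers:), the function names are mine, and encoding the selector as UTF-8 is a guess rather than anything the protocol mandates.

```python
import socket
import ssl

# Hypothetical default port for "gophers:"; nothing is actually registered.
GOPHERS_PORT = 7070


def build_request(selector: str) -> bytes:
    """A gopher request is just the selector followed by CRLF."""
    return selector.encode("utf-8") + b"\r\n"


def fetch(host: str, selector: str, port: int = GOPHERS_PORT,
          use_tls: bool = True, timeout: float = 10.0) -> bytes:
    """Fetch one gopher item, optionally wrapping the TCP socket in TLS."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        if use_tls:
            # TLS sits between TCP and gopher; the gopher exchange is unchanged.
            context = ssl.create_default_context()
            conn = context.wrap_socket(raw, server_hostname=host)
        else:
            conn = raw
    except Exception:
        raw.close()
        raise
    with conn:
        conn.sendall(build_request(selector))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks)
```

The only difference between the plain and the TLS path is the call to wrap_socket(); everything the gopher layer sends and receives stays the same.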

What's not so trivial is shoehorning “secure gopher” into the gopher index file. There's no real place for it. The “gopher index” file is a machine readable file that indicates the contents of a gopher server, and it looks something like this (HT stands for a horizontal tab, CRLF for a carriage return and line feed):

1This is a pointer to another siteHTthis-selects-the-fileHTgopher.conman.orgHT70CRLF
0About this siteHTabout-site.txtHTexample.comHT70CRLF
gOur LogoHTlogo.gifHTexample.comHT70CRLF
IOur office spaceHToffice.jpgHTexample.comHT70CRLF

The first character of each line indicates what to expect when retrieving the data: a “1” is a gopher index file (an example of which you see above), a “0” is a text file, a “g” is a GIF file and an “I” is any other image type. This is followed by a human readable description meant to be displayed, then a “selector,” then the hostname and the port number. There is nothing there to indicate “use TLS.” A flag could be added past the port value to indicate “use TLS when retrieving this data” but:

old gopher clients won't see the flag (most will just ignore it) and will try to connect without TLS; at best the client errors out, and at worst it crashes;
it breaks gopher clients that actually use the enhanced gopher protocol; they too will error out at best, or crash at worst.
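Both failure modes can be sketched with a small index-line parser. This is an illustration, not code from any real client: the names are mine, and the trailing “TLS” field is a hypothetical flag, not part of any gopher specification.

```python
from typing import NamedTuple


class GopherItem(NamedTuple):
    item_type: str   # first character of the line: "0", "1", "g", "I", ...
    display: str     # human readable description
    selector: str
    host: str
    port: int


def parse_lenient(line: str) -> GopherItem:
    """Read the first four tab-separated fields; ignore anything past the port."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError(f"malformed index line: {line!r}")
    head, selector, host, port = fields[:4]
    return GopherItem(head[0], head[1:], selector, host, int(port))


def parse_strict(line: str) -> GopherItem:
    """Reject any line that does not have exactly four fields."""
    if line.rstrip("\r\n").count("\t") != 3:
        raise ValueError(f"unexpected field count: {line!r}")
    return parse_lenient(line)


plain = "0About this site\tabout-site.txt\texample.com\t70\r\n"
# The trailing "TLS" field is a made-up flag, used here for illustration only.
flagged = "0About this site\tabout-site.txt\texample.com\t70\tTLS\r\n"
```

A client built like parse_lenient() returns the same item for flagged as for plain, so it never learns that TLS was requested and goes on to speak plaintext to a TLS server. A client built like parse_strict() raises ValueError on the flagged line and cannot use the entry at all.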

One solution is to just say “okay, port 70 is plain text, any other port is TLS” but again, we're back to the problem with older gopher clients that don't understand this. Another solution that would work is to assign new “gopher types” (the “0”, “g”, “I”, “1” etc.) for a TLS connection. That just involves picking about two dozen new characters. Old gopher clients will ignore “types” they don't understand and new clients can use TLS. Unfortunately, it does mean that information about the connection type leaks out. HTTP doesn't have this problem because the http: (and https:) URI does not include what the link is, unlike the gopher: URI (or the linking information in the gopher index file) which does include what the link is.
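The “new types” option might look something like the sketch below. The characters “A” through “D” are invented for this example (nobody has assigned them), and only four of the roughly two dozen types are shown.

```python
# Types an existing client already understands (a small subset).
PLAIN_TYPES = {"0", "1", "g", "I"}

# Hypothetical TLS variants mapped back to the plain type they mirror.
# These characters are made up for illustration; none are registered.
TLS_TYPES = {"A": "0", "B": "1", "C": "g", "D": "I"}


def classify(item_type: str, tls_aware: bool):
    """Return (base_type, use_tls), or None if the client should skip the entry."""
    if item_type in PLAIN_TYPES:
        return (item_type, False)
    if tls_aware and item_type in TLS_TYPES:
        return (TLS_TYPES[item_type], True)
    # Unknown type: an old client ignores the entry instead of breaking.
    return None
```

An old client (tls_aware=False) simply skips an “A” entry, while a new client treats it as a text file to be fetched over TLS. The cost is visible in the table itself: every content type now needs a second character whose only job is to describe the connection.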

So I think it comes down to picking your poison:

potentially breaking old clients;
doubling the number of “types” to support;
or even a new type of protocol entirely, but then you start falling into HTTP territory …

Update on Tuesday, September 28th, 2021

There might be a fourth way, but I think it's a hack.

Update on Monday, Debtember 6th, 2021

There are more than four ways to do this, but I don't think any are worth implementing.

The Boston Diaries