Sunday, March 31, 2019
Why adding crypto to gopher isn't that easy
I'm talking about the fact that my hypothetical new protocol operated strictly over TLS connections - plaintext straight up wasn't an option. I am of the opinion that the widespread lack of encryption in gopherspace today is the protocol's biggest shortcoming, and I actually suspect that this point alone discourages some folks who would otherwise be on board from adopting it. …
We only need to add ubiquitous encryption to gopher to end up with the best of both worlds!
Now, let me be clear exactly what I mean by "adding encryption to gopher". I don't want to advocate anybody serving anything on port 70 which isn't backward compatible with standard gopher, because that would be a tragedy for the gopher community. And I also don't want plaintext gopher to disappear entirely, because it's great that something like gopher exists which can be utilised on 40 year old machines which are too slow to do effective crypto. What I would like is to see something new which is basically "gopher plus crypto, maybe a little more" appear alongside the existing options. Something which could be thought of as a "souped up gopher" or as a "stripped down web", depending on your perspective. Something which meant people weren't forced to choose between two non-overlapping sets of massive and obvious shortcomings but could just USE the internet for sharing static content in a non-awful way - whether that static content is "just" phlog posts, ASCII art or old zines, or whether it's serious political dissent, cypherpunk activism, sexually explicit writing or non-trivial free software development.
As I mentioned before, adding TLS to gopher is relatively straightforward, since it slots in between the TCP layer and the gopher layer on both clients and servers. That's the trivial part. The next step is to register the gophers: URI scheme, and to register a default TCP port number for the gophers: URI. This, too, is trivial (it just needs to be done once by somebody).
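To make the "trivial part" concrete, here is a minimal sketch in Python of a client that optionally wraps the TCP socket in TLS before speaking gopher. The function names are my own, the UTF-8 encoding of the selector is an assumption, and the caller has to supply the port because no default port has been registered for gophers:.

```python
import socket
import ssl


def build_request(selector: str) -> bytes:
    """A gopher request is just the selector followed by CRLF."""
    # Assumption: the selector is encoded as UTF-8.
    return selector.encode("utf-8") + b"\r\n"


def fetch(host: str, port: int, selector: str, use_tls: bool) -> bytes:
    """Fetch one gopher item, optionally wrapping the TCP socket in TLS."""
    sock = socket.create_connection((host, port), timeout=10)
    try:
        if use_tls:
            # TLS sits between TCP and gopher; the gopher exchange is unchanged.
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
        sock.sendall(build_request(selector))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        sock.close()
```

Note that the only difference between the two modes is the `wrap_socket` call. The hard part is that the client has to be told which mode to use, and that is where the index file comes in.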
What's not so trivial is shoehorning the “secure gopher” into the gopher index file. There's no real place for it. The “gopher index” file is a machine readable file that indicates the contents of a gopher server and it looks something like:
1This is a pointer to another siteHTthis-selects-the-fileHTgopher.conman.orgHT70CRLF
0About this siteHTabout-site.txtHTexample.comHT70CRLF
gOur LogoHTlogo.gifHTexample.comHT70CRLF
IOur office spaceHToffice.jpgHTexample.comHT70CRLF
The first character of each line indicates what to expect when retrieving the data: a “1” is a gopher index file (an example of which you see above), a “0” is a text file, a “g” is a GIF file, and an “I” is any other image type. This is followed by a human readable description meant to be displayed, followed by a “selector,” followed by the hostname and port number. There is nothing there to indicate “use TLS.” A flag could be added past the port value to indicate “use TLS when retrieving this data” but:
- old gopher clients won't see the flag (most will just ignore it) and will try to connect without TLS; at best the client errors out, and at worst it crashes;
- it breaks gopher clients that actually use the enhanced gopher protocol; they will error out at best, or crash at worst.
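A rough sketch of what parsing such a line might look like, including the extra flag, is below. The `MenuItem` type, the function names, and the literal flag value `TLS` are all hypothetical; nothing like this flag is defined anywhere.

```python
from typing import NamedTuple, Optional


class MenuItem(NamedTuple):
    item_type: str
    display: str
    selector: str
    host: str
    port: int
    extra: Optional[str]  # anything past the port; None if absent


def parse_menu_line(line: str) -> MenuItem:
    """Split one gopher index line into its tab-separated fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError("not a gopher index line")
    # The type is the first character; the rest of that field is the description.
    item_type, display = fields[0][0], fields[0][1:]
    extra = fields[4] if len(fields) > 4 else None
    return MenuItem(item_type, display, fields[1], fields[2], int(fields[3]), extra)


def wants_tls(item: MenuItem) -> bool:
    """Hypothetical: treat a fifth field of "TLS" as the flag."""
    return item.extra == "TLS"
```

A client that never looks at `extra` behaves like the old clients in the first bullet: it parses the line without complaint and then connects in plaintext to a server that expects TLS.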
One solution is to just say “okay, port 70 is plain text, any other port is TLS,” but again, we're back to the problem with older gopher clients that don't understand this. Another solution that would work is to assign new “gopher types” (the “0”, “g”, “I”, “1”, etc.) for a TLS connection. That just involves picking about two dozen new characters. Old gopher clients will ignore “types” they don't understand and new clients can use TLS. Unfortunately, it does mean that information about the connection type leaks out. HTTP doesn't have this problem because the http: (and https:) URI does not include what the link is, unlike the gopher: URI (or the linking information in the gopher index file), which does include what the link is.
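Both rules are easy enough to sketch in Python. The “secure” type characters below are invented purely for illustration (they are not assigned anywhere, and may well collide with types already in use), and the function names are my own.

```python
# Hypothetical mapping from invented "secure" type characters to the
# standard types they would shadow. None of these are assigned anywhere.
SECURE_TYPES = {
    "A": "0",  # text file over TLS
    "B": "1",  # gopher index over TLS
    "C": "g",  # GIF over TLS
    "D": "I",  # other image over TLS
}

# Only the four types from the example index; a real client knows more.
KNOWN_TYPES = {"0", "1", "g", "I"}


def resolve_type(item_type: str):
    """Return (standard_type, use_tls), or None for a type to skip."""
    if item_type in SECURE_TYPES:
        return SECURE_TYPES[item_type], True
    if item_type in KNOWN_TYPES:
        return item_type, False
    return None  # unknown types are ignored, as old clients do


def tls_by_port(port: int) -> bool:
    """The alternative rule: port 70 is plaintext, anything else is TLS."""
    return port != 70
```

An old client only has the second branch of `resolve_type`, so the new types fall through and get skipped, which is the behavior this approach relies on. The cost is visible in the table itself: every type needs a twin.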
So I think it comes down to picking your poison:
- potentially breaking old clients;
- doubling the number of “types” to support;
- or even a new type of protocol entirely, but then you start falling into HTTP territory …
Update on Tuesday, September 28TH, 2021
There might be a fourth way, but I think it's a hack.
Update on Monday, Debtember 6TH, 2021
There are more than four ways to do this, but I don't think any are worth implementing.