The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, Debtember 08, 2023

The Gopher Situation, part II Unicode Booglaloo

The lead I thought I had was a red herring. I thought it may have had something to do with reporting errors back to the client as seen from the logs:

Dec 04 21:44:38	daemon	info	71.19.142.20	gopher	maxco=1 runco=0 toq=0 sc=1 mem=3012577
Dec 04 21:46:23	daemon	err	71.19.142.20	gopher	stat("/home/spc/gopher/share/MGLNDD_71.19.142.20_70") = No such file or directory
Dec 04 21:46:23	daemon	info	71.19.142.20	gopher	remote=XXXXX­XXXXX­XXXXX status=false request="MGLNDD_71.19.142.20_70" bytes=68
Dec 04 21:49:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=2903218
Dec 04 21:50:52	daemon	info	71.19.142.20	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="CONNECT api64.ipify.org:443 HTTP/1.1" bytes=562
Dec 04 21:54:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3035838

Notice how maxco (total number of coroutines) increments and stays that way after a failed request. And the evidence was pretty convincing too:

Dec 04 22:19:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3411531
Dec 04 22:23:44	daemon	info	71.19.142.20	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="Phlog:2010/03/08" bytes=189
Dec 04 22:24:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3185119
Dec 04 22:24:39	daemon	info	71.19.142.20	gopher	remote=XXXXX­XXXXX­XXXXX status=false request="\3\0\0/*?\0\0\0\0\0Cookie: mstshash=Administr" bytes=82
Dec 04 22:25:57	daemon	info	71.19.142.20	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="Phlog:2006/11/19.1" bytes=1028
Dec 04 22:29:38	daemon	info	71.19.142.20	gopher	maxco=3 runco=0 toq=0 sc=1 mem=3242133
Dec 04 22:33:31	daemon	info	71.19.142.20	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="Phlog:" bytes=978
Dec 04 22:34:38	daemon	info	71.19.142.20	gopher	maxco=3 runco=0 toq=0 sc=1 mem=3207881

Error reporting with the gopher protocol is clearly an afterthought. The official RFC has two occurances of the word “error” in it—and one of them is redundant. I did read somewhere (that's difficult to find now) that perhaps gopher should simple close the connection upon an error instead of sending an “error” to the client, so I thought I would try that. Instead of sending:

3Selector not foundHTfooHTgopher.conman.orgHT70CRLF

I would just close the connection.

That didn't work. The gopher server was still getting stuck. Attaching gdb to the stuck process didn't show anything, as the Lua executable I was using didn't have debugging symbols. So then I recompiled Lua and the modules used that were written in C to include debugging information and restarted the server, yet again.

So now I think I think I found the root issue. Attaching gdb this time showed the server was stuck in LPEG. Even better, I could see the text it was trying to parse and well … previously I said, “[t]he code hasn't changed since April.” That's not quite true. The server code hadn't changed since April, but an extension had! Back in late October I modified the code that renders my blog on gopher to use Unicode combining characters to do some typographical tricks, and it seems that the code used to wrap the text just … wasn't up to par (Unicode is hard! Let's go to Mars!).

I also noticed that my Gemini server had finally crashed—hard. And I changed that too, to use Unicode typographical tricks. So out it comes!

Let's see if this was the problem.

Update on Tuesday, December 19th, 2023

It totally was the problem.

Obligatory Picture

An abstract representation of where you're coming from]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.