The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Tuesday, October 01, 2019

It only took 35 years …

The first accurate portrayal of a black hole in Hollywood was in the 2014 movie “Interstellar” with help from theoretical physicist Kip Thorne, and the images from that movie do appear to match reality. But I find it fascinating that astrophysicist Jean-Pierre Luminet generated an image of a black hole in April of 1979!

It's sad to think that Disney's “The Black Hole,” which came out in December of 1979, could have been not only the first Hollywood portrayal of a black hole (which it appears it was), but also an accurate portrayal of a black hole. Ah well …

Wednesday, October 02, 2019

“Night of the Lepus” was based on a book‽

I'm going to lunch with a few cow-orkers and ST is driving. While driving, we're subject to his music listening choices, which tend towards movie and video game scores. As a joke, I mention that he's playing the score to “Night of the Lepus” and to my total surprise, no one else in the vehicle had ever heard of the movie.

So of course I start reading off the plot synopsis from Wikipedia and I'm amazed to learn that it's based on a book! “Night of the Lepus” was originally a book! I then switch to reading the plot synopsis of The Year of the Angry Rabbit and … it sounds amazing! An attempt to eradicate rabbits in Australia leads to world peace through an inadvertent doomsday weapon with occasional outbreaks of killer rabbits.


Why wasn't that movie made?

Friday, October 04, 2019

Back when I was a kid, all I had to worry about was the mass extinction of the human race due to global thermonuclear war

Bunny and I are out eating dinner at T. B. O. McFlynnagin's and out of the corner of my eye on one of the ubiquitous televisions dotting the place, I saw what appeared to be a “back to school” type commercial but one that turned … dark. I'm normally not one for trigger warnings, but this commercial, which did air because I saw it, is quite graphic. So … you have been warned!

It reminds me of the “Daisy” commercial, although it's hard to say which one is worse. Perhaps both of them are.

It's a stupid benchmark about compiling a million lines of code, what else did I expect?

I came across a claim that the V programming language can compile 1.2 million lines of code per second. Then I found out that the code was pretty much just 1,200,000 calls to println('hello world'). Still, I was interested in seeing how GCC would fare. So I coded up this:

#include <stdio.h>

int main(void)
{
  printf("Hello world!\n");
  /* 1,199,998 more calls to printf() */
  printf("Hello world!\n");
  return 0;
}

which ends up being 33M, and …

[spc]lucy:/tmp>time gcc h.c
gcc: Internal error: Segmentation fault (program cc1)
Please submit a full bug report.
See <URL:> for instructions.

real    14m36.527s
user    0m40.282s
sys     0m17.497s

Fourteen minutes for GCC to figure out I didn't have enough memory on the 32-bit system to compile it (and the resulting core file exceeded physical memory by three times). I then tried on a 64-bit system with a bit more memory, and I fared a bit better:

[spc]saltmine:/tmp>time gcc h.c

real    7m37.555s
user    2m3.000s
sys     1m23.353s

This time I got a 12M executable in 7½ minutes, which seems a bit long to me for such a simple (but large) program. I mean, Lua was able to compile an 83M script in 6 minutes, on the same 32-bit system as above, and that was considered a bug!

But I used GCC, which does some optimizations by default. Perhaps if I try no optimization?

[spc]saltmine:/tmp>time gcc -O0 h.c

real    7m6.939s
user    2m2.972s
sys     1m27.237s

Wow. A whole 30 seconds faster. Way to go, GCC! Woot!
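
For anyone wanting to repeat the experiment, the test file doesn't have to be typed in by hand. A small generator along these lines would produce it. This is only a sketch: write_hello_source() is a made-up name, and it isn't necessarily how h.c was actually created.

```c
#include <stdio.h>

/* Write a C source file containing `count` printf() calls to `out`.
 * Returns 0 on success, -1 if any write failed. */
int write_hello_source(FILE *out, unsigned long count)
{
  unsigned long i;

  if (fputs("#include <stdio.h>\n\nint main(void)\n{\n", out) == EOF)
    return -1;

  /* One line per call, matching the layout of the listing above. */
  for (i = 0; i < count; i++)
    if (fputs("  printf(\"Hello world!\\n\");\n", out) == EOF)
      return -1;

  if (fputs("  return 0;\n}\n", out) == EOF)
    return -1;

  return ferror(out) ? -1 : 0;
}
```

Calling write_hello_source(stdout, 1200000) from a main() and redirecting the output to h.c gives a file with 1.2 million printf() calls.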

Saturday, October 05, 2019

More stupid benchmarks about compiling a million lines of code

I'm looking at the code GCC produced for the 32-bit system (I cut down the number of lines of code):

 804836b:       68 ac 8e 04 08          push   0x8048eac
 8048370:       e8 2b ff ff ff          call   80482a0 <puts@plt>
 8048375:       68 ac 8e 04 08          push   0x8048eac
 804837a:       e8 21 ff ff ff          call   80482a0 <puts@plt>
 804837f:       68 ac 8e 04 08          push   0x8048eac
 8048384:       e8 17 ff ff ff          call   80482a0 <puts@plt>
 8048389:       68 ac 8e 04 08          push   0x8048eac
 804838e:       e8 0d ff ff ff          call   80482a0 <puts@plt>
 8048393:       68 ac 8e 04 08          push   0x8048eac
 8048398:       e8 03 ff ff ff          call   80482a0 <puts@plt>
 804839d:       68 ac 8e 04 08          push   0x8048eac
 80483a2:       e8 f9 fe ff ff          call   80482a0 <puts@plt>
 80483a7:       68 ac 8e 04 08          push   0x8048eac
 80483ac:       e8 ef fe ff ff          call   80482a0 <puts@plt>
 80483b1:       68 ac 8e 04 08          push   0x8048eac
 80483b6:       e8 e5 fe ff ff          call   80482a0 <puts@plt>
 80483bb:       83 c4 20                add    esp,0x20

My initial thought was “Why doesn't GCC just push the address once?” but then I remembered that in C, function parameters can be modified. But that led me down a slight rabbit hole in seeing if printf() (with my particular version of GCC) even changes the parameters. It turns out that no, they don't change (your mileage may vary though). So with that in mind, I wrote the following assembly code:

        bits    32
        global  main
        extern  printf

        section .rodata

msg:            db      'Hello, world!',10,0

        section .text

main:           push    msg
                call    printf
        ;; 1,199,998 more calls to printf
                call    printf
                pop     eax
                xor     eax,eax
                ret

Yes, I cheated a bit by not repeatedly pushing and popping the stack. But I was also interested in seeing how well nasm fares compiling 1.2 million lines of code. Not too badly, compared to GCC:

[spc]lucy:/tmp>time nasm -f elf32 -o pg.o pg.a

real    0m38.018s
user    0m37.821s
sys     0m0.199s

I don't even need to generate a 17M assembly file though, nasm can do the repetition for me:

        bits    32
        global  main
        extern  printf

        section .rodata

msg:            db      'Hello, world!',10,0

        section .text

main:           push    msg
        %rep 1200000
                call    printf
        %endrep

                pop     eax
                xor     eax,eax
                ret

It can skip reading 16,799,971 bytes and assemble the entire thing in 25 seconds:

[spc]lucy:/tmp>time nasm -f elf32 -o pf.o pf.a

real    0m24.830s
user    0m24.677s
sys     0m0.144s

Nice. But then I was curious about Lua. So I generated 1.2 million lines of Lua:

print("Hello, world!")
-- 1,199,998 more calls to print()
print("Hello, world!")

And timed how long it took Lua to load (but not run) the 1.2 million lines of code:

[spc]lucy:/tmp>time lua zz.lua
function: 0x9c36838

real    0m1.666s
user    0m1.614s
sys     0m0.053s


Monday, October 07, 2019

I was working harder, not smarter

Another department at the Ft. Lauderdale Office of the Corporation is refactoring their code. Normally this wouldn't affect other groups, but this particular code requires some executables we produce, and to make it easier to install, we, or rather, I, needed to create a new repository with just these executables.

Easier said than done.

There's about a dozen small utilities, each a single C file, but unfortunately, to get the banana (the single C file) you also need the 800 pound gorilla (its dependencies). Also, these executables are spread through most of our projects: there's a few for “Project: Wolowizard” (which is also used for “Project: Sippy-Cup”), multiple ones for “Project: Lumbergh,” a few for “Project: Cleese” and … oh, I never even talked about this other project, so let's just call it “Project: Clean-Socks.”


So that's how I spent my time last week, working on “Project: Seymore,” rewriting a dozen small utilities to remove the 800 pounds of gorilla normally required to compile these tools. All these utilities do is transform data from format A to format B. The critical ones take a text file of lines usually in the form of “A = B” but there was one that took over a day to complete because of the input format:

A = B:foo,bar,... name1="value" name2="value" ...
A = B:none

Oh, writing parsing code in C is so much fun! And as I was stuck writing this, I kept thinking just how much easier this would be with LPEG. But alas, I wanted to keep the dependencies to a minimum, so it was just grind, grind, grind until it was done.
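
To give a flavor of the grind, here is roughly what just the simple “A = B” case looks like in plain C with no dependencies. This is a minimal sketch and not the code from the actual utilities: parse_pair() and copy_trimmed() are hypothetical names, and the longer format with the comma list and the name="value" attributes would need a good deal more than this.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the text in [start,end) into dst with surrounding whitespace
 * removed.  Returns 0 on success, -1 if the result is empty or does
 * not fit into dst. */
static int copy_trimmed(char *dst, size_t dstsize, const char *start, const char *end)
{
  size_t len;

  while (start < end && isspace((unsigned char)*start))
    start++;
  while (end > start && isspace((unsigned char)end[-1]))
    end--;

  len = (size_t)(end - start);
  if (len == 0 || len >= dstsize)
    return -1;

  memcpy(dst, start, len);
  dst[len] = '\0';
  return 0;
}

/* Split a line of the form "A = B" at the first '='.
 * Returns 0 on success, -1 on a malformed line. */
int parse_pair(const char *line, char *key, size_t keysize, char *val, size_t valsize)
{
  const char *eq = strchr(line, '=');

  if (eq == NULL)
    return -1;
  if (copy_trimmed(key, keysize, line, eq) < 0)
    return -1;
  return copy_trimmed(val, valsize, eq + 1, eq + 1 + strlen(eq + 1));
}
```

Given the line “A = B:none” this yields the key “A” and the value “B:none”; picking the value apart any further is where the hand-written code starts to pile up.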

Then today, I found that I had installed peg/leg, the recursive-descent parser generator for C, on my work machine eight years ago.

Eight years ago!

Head, meet desk.

Including the time to upgrade peg/leg, rewriting the utility that had taken me nearly two days took only two hours (most of the code among most of the utilities is the same: check options, open files, sort the data, remove duplicates, write the data; it's only the portion that reads and converts the data that differs). It's also shorter, and I think easier to modify.

So memo to self: before diving into a project, check to see if I already have the right tools installed.


Tool selection

So if I needed to parse data in C, why did I not use lex? It's pretty much standard on all Unix systems, right? Yes, but all it does is lexical analysis. The job of parsing requires the use of yacc. So why didn't I use yacc? Because it doesn't do lexical analysis. If I use lex, I also need to use yacc. Why use two tools when one will suffice? They are also both a pain to use, so it's not like I immediately think to use them (that, and the last time I used lex in anger was over twenty years ago …)

Sunday, October 13, 2019

How many redirects does your browser follow?

An observation on the Gemini mailing list led me down a very small rabbit hole. I recalled at one time that a web browser was only supposed to follow five consecutive redirects, and sure enough, in RFC-2068:

10.3 Redirection 3xx

This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request. The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD. A user agent SHOULD NOT automatically redirect a request more than 5 times, since such redirections usually indicate an infinite loop.

Hypertext Transfer Protocol -- HTTP/1.1

But that's an old standard from 1997. In fact, the next revision, RFC-2616, updated this section:

10.3 Redirection 3xx

This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request. The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD. A client SHOULD detect infinite redirection loops, since such loops generate network traffic for each redirection.

Note: previous versions of this specification recommended a maximum of five redirections. Content developers should be aware that there might be clients that implement such a fixed limitation.

Hypertext Transfer Protocol -- HTTP/1.1

And subsequent updates have kept that language. So it appears that a client SHOULD NOT (using language from RFC-2119) limit itself to just five redirects, but still SHOULD detect loops. It seems like this was changed due to market pressure from various companies and I think the practical limit has gone up over the years.

I know the browser I use, Firefox, is highly configurable, and I decided to see if its configuration included a way to limit redirections. And lo', it does! The option network.http.redirection-limit exists, and the current default value is “20”. I'm curious to see what happens if I set that to “5”. I wonder how many sites will break?
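
The two behaviors the RFCs describe, a fixed cap on redirects and detection of loops, are easy to sketch side by side. The following is an illustration only and has nothing to do with how Firefox implements its limit: the redirect table stands in for a real server, and follow_redirects(), struct redirect and the two error codes are names made up for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HOPS 64   /* hard cap on how many visited URLs are remembered */

enum { REDIRECT_LOOP = -1, REDIRECT_LIMIT = -2 };

/* One entry in a pretend server: a request for `from` redirects to `to`. */
struct redirect
{
  const char *from;
  const char *to;
};

/* Return the redirect target for `url`, or NULL if it isn't redirected. */
static const char *lookup(const struct redirect *table, size_t n, const char *url)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp(table[i].from, url) == 0)
      return table[i].to;
  return NULL;
}

/* Follow redirects starting at `url`.  Returns the number of redirects
 * followed and stores the final URL in *final, or returns REDIRECT_LOOP
 * if a URL repeats, or REDIRECT_LIMIT if more than `limit` redirects
 * would be needed. */
int follow_redirects(const struct redirect *table, size_t n,
                     const char *url, int limit, const char **final)
{
  const char *seen[MAX_HOPS];
  const char *next;
  int         hops = 0;
  int         i;

  if (limit > MAX_HOPS)
    limit = MAX_HOPS;

  for (;;)
  {
    /* Loop detection: have we already been sent to this URL? */
    for (i = 0; i < hops; i++)
      if (strcmp(seen[i], url) == 0)
        return REDIRECT_LOOP;

    next = lookup(table, n, url);
    if (next == NULL)
    {
      *final = url;
      return hops;
    }

    /* Fixed limit: refuse to follow one more redirect. */
    if (hops >= limit)
      return REDIRECT_LIMIT;

    seen[hops++] = url;
    url = next;
  }
}
```

With a chain of six redirects, a limit of 20 reaches the final page while a limit of 5 gives up one hop short, which is the kind of breakage a lowered limit could cause on sites that bounce visitors around a lot.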

Thursday, October 17, 2019

You know, we might as well just run every network service over HTTPS/2 and build another six layers on top of that to appease the OSI 7-layer burrito guys

I've seen the writing on the wall, and while for now you can configure Firefox not to use DoH, I'm not confident enough to think it will remain that way. To that end, I've finally set up my own DoH server for use at Chez Boca. It only involved setting up my own CA to generate the appropriate certificates, installing my CA certificate into Firefox, configuring Apache to run over HTTP/2 (THANK YOU SO VERY XXXXXXX MUCH GOOGLE FOR SHOVING THIS HTTP/2 XXXXXXXX DOWN OUR THROATS! No, I'm not bitter) and writing a 150 line script that just queries my own local DNS, because, you know, it's more XXXXXXX secure or some XXXXXXXX reason like that.


And then I had to reconfigure Firefox using the “advanced configuration page” to tweak the following:

Firefox configuration for DoH

variable                         value
network.trr.allow-rfc1918        true
network.trr.blacklist-duration   0
network.trr.confirmationNS       skip
network.trr.custom_uri           https://playground.local/cgi-bin/dns.cgi
network.trr.max-fails            15
network.trr.mode                 3
network.trr.request-timeout      3000
network.trr.uri                  https://playground.local/cgi-bin/dns.cgi

I set network.trr.mode to “3” instead of “2” because it's coming. I know it's just coming so I might as well get ahead of the curve.
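
The script itself isn't shown here, so the following is not taken from it. As an illustration of one small piece any DoH endpoint has to handle: RFC 8484 says that a GET request carries the DNS query in a dns= parameter, encoded as base64url with the padding left off (a POST carries the raw message instead). A minimal decoder for that encoding could look like this, with b64url_decode() being a name invented for the sketch:

```c
#include <stddef.h>

/* Map one base64url character to its 6-bit value, or -1 if invalid. */
static int b64url_value(int c)
{
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '-')             return 62;
  if (c == '_')             return 63;
  return -1;
}

/* Decode unpadded base64url text into `out`.  Returns the number of
 * bytes written, or -1 on invalid input or if `out` is too small. */
long b64url_decode(const char *in, size_t inlen, unsigned char *out, size_t outsize)
{
  unsigned long acc  = 0;   /* bits collected so far */
  int           bits = 0;   /* how many bits of acc are valid */
  size_t        n    = 0;
  size_t        i;

  /* A single leftover character can never encode a whole byte. */
  if (inlen % 4 == 1)
    return -1;

  for (i = 0; i < inlen; i++)
  {
    int v = b64url_value((unsigned char)in[i]);

    if (v < 0)
      return -1;

    acc   = (acc << 6) | (unsigned long)v;
    bits += 6;

    if (bits >= 8)
    {
      bits -= 8;
      if (n >= outsize)
        return -1;
      out[n++] = (unsigned char)((acc >> bits) & 0xFF);
      acc &= (1UL << bits) - 1;   /* keep only the bits not yet used */
    }
  }

  return (long)n;
}
```

The sample query in RFC 8484, AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB, decodes to a 33 byte DNS message asking for the A record of www.example.com, which can then be handed to a resolver as-is.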

Friday, October 18, 2019

A minor issue with DoH

So far, the DoH server I wrote works fine (looking over the logs, it's amazing just how many queries mainstream sites make; CNN's main page made requests to over 260 other sites and that's after I restricted the number of redirects allowed) except for Github. The browser would claim it couldn't find Github (although the logs said otherwise), or the page formatting was broken because the browser couldn't locate various other servers (which again the logs said otherwise).

So I dived in to figure out the issue. It turns out the DNS replies were just a tad bit larger than expected. The Lua wrapper I wrote for my DNS library used the RFC mandated limit for the message size, which these days, is proving to be a bit small (that particular RFC was written in 1987). The fix was trivial (increase the packet size) after the hour of investigation.

Tuesday, October 29, 2019

I thought computers exist to appease us, not for us to appease the computers

I got an email from the Corporation's Corporate Overlords' IT department about making sure my Windows laptop was on and logged into the Corporate Overlords' VPN so that mumble techspeak mumble technobabble blah blah whatever. Even if it's a phishing email that our Corporate Overlords so love to send us, it didn't have any links to click or ask for my credentials. I also need to turn on the computer at least once every three weeks to prevent the Corporate Overlords from thinking it's been stolen, so I figured it wouldn't hurt to turn it on, log in and reply to the email.

I turn it on, but I find I'm unable to get connected to the Corporation's wifi network which I need to do if I'm to log onto the Corporate Overlords' VPN. After many minutes of futzing around, I end up telling Windows to forget the Corporation's wifi network, re-select it from a list and re-enter my credentials (which have changed since the last time I logged in due to the outdated password practices still in use at The Corporation). Then I could log into the Corporate Overlords' VPN and reply to the email saying “go ahead mumble technospeak mumble technobabble blah blah whatever.”

Of course, the Corporation “change your password” period (which was triggered last week) is different from that of the Corporation's Corporate Overlords' “change your password” period (which was triggered today) so there was that nonsense to deal with.

Over the course of the next few hours, I had to restart the Windows laptop no less than five times to appease the Microsoft Gods, and twice more I had to tell the computer to forget the Corporation's wifi network before it got the hint and remembered my credentials.

Seriously, people actually use Windows? I'm lucky in that I had the MacBook to keep working.

We are all publishers now

[This gopher site] has had, from very early days, a policy which allows [users] to request that their account be removed and all their content immediately and permanently deleted. This is called "claiming your civil right", … The Orientation Guide explains:

This promise is not a gimmick … It is a recognition that the ability to delete your accounts from online services is an important part of self ownership of your digital identity. This is genuinely an important freedom and one which many modern online services do not offer, or deliberately make very difficult to access.

I have always been, and still am, proud that [this gopher server] offers this right so explicitly and unconditionally, and I have no plans to change it. I really think this an important thing.

And yet, it always breaks my heart a little when somebody actually claims their right, and it's especially tough when a large amount of high-quality gopherspace content disappears with them. As several people phlogged about noticing, kvothe recently chose to leave gopherspace, taking with him his wonderful, long-running and Bongusta-aggregated phlog "The Dialtone" … I loved having kvothe as part of our community, but of course fully respect his right to move on.

As I deleted his home directory, I thought to myself "Man, I wish there was an equivalent for Gopherspace, so that this great phlog wasn't lost forever". A minute later I thought "Wait… that is totally inconsistent with the entire civil right philosophy!". Ever since, I've been trying to reconcile these conflicting feelings and figure out what I actually believe.

The individual archivist, and ghosts of Gophers past

Some of the commentary on solderpunk's piece has shown, of course, divided opinion. There are those who claim that all statements made in public are, res ipsa loquitor, statements which become the property of the public. This claim is as nonsensical as it is legally ridiculous.

By making a statement in a public place, I do not pass ownership of the content I have "performed" to anyone else, I retain that ownership, it is mine, noone elses. I may have chosen to permit a certain group of people to read it, or hear it; I may have restricted that audience in a number of ways, be it my followers on social media, or the small but highly-regarded phlog audience; I may have structured my comments to that audience, such as using jargon on a mailing list which, when quoted out of context, can appear to mean something quite different; I may just have posted a stupid or ill-judged photo to my friends.

In each of those cases, it is specious to claim that I have given ownership of my posts to the public, forever, without hope of retrieval. It is not the case that I have surrendered my right to privacy, forever, to all 7.7bn inhabitants of this earth.

In much the same way, I reacted strongly when I realised that posts I had made on my phlog were appearing on google thanks to that site's indexing of gopher portals. I did not ever consent to content I made available over port 70 becoming the property of rapacious capitalists.

Ephemera, or the Consciousness of Forgetting

Back during college I wrote a humor column for the university newspaper. In one of my early columns (and not one I have on my site here) I wrote a column with disparaging remarks about a few English teachers from high school. Even worse, I named names!

I never expected said English teachers to ever hear about the column, but of course they did. My old high school was only 10 miles away (as the crow flies) and there were plenty of students at FAU who had attended the same high school I did. Of course I should have expected that. But alas, I was a stupid 18 year old who didn't know better.

Now I know better.

It was a painful lesson to learn, but things spoken (or written) can move in mysterious ways and reach an audience they were not intended for.

The copies of the humor column I have on my site are only a portion of the columns I wrote, the ones I consider “decent or better.” The rest range from “meh” to “God I wish I could burn them into non-existence.” But alas, they exist, and I've even given a link to the paper archives where they can be unceremoniously resurrected and thrown back into my face. Any attempt to “burn them into non-existence” on my part would be at best a misdemeanor and at worst a felony.

In this same vein, Austin McConnell erased his book from the Internet. He managed to take the book out of print and buy up all existing copies from Amazon. There are still copies of his book out there in the hands of customers, and there's nothing he can do about that. The point being, once something is “out there” it's out, and the creator has limited control over what happens.

I'm not trying to victim shame Daniel Goldsmith. What I am trying to say is that Daniel may have an optimistic view of consumption of content.

As to his assertion that his content via gopher is now “the property of rapacious capitalists”: that's plainly false. Ireland (where Daniel resides) and the United States (where Google primarily resides) are both signatories to the “Berne Convention for the Protection of Literary and Artistic Works,” which protects the rights of authors, and Daniel owns the copyright to his works, not Google. Daniel may not have wanted Google to index his gopher site, but Google did nothing wrong in accessing the site, and Google has certainly never claimed ownership of such data (and if it did, then Daniel should be part of a very long line of litigants). Are there things he can do? Yes, he could have a /robots.txt file that Google honors (The Internet Archive also honors it, but at best it's advisory and not at all mandatory; other crawlers might not honor it) or he can block IP addresses. But sadly, it was inevitable once a web-to-gopher proxy was available.

The issue at heart is that everyone is a publisher these days, but not everyone realizes that fact. Many also believe social media sites like MyFaceMeLinkedSpaceBookWeIn will keep “private” things private. The social media sites may even believe their own hype, but accidents and hacks still happen. You can block someone, but that someone has friends who are also your friends. Things spoken or written can move in mysterious ways.

I feel I was fortunate to have experienced the Internet in the early 90s, before it commercialized. Back then, every computer was a peer on the Internet: all IP addresses were public, and anything you put out on the Internet was, for all intents and purposes, public! There's nothing quite like finding yourself logged into your own computer from Russia (thanks to a 10Base2 network on the same floor as a non-computer science department with a Unix machine sans a root password). Because of that, I treat everything I put on the Internet as public (but note that I am not giving up my rights to what I say). If I don't want it known, I don't put it on the Internet.

Daniel goes on to state:

The content creator, after all, is the only person who has the right to make that decision, they are the only one who knows the audience they are willing to share something with, and the only ones who are the arbiter of that.

Ephemera, or the Consciousness of Forgetting

To me, that sounds like what Daniel really wants is DRM, which is a controversial issue on the Internet. Bits have no color, but that still doesn't keep people from trying to colorize the bits, and others mentioning that bits have no color and doing what they will with the bits in question. It's not an easy problem, nor is it just a technical problem.

You put content on the Internet. You are now a publisher with a world wide audience.

Wednesday, October 30, 2019

So this is what it feels like on the other side

I subscribed to a mailing list today. I had to wait until the validation email passed through the greylist daemon on my system, but once that happened, I could start replying to the list.

Only the first post I made didn't go through. There was no error reported. There was no bounce message. Nothing. I checked to make sure I was using the address I signed up with (I did) and the filters on my email program were correct (they were).

I then checked the logs and behold:

Oct 30 19:07:41 brevard postfix/smtp: 023E22EA679B: 
	status=deferred (host XXX­XXXXXXXX­XXXXXX[XX­XXXXXXXX­XXX] said: 
		Recipient address rejected: Greylisted, 
		(in reply to RCPT TO command))

Ha! I'm being greylisted right back! This is the first time I've noticed my outgoing email being greylisted. I find this amusing.

Thursday, October 31, 2019

In theory, it should work the same on the testing server as well as the production server

I haven't mentioned the other server I wrote, GLV-1.12556. I wrote it a few months ago mainly as a means to test out the Lua TLS wrapper I wrote because otherwise, the wrapper is just an intellectual exercise. It implements the Gemini protocol which lies somewhat between gopher and HTTP.

One issue keeps rearing its ugly head: files larger than some size just aren't transferred. It just causes an error that I haven't been able to figure out. The first time this happened several months ago, I hacked at the code and thought I got it working. Alas, it's happening again. I received word today of it failing to send files beyond a certain size, and yes, I can reproduce it.

But here's the kicker—I can reproduce it on my live server but I can't reproduce it locally. It only seems to happen across the Internet. So any testing now has to happen “live” (as it were) on the “production server” (grrrrrr). I fixed some possible issues, maybe, like this bit of code:

  ios._drain = function(self,data)
    local bytes = self.__ctx:write(data)
    if bytes == tls.ERROR then
      -- --------------------------------------------------------------------
      -- I was receiving "Resource temporarily unavailable" and trying again,
      -- but that strategy fails upon failure to read a certificate.  So now
      -- I'm back to returning an error.  Let's hope this works this time.
      -- --------------------------------------------------------------------
      return false,self.__ctx:error()
    elseif bytes == tls.WANT_INPUT or bytes == tls.WANT_OUTPUT then
      self.__resume = true
      return self:_drain(data)
    elseif bytes < #data then
      self.__resume = true
      return self:_drain(data:sub(bytes+1,-1))
    end
    return true
  end

Upon looking over this code, I rethought the logic of dealing with tls.WANT_INPUT (the TLS layer needs the underlying socket descriptor to be readable) or tls.WANT_OUTPUT (the TLS layer needs the underlying socket descriptor to be writable) with the same bit of code, and rewrote it thusly:

  ios._drain = function(self,data)
    local bytes = self.__ctx:write(data)
    if bytes == tls.ERROR then
      -- --------------------------------------------------------------------
      -- I was receiving "Resource temporarily unavailable" and trying again,
      -- but that strategy fails upon failure to read a certificate.  So now
      -- I'm back to returning an error.  Let's hope this works this time.
      -- --------------------------------------------------------------------
      return false,self.__ctx:error()
    elseif bytes == tls.WANT_INPUT then
      self.__resume = true
      return self:_drain(data)
    elseif bytes == tls.WANT_OUTPUT then
      self.__resume = true
      return self:_drain(data)
    elseif bytes < #data then
      self.__resume = true
      return self:_drain(data:sub(bytes+1,-1))
    end
    return true
  end

Now, upon receiving a tls.WANT_OUTPUT it updates the events on the underlying socket descriptor from “read ready” (which is always true) to “read and write ready.” But even that didn't fix the issue.

I then spent the time trying to determine the threshold, creating files of various sizes until I got two that differed by just one byte. Any file that has 11,466 bytes or less will get served up. Any file that is 11,467 bytes or more, the connection is closed with the error “Resource temporarily unavailable.” I have yet to figure out the cause of that. Weird.
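
For what it's worth, “Resource temporarily unavailable” is the text the C library gives for EAGAIN, which is what a non-blocking descriptor reports when it can't take any more data right now. The sketch below shows how that case is commonly handled on a plain, non-TLS descriptor in C: wait until the descriptor is writable, then try again. It is offered only for comparison and is not a diagnosis of the bug above; write_all() and set_nonblocking() are names made up for the example, and the TLS layer adds its own wrinkles that this ignores.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode.  Returns 0 on success. */
int set_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL, 0);

  if (flags < 0)
    return -1;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Write all `len` bytes of `data` to a non-blocking descriptor.
 * Returns 0 once everything is written, -1 on a real error. */
int write_all(int fd, const char *data, size_t len)
{
  while (len > 0)
  {
    ssize_t n = write(fd, data, len);

    if (n > 0)
    {
      /* Partial write: skip what was accepted, loop for the rest. */
      data += n;
      len  -= (size_t)n;
    }
    else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
    {
      /* Not an error, just "not now": wait until writable, retry. */
      struct pollfd pfd;

      pfd.fd      = fd;
      pfd.events  = POLLOUT;
      pfd.revents = 0;
      if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
        return -1;
    }
    else if (n < 0 && errno == EINTR)
    {
      continue;   /* interrupted by a signal, just try again */
    }
    else
    {
      return -1;  /* a real error (or an unexpected zero-length write) */
    }
  }

  return 0;
}
```

The point of the comparison is that EAGAIN on a write is normally a signal to wait and retry rather than a reason to drop the connection.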

Where, indeed

Bunny and I are out, having a late dinner this Hallowe'en when I notice a woman walking in dressed as Carmen Sandiego. I never did find her husband, Waldo. Go figure.


Copyright © 1999-2024 by Sean Conner. All Rights Reserved.