The Boston Diaries

Sunday, January 01, 2023

“I love the smell of black powder at night”

I think I'm resigned to the fact that every January 1st and July 4th I have to suffer living in a war zone. Unlike past years, Bunny and I decided to head outside and at least enjoy the show. The most impressive ones were at the other end of our street: huge ones shot perhaps only a couple hundred feet into the air, if that, with the still-lit embers nearly hitting the nearby roofs, accompanied by a loud thunderclap a second or two later.

Ouch.

Other than the loud explosions all around us, it was otherwise a quiet New Year.

Yeah.

HAPPY NEW YEAR!

Monday, January 02, 2023

Some notes on working with old C code

Recent postings have concerned testing software–therefore, how can we test trek? A goal is good to have, as that may influence what we test and how we go about it. Say we want to modernize the source code,
int main(argc, argv)
int argc;
char **argv;
{
    int ch;
    ...
perhaps on account of it using rather old C that modern C compilers might eventually come to hate. A code refactor may break the game in various ways; whether the code still compiles is a test, but may not catch the logic bugs we may introduce: sorting a list the wrong way, doubling damage numbers, things like that.

Trek

The post is briefly about testing the old game Star Trek that was popular in the 70s, and in this case, it's been ported to C probably sometime in the 80s. The game is interactive so writing any form of tests (much less “unit tests”) will be challenging, to say the least. (The post does go on to say that using expect would probably be in order.)

I have a bit of experience here with older C code. At one point, I got interested in Viola, an early graphical web browser from 1992, and it is the type of C code that gives C a bad name. It was written in a transition period between K&R C and ANSI C, so it had to cater to both, so function prototypes were spotty at best. It also made several horrible assumptions about primitive data types, mainly that pointers, integers and long integers were all interchangeable, and there were plenty of cases where signed quantities were compared to unsigned quantities.

Horrible stuff.

The first thing I did was rewrite the Makefile to simplify it. The original build system was a mess of scripts and over 7,000 lines of make across 48 files; I was able to get it down to just 50 lines of make in one file. Of course, I'm only interested in getting this to compile on POSIX systems with X Windows, so it's easy to simplify the build system.

Second step: crank the C compiler warnings to 11, and fix all the warnings. That's when I started finding all the buried bodies: longs and pointers interchanging, questionable casts, signed and unsigned comparisons and much much more. I got it to the point where it works on a 32-bit system, and it compiles on a 64-bit system but promptly crashes. And by “works” I mean, I can browse gopher with it, but trying to surf the modern web is laughably disastrous.

But I digress. The first step is to crank the compiler warnings to 11 and fix all the warnings, then convert K&R C function declarations to ANSI C.

Does that count as “refactoring?” I personally don't think so—it's just mechanical changes and maybe fixing a few declarations from plain int to unsigned int (or size_t). And once that is done, then you can think about refactoring the code.
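As a small illustration of the mechanical conversion described above, here is a sketch using a hypothetical function, count_char (it is not from trek or Viola). The old-style definition is shown in a comment for comparison. The prototype form lets the compiler check argument types, and moving the counters from int to size_t removes a signed/unsigned comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* K&R style, as it might appear in old code (shown for comparison only):
 *
 *   int count_char(s, c)
 *   char *s;
 *   int   c;
 *   {
 *     int i, n = 0;
 *     for (i = 0; i < strlen(s); i++)   // signed vs. unsigned comparison
 *       if (s[i] == c) n++;
 *     return n;
 *   }
 */

/* ANSI C style: a full prototype, const where the data is not modified,
 * and size_t for anything that counts bytes. count_char is a hypothetical
 * example function. */
size_t count_char(char const *s, int c)
{
  size_t n = 0;
  size_t len;

  assert(s != NULL);
  len = strlen(s);
  for (size_t i = 0; i < len; i++)
    if (s[i] == c)
      n++;
  return n;
}
```

Nothing about the behavior changes here, which is the point: the compiler simply has more to check.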

It still surprises me what some find difficult to do

There have been ongoing discussions in Gemini about a webmention-like mechanism. So I was intrigued by this statement:

The problem here is, that this mechanism includes some script that adds some complexity to the maintenance of the gemini capsule. As bacardi55 writes:

I do know that asking capsule owners to deploy a script will be the biggest concern here, but I guess there is always a "price to pay" … Yes it will require a CGI script, but it should be doable even with a small bash script to not add too much complexity in maintaining a capsule.

I agree that some kind of programming and scripting will be necessary to get notified. However I think that we can do it at least without a CGI-script. Here is the way I think I have found.

Gemlog responses - bacardi55's concept without CGI

And he goes on to implement a scheme that adds complexity to the configuration of the server, plus the issues with scheduling a program to scan the logfiles for Gemini requests. I've done the logfile scanning for “Project: Wolowizard” and “Project: Lumbergh” and it was not an easy thing to set up. Okay, in my case, it was checking the logs in real time to see if messages got logged as part of testing, but that aside, checking the logs for requests might not be straightforward. In this case, it sounds like he has easy access to the log files, but that is not always the case. There have been plenty of systems I've come across where normal users just don't have access to the logs (and I find it annoying, but that's a rant for another time). Then there's scheduling a script to run on a regular schedule. In the past, this would be cron and the bizarre syntax it uses, but I'm not sure what the new hipster Linux systemd way is these days (which itself is a whole bag of worms).

And it's not like the CGI script has to be all difficult. Here's a script that should work (it's somewhat untested; I have the concept tested and running on my Gemini server as an extension to the server, and the CGI script below is based upon that extension):

#!/usr/bin/env lua

local query = os.getenv("QUERY_STRING")
if not query or query == "" then -- no URL given yet, so ask for one
  io.stdout:write("10 URL to send\r\n")
else
  query = query:gsub("%%%x%x", -- decode URL encoded data
            function(c)
              return string.char(tonumber(c:sub(2),16))
            end)
  local mail = io.popen("/usr/sbin/sendmail me","w") -- send email
  if mail then
    mail:write(string.format([[
From: <me> (replace with real email address)
To: <me>
Subject: Mention, sir!

%s
]],query))
    mail:close()
  end
  io.stdout:write("20 text/plain\r\nIt has been accepted.\r\n")
end

os.exit(0)

Yes, this relies upon the email recipient to check the URI has the proper link, but it's simple and to the point. The only issue here is getting the Gemini server to run this script when /.well-known/mention is requested, and I feel that is easier than dealing with scanning logfiles and running cron jobs, but that's me.

As far as the actual proposal itself, I don't have much to comment about it, except to say I don't like the mandated text it uses in the link. I think just finding the link should be good enough. Even better in my mind would be two links, much like webmention uses, but that doesn't seem to be a popular opinion.

Wednesday, January 04, 2023

Thoughts on an implementation of Gemini mentions

The other day I didn't have much to say about the Gemini Mentions proposal. Now that I've implemented it for my Gemini site (the code has been updated extensively since the other day), I have more thoughts.

First, having the location locked to /.well-known/mention works fine for single-user sites, but it doesn't work that well for sites that host multiple users under a single domain. Alice, who has pages under gemini://example.com/alice/, wants to participate with Gemini mentions. So might Dave under gemini://example.com/dave/. Bob, who has pages under gemini://example.com/bob/, doesn't care, nor does Carol, under gemini://example.com/carol/. How to manage gemini://example.com/.well-known/mention where half the users want it, and the other half don't? Having the ability to specify individual endpoints, say with a CGI script, would at least let Alice and Dave participate without having to bug the example.com admin to install a service under a single location.

Second, not everyone may want every page to receive a mention. I know I don't; I want to restrict mentions to the blog portion of my Gemini site. The proposal only states that “a capsule owner MUST implement a basic endpoint: /.well-known/mention,” but it says nothing about limiting what pages can be the target of a mention. I suppose having a link to /.well-known/mention on a page could indicate that page can receive mentions, but the implication is that the endpoint link doesn't have to be mentioned at all. For now, I just filter requests to my blog entries and for other pages I return a “bad request.”

Third, I'm still unsure about sending a single URI. My implementation does scan the given URI for links to my blog, and will grab the first link that matches a blog entry from the URI (and ignores other links to my Gemini site; see the point above). Sending in two links, as in a webmention, provides some form of check on the request.

Fourth, I don't check for the “RE:” in the link text as I don't think it's needed. The specification implies it has to be “RE:” (in all caps), but I can see “Re:” and “re:” being used as well, because humans are going to human and be lazy (or trollish and use “rE:” just to mess with people; or not include it at all).

I also did a second implementation that addresses all these points (and the code for this version is very similar to the other one). I guess I'll see which one becomes more popular.

Friday, January 06, 2023

“The street finds its own uses for things.”

There's a little bit of pushback on the whole Gemini mentions concept. Sandra wrote:

I had Atom and was pretty happy with that and people were like “why don’t you implement Gemini too” and I did and it was a bee and a half because back then almost no Gemini server supported different languages for different pages without serious hoops and then gmisub and then broken redirects and then dir traversal and then this and then that and then the other and after a while it’s all hacking and no writing.

…

I really, really don’t wanna implement this and that means either there’s a non-zero amount of grumpy grognards who don’t wanna do it (in which case you’re gonna have to use the other methods anyway, like Cosmos), so there’s no point in doing it, or I’m gonna get dragged kicking and screaming into doing it which I really hope does not happen.

I think bacardi55 is cool and I haven’t wanted to say anything about the project out of the “if you can’ [sic] say anything nice…” principle but then it seemed as if it were picking up steam and getting implemented.

Gemini mention, an ongoing discussion

I'm not familiar with the “was a bee and a half” idiom, but I suspect it means something like “annoying,” given the context. And if supporting Gemini was “annoying” then why even continue with it? The issues brought up, like the lack of per-page language support, were found by people trying to use Gemini, finding issues, and solving the issues. It would have been easy for most of the issues to be ignored, thanks to Gemini's “simplicity of implementation über alles.” That would not have been a good idea long term, and thus, Gemini gets complex.

And Gemini mentions aren't mandatory, just like not every website supports webmentions. Don't like it? Don't bother with it. Taken to the limit, “I really hope does not happen” applied to Gemini means Gemini doesn't exist (and there are plenty of people who questioned the concept of Gemini).

And as bacardi55 said:

The main reason I "jumped" into this "issue" can be reduced to one sentence: I did it for me :)

Why did I work on gemini mention

If others find it useful, so be it. As William Gibson said: “The street finds its own uses for things.” Besides, given my past experience with the Gemini community, I think there will be only two sites supporting Gemini mentions.

Sunday, January 08, 2023

Today's date happens more frequently on Sunday than any other day of the week

Five years ago, I posted that January 8th is less likely to occur on Monday. At the time, I just accepted it, but when I came across that post a few days ago, I figured I should actually see if that's true. I ran the numbers from 1583 (the first full year under the Gregorian calendar) to now:

Number of times January 8th fell on a day of the week, since 1583
Sunday	65
Friday	64
Tuesday	64
Wednesday	63
Thursday	62
Saturday	62
Monday	61

What are the odds I'd find this result on a Sunday? [High, given your results. —Editor] [Har har. —Sean] I was expecting the results to be nearly equal. I also find it funny that the actual average, 63, happens on Wednesday, the most average day of the week (you see, Wednesday being in the middle of the week and the average is … oh bother!). I wonder what causes this?
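For anyone wanting to rerun the numbers, here is a minimal sketch of one way to do the tally. It is not the program that produced the table above; the function names are made up for illustration, and the weekday calculation uses the well-known table-based method usually credited to Tomohiko Sakamoto. As for the cause, a likely explanation is that the Gregorian calendar repeats exactly every 400 years (146,097 days, which is 20,871 whole weeks), and since 400 years cannot be split evenly among seven weekdays, some days necessarily come up more often than others.

```c
#include <assert.h>

/* Day of week for a Gregorian date: 0 = Sunday ... 6 = Saturday. */
int day_of_week(int y, int m, int d)
{
  static int const offset[12] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };

  assert(m >= 1 && m <= 12);
  if (m < 3)
    y -= 1;  /* January and February count as part of the previous year */
  return (y + y / 4 - y / 100 + y / 400 + offset[m - 1] + d) % 7;
}

/* Tally which weekday a given month/day falls on for each year in
 * [first, last]; counts[] is indexed like day_of_week()'s result. */
void tally_weekdays(int first, int last, int m, int d, int counts[7])
{
  for (int i = 0; i < 7; i++)
    counts[i] = 0;
  for (int y = first; y <= last; y++)
    counts[day_of_week(y, m, d)]++;
}
```

Calling tally_weekdays(1583, 2023, 1, 8, counts) covers the same 441 years as the table.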

Discussions about this entry

Today's date happens more frequently on Sunday than any other day of the week | Lobsters

Monday, January 09, 2023

An epiphany about bloated web pages might be the result of a dumb network

I was scared by an epiphany I had the other day when I read a quote by John Carmack. But before I get to the quote and the epiphany, I need to give some background to understand where I was, and where I am.

First, for the years I was working for The Corporation (and later, The Enterprise), I was in essence, working in telephony networking, and I was never a fan of telephony networking (the Protocol Stack From Hell notwithstanding).

Basically, the paradigm in telephony is a “smart network” and a “dumb edge.” All the “intelligence” of an application on telephony is on the network side of things; the “edge” here is the device the end user interacts with. In the old days, this was an on-off switch, a microphone and a speaker. Later models of this device included a tone generator. So any features needed to be handled on the network side because the end user device (the “edge”) was incapable of doing much at all. If a person wants a new feature, they have to get it implemented on the entire network, or it's effectively not supported at all (because there's not much one can do with an on-off switch, speaker, microphone and a tone generator).

Contrast this with the Internet—it's a “dumb network” with a “smart edge”—all the network has to do is sling packets back and forth, not concerning itself with the contents. The “edge” in this case is (was?) a general purpose computer that can be programmed to do just about anything. So if a person wants a new feature, all that's needed is a program on at least two endpoints and said feature exists—there's no need to inform the rest of the network of it, as long as the “dumb network” can do its job and sling the data between the two endpoints. Want an alternative to the web? Just do it. Want an alternative to IRC? Just do it.

Second, I have always had a hard time understanding why people keep insisting on writing bespoke web browsers in JavaScript that just show text, when the user is already using a web browser that has already been written to display text. The poster child for this (in my opinion) is the Portland Pattern Repository, a large repository of programming wisdom, that, for whatever reason, Ward Cunningham (creator of the site) felt that a normal web browser wasn't good enough to browse a text-only website and thus demands the latest and greatest in JavaScript conformance to view text. He's free to do so, but I find it annoying that I can no longer read a site I enjoyed (and even contributed to), just because I haven't updated my browser for the past twenty minutes. I'm not even asking to participate in editing the site any more, I just want to read it!

And finally we get to the John Carmack quote:

It is amusing to consider how much of the world you could serve something like Twitter to from a single beefy server if it really was just shuffling tweet sized buffers to network offload cards. Smart clients instead of web pages could make a very large difference.

John Carmack Tweet

Oh crap.

“Smart clients”—“smart edge.”

“Web pages”—“data.”

My dislike of the Portland Pattern Repository just got run over by my liking of dumb networks and smart edges.

Ward Cunningham wants a smarter edge to view his site (and to “improve server performance” if you read the comments in the web page returned from the site) and I can't begrudge him that—I like smart edges! It makes more sense to me than a smart network. But at the same time, I want a web site to just return text to a “dumb browser,” even if the browser I'm using is not particularly dumb.

Do we, in fact, have too much intelligence in web servers? Do we want to push all the intelligence to the client? Do I have to reconcile my love of simple web clients and intelligent web servers with my love of the dumb network and smart edges? (And to spell it out—the “network” in this analogy is the web server and the “edge” is the web browser) Where does the simplicity need to reside?

Discussions about this entry

THE EDGE OF INSANITY

Wednesday, January 11, 2023

It's apparently a valid URL, despite it being malformed in my opinion

I've had a few posts make it to the front page of Lobsters. Lobsters supports webmention, yet I never received a webmention for those two posts. I checked the logs and yes, they were received but I rejected them with a “bad request.” It took a bit of sleuthing, but I found the root cause: the URL of my post was, according to my code, invalid. Lobsters was sending in a URL of the form https://boston.conman.org//2023/01/02.1 (notice the two slashes in front of the path). My code was having none of that.

I'm not sure why Lobsters was sending a URL of that form as previous webmentions worked fine, but when I checked previous submissions to Lobsters I saw some of the links had a double slash in the path portion. As it's considered valid by the What Working Group? “living standard,” I ended up having to accept what I consider a malformed URL.
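One way to tolerate such URLs is to normalize the path before validating it. The following is only a sketch of that idea, not a description of what the blog's code actually does; collapse_slashes is a hypothetical helper, and it assumes it is handed just the path portion of the URL.

```c
#include <assert.h>
#include <stddef.h>

/* Collapse each run of '/' in a URL path to a single '/', in place.
 * Meant for the path component only (not the "https://" prefix). */
void collapse_slashes(char *path)
{
  char *dst;

  assert(path != NULL);
  dst = path;
  for (char const *src = path; *src != '\0'; src++)
  {
    if (*src == '/' && dst > path && dst[-1] == '/')
      continue;  /* skip a slash that directly follows one already kept */
    *dst++ = *src;
  }
  *dst = '\0';
}
```

Given "//2023/01/02.1", this leaves "/2023/01/02.1" in the buffer, which an existing path check could then accept.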

Sigh.

Thursday, January 12, 2023

It's probably a good thing some malformed URLs are considered “valid”

It seems it's all too easy to generate double slashes in the path component of a URL, because I received via email a report that my current feed files all had that issue.

Sigh.

I made a change a few months ago in how I internally store the base URL of my blog. It used to be that I did not store the trailing slash (so that "https://boston.conman.org/" would be stored as "https://boston.conman.org") so I had code to keep adding it back in when generating links. I changed the code to store the trailing slash, but missed one section of code because I don't subscribe to any of my feed files and didn't notice the issue.
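This class of bug can be avoided by joining the two pieces in one place that does not care whether the stored base URL has a trailing slash. Here is a minimal sketch under that assumption; url_join is a hypothetical helper, not a function from mod_blog.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Join a base URL and a path so exactly one '/' separates them, whether or
 * not base ends with '/' or path begins with one. Returns what snprintf()
 * returns, so a result >= size means the output was truncated. */
int url_join(char *out, size_t size, char const *base, char const *path)
{
  size_t blen;

  assert(out != NULL && base != NULL && path != NULL);
  blen = strlen(base);
  while (blen > 0 && base[blen - 1] == '/')
    blen--;             /* ignore trailing slashes on the base */
  while (*path == '/')
    path++;             /* ignore leading slashes on the path */
  return snprintf(out, size, "%.*s/%s", (int)blen, base, path);
}
```

With this, both "https://boston.conman.org" and "https://boston.conman.org/" produce the same links.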

I also fixed an actual crashing bug. All I have to say about that is that web robots are quite good at generating really garbage requests using a variety of methods—it's like free fuzz testing! Woo hoo! Sob!

Monday, January 16, 2023

The other SFTP that never was

For reasons, I was doing some research into the history of FTP when I came across an RFC for SFTP. Only this isn't the SFTP that is used today, but instead the Simple File Transfer Protocol from 1984. Unlike TFTP, it uses TCP, and unlike FTP, it only uses a single network connection.

But this bit is why I'm writing about this:

Random Access

Pro: Wouldn't it be nice if (WIBNIF) SFTP had a way of accessing parts of a file?

Con: Forget it, this is supposed to be SIMPLE file transfer. If you need random access use real FTP (oops, real FTP doesn't have random access either – invent another protocol?).

Resolution: I have not made any provision for Random Access.

That “other protocol” would take several more years to be invented, and then take over the networking world.

Thursday, January 19, 2023

The good news? Somebody wants to use my blogging engine. The bad news? Somebody wants to use my blogging engine

Over the 23 year history of mod_blog, I've given up on the notion of anyone other than me using it. There was only one other person who used it for just a few months before deciding blogging wasn't for him and that was way back in 2002. So it was completely by surprise that I recently received a bug report on it.

Oh my … someone else is trying to use it.

I never did fully document it. And there are, as I'm finding, an amazing number of things I'm assuming about the environment, such as:

That it's running under Apache. I do make use of the environment variable $DOCUMENT_ROOT, which technically is Apache specific (per the CGI RFC “The Common Gateway Interface Version 1.1” as it's not documented there) and, as I found out over the years, the variables $REDIRECT_REMOTE_USER and $REDIRECT_BLOG_CONFIG. Other web servers might not define those, or might work differently. I don't know; I have only ever used mod_blog with Apache.
How to configure Apache to run mod_blog. I wanted to hide the fact that I'm running a CGI program to drive my blog, not for “security-through-obscurity” reasons, but for “easy to understand and modify the URL” reasons. I think URLs like https://boston.conman.org/2023/01/19.1 look much nicer than https://boston.conman.org/boston.cgi?2023/01/19.1 (and never mind the hideousness of https://boston.conman.org/cgi-bin/boston.cgi?year=2023&month=1&day=19&entry=1 that it could have been). The other benefit is that if I ever do get around to making mod_blog an actual Apache module (which was my original intent) links won't break.

As such, I use Apache's RewriteRule to map all requests through mod_blog. The code base also assumes this as it relies upon the environment variable $PATH_INFO always being set, which isn't a given, depending upon how a CGI program is referenced via the web.
The environment variable $BLOG_CONFIG is set to the configuration file. The configuration file can be either specified via the command line or stored in the environment variable. I added the environment to avoid having to embed the location in the executable or to expose the location in the query portion of a URL. And again, this comes back to the previous point—how to configure this under Apache (SetEnv is the answer). I also have it set in my own environment (command line) as it makes it easy to test. It also makes it easy to fix spelling mistakes on the server as I can directly edit the files, which leads into the next point.
All the files used by mod_blog are readable and writable by the program. My blog is, as far as I can tell, unique in that I can send in posts via email, in addition to a web page. Email support, for me, was non-negotiable. I get to use my preferred editor for writing, and by posting it via email, everything is handled automatically. I'm not aware of any other blogging system set up this way, and this is only viable because I run my own email server on the same box as my webserver.

The issue becomes one of permissions. The web server runs as its own user. Email is delivered as the user of the recipient. Both can add new posts. I solved that issue by making mod_blog always run under my userid (it's “setuid” for the technically proficient). This means I don't have to make a bunch of files world writable on my server. I can make edits on the files directly as me. I can add entries via the web, email, or as a file from the command line (which mod_blog also supports).

And that's just off the top of my head. There are probably more assumptions made that I'm just not thinking of. It's issues like these where one can spend 90% of the time writing 90% of the code, and then spend another 90% of the time writing the final 10% of the code and documentation.

I'm also amused by the timing. Back in August, I removed a ton of optional code that I never used, and because no one else was using mod_blog, it was just sitting there untested. And now someone wants to use the code.

Heh.

But also, gulp! I've got 23 years of experience with the code, so I know all the ins and outs of using it. Documenting this? So someone else can use this? Good lord!

Monday, January 23, 2023

A few small differences

I received the following patch for my DNS library:

I am hoping to use this library to encode and decode mDNS queries and responses. It seems that the mDNS is mostly the same as unicast DNS, except for a few small differences which I aim to add to this PR as I encounter them.

Mdns mods by oviano · Pull Request #13 · spc476/SPCDNS

Those “few small differences” turn out not to be so small.

The main RFCs for mDNS appear to be RFC-6762 and RFC-6763, and to support them in full requires breaking changes to my library. The first defines a bunch of flags that affect pretty much the entire codebase. The first flag deals with “Questions Requesting Unicast Responses.” Most flags are defined in the header section, but for this, it's “the top bit in the class field of a DNS question as the unicast-response bit.” And because mDNS specifically allows multiple questions, it seems like it could be set per-question, and not per the request as a whole, as the RFC states: “[w]hen this bit is set in a question, it indicates that the querier is willing to accept unicast replies in response to this specific query, as well as the usual multicast responses.” To me, that says, “each resource record needs a flag for a unicast response.” The other bit is the “outdated cache entry” bit, which again applies to individual resource records and not to the request as a whole. And again, to me, that says, “each resource record needs a flag to invalidate previously cached values.”

How to handle this … well, one way would be to add a Boolean field to each resource record type to hide protocol details (which was the point of this library, frankly). But that can break existing code as the new fields will need initialization:

dns_question_t domain;

domain.name  = host;
domain.type  = RR_A;
domain.class = CLASS_IN;
domain.uc    = true; /* we want unicast reply */

/* and the other flag */

dns_a_t addr;

addr.name    = host;
addr.type    = RR_A;
addr.class   = CLASS_IN;
addr.ttl     = 0;
addr.ic      = true; /* invalidate cache data */
addr.address = address;

and document that the uc and ic fields are for mDNS use; if you aren't using mDNS, then they should be set to false.

Another approach is to leak protocol details and require the user to do something like:

/* We're making a query and want a unicast reply */
dns_question_t domain;

domain.name  = host;
domain.type  = RR_A;
domain.class = CLASS_IN | UNICAST_REPLY;

/* We're replying to a query and want to invalidate this record */
dns_a_t addr;

addr.name    = host;
addr.type    = RR_A;
addr.class   = CLASS_IN | INVALIDATE_CACHE;
addr.ttl     = 0;
addr.address = address;

And that's a less-breaking change, but on the decoding side, I still need some form of flag in the structure to indicate these flags were set because otherwise data is lost.
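To make the decoding concern concrete, here is a minimal sketch of splitting the 16-bit class field from the wire into the class proper and the mDNS top bit, so the flag is not lost. None of these names exist in SPCDNS; mdns_class_t, mdns_class_decode, mdns_class_encode and MDNS_TOP_BIT are hypothetical and only illustrate the bit handling that RFC-6762 describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MDNS_TOP_BIT 0x8000u  /* unicast-response bit (questions) or
                                 cache-flush bit (resource records) */

/* Hypothetical decoded form: the class with the mDNS bit split out. */
typedef struct
{
  uint16_t class_code;  /* for example, 1 for IN */
  bool     flag;        /* top bit of the class field on the wire */
} mdns_class_t;

mdns_class_t mdns_class_decode(uint16_t wire)
{
  mdns_class_t c;

  c.class_code = (uint16_t)(wire & 0x7FFFu);
  c.flag       = (wire & MDNS_TOP_BIT) != 0;
  return c;
}

uint16_t mdns_class_encode(mdns_class_t c)
{
  assert((c.class_code & MDNS_TOP_BIT) == 0);
  return (uint16_t)(c.class_code | (c.flag ? MDNS_TOP_BIT : 0u));
}
```

Either approach described above ends up needing something like this internally; the difference is only whether the user sees the flag as a separate field or as part of the class value.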

I'm not sure which approach is best. The first does a better job of hiding the DNS protocol details, but breaks more code. The second is less breaking, as I could ignore any cache flags on encoding, but it leaks details of DNS encoding to user code. I tend to favor the first but I really dislike the breaking aspect of it. And that's just the first RFC.

The other RFC utilizes what I consider to be an implementation detail of the DNS protocol to radically alter how I handle text resource records. The RFC that defined modern DNS, RFC-1035, describes the format for a text resource record, but is silent as to semantics.

Individual resource records come with a 16-bit length, so in theory, a resource record could be up to 65535 bytes in size, but it's rare to get a record that size. The base type of a text resource record is a “string,” and RFC-1035 defines a “string” as one byte for the length, followed by that many bytes as the contents. Because the length is a single byte, a “string” is limited to 255 bytes in size. This means, in practice, that a text resource record can contain several “strings.”

How SPCDNS handles this now is that I assume a text resource record only has one value—a string:

typedef struct dns_txt_t        /* RFC-1035 */
{
  char const  *name;
  dns_type_t   type;
  dns_class_t  class;
  TTL          ttl;
  size_t       len;
  char const  *text;
} dns_txt_t;

When encoding such a record, I break the given string into as few DNS “strings” as possible. Give this a 300 byte string, and you get two DNS “strings” encoded, one being 255 bytes long, and the other one 45 bytes long. Upon decoding, all the strings in a single text resource record are concatenated into a single string. As I said, RFC-1035 doesn't go into the semantics of a text resource record, and I did what I felt was best.
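The splitting described above can be sketched as follows. This is an illustration of the technique, not the SPCDNS code itself; txt_encode is a hypothetical name, and the choice to emit a single zero-length “string” for empty text is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode len bytes of text as a sequence of DNS "strings" (one length
 * byte, then up to 255 bytes of data), using as few as possible.
 * Returns the number of bytes written to out, or 0 if out is too small. */
size_t txt_encode(unsigned char *out, size_t outsize, char const *text, size_t len)
{
  size_t used = 0;

  assert(out != NULL && text != NULL);
  do
  {
    size_t chunk = len > 255 ? 255 : len;

    if (outsize - used < chunk + 1)
      return 0;                      /* no room for length byte plus data */
    out[used++] = (unsigned char)chunk;
    memcpy(&out[used], text, chunk);
    used += chunk;
    text += chunk;
    len  -= chunk;
  } while (len > 0);
  return used;
}
```

A 300 byte input comes out as 302 bytes: a length byte of 255, 255 bytes of data, a length byte of 45, and the remaining 45 bytes.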

RFC-6763 uses the DNS “string” encoding for semantic information:

Apple TV - Office._airplay._tcp.local.	   10	IN	TXT	(
	"acl=0"
	"btaddr=00:00:00:00:00:00"
	"deviceid=A8:51:AB:10:21:AE"
	"fex=1d9/St5/FbwooQ"
	"features=0x4A7FDFD5,0xBC157FDE"
	"flags=0x18644"
	"gid=F014C3FF-1420-4374-81DE-237CD6892579"
	"igl=1"
	"gcgl=1"
	"model=AppleTV14,1"
	"protovers=1.1"
	"pi=c6fe9e6e-cec2-44c8-9c66-8994c6ad47"
	"depsi=4A342DB4-3A0C-47A6-9143-9F6BF83F0EDD"
	"pk=5ab1ac3988a6a358db0a6e71a18d31b8d525ec30ce81a4b7b20f2630449f6591"
	"srcvers=670.6.2"
	"osvers=16.2"
	"vv=2"
	)

I have to admit, this is ingenious: each DNS “string” here defines a name/value pair. But I did not foresee this use at all.
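To show what keeping the “strings” separate would involve, here is a sketch that walks the raw data of a text resource record one “string” at a time and splits each at the first '='. The names txt_pair_t and txt_next_pair are hypothetical and are not part of SPCDNS; the sketch assumes the caller already has the record's data and its length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One name/value pair; the pointers refer into the record data and are
 * not NUL terminated. value is NULL when the "string" has no '='. */
typedef struct
{
  char const *name;
  size_t      namelen;
  char const *value;
  size_t      valuelen;
} txt_pair_t;

/* Decode the DNS "string" at the start of rdata into pair. Returns the
 * number of bytes consumed, or 0 at the end of the data or when the
 * length byte runs past the end. */
size_t txt_next_pair(unsigned char const *rdata, size_t len, txt_pair_t *pair)
{
  size_t      slen;
  char const *s;
  char const *eq;

  assert(rdata != NULL && pair != NULL);
  if (len == 0)
    return 0;
  slen = rdata[0];
  if (slen > len - 1)
    return 0;
  s  = (char const *)&rdata[1];
  eq = memchr(s, '=', slen);
  pair->name = s;
  if (eq != NULL)
  {
    pair->namelen  = (size_t)(eq - s);
    pair->value    = eq + 1;
    pair->valuelen = slen - pair->namelen - 1;
  }
  else
  {
    pair->namelen  = slen;
    pair->value    = NULL;
    pair->valuelen = 0;
  }
  return slen + 1;
}
```

A caller would loop, advancing the data pointer by the return value until it gets 0. Concatenating the strings on decode, as the library does today, throws away exactly the boundaries this loop depends on.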

I wonder how much code out there dealing with DNS packets (not specifically mDNS) would treat these records:

	IN	TXT	"v=spf1 +mx +ip4:71.19.142.20/32 -all" 
	IN	TXT	"google-site-verification=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

the same way as:

	IN	TXT	(
		"v=spf1 +mx +ip4:71.19.142.20/32 -all" 
		"google-site-verification=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
	)

The first returns two text resource records, each consisting of a single DNS “string”; the second returns one text resource record but with two DNS “strings.” My gut feeling is “not many would deal with the second format” but I can't know that for sure.

And changing how I deal with text resource records in SPCDNS would be a major breaking change.

This is one change I really don't know how to approach.

Tuesday, January 24, 2023

Notes on an overheard conversation about “Muskrat Love” as it played on satellite radio

“If you would have asked me who sang that, I wouldn't have been able to answer.”

“Wow! Captain and Tennille! That takes me back.”

“Me too.”

“Do you want to know what else the Captain and Tennille reminds me of?”

“What?”

“The Bionic Watermelon.”

“What?”

“The Bionic Watermelon.”

“You are weird, sir.”

Notes on a seriously first world problem

“채널 하나 둘 셋 …”

“Why is the TV speaking Korean?”

“채널 하나 둘 넷 …”

“I don't know. It just started happening!”

“채널 하나 둘 다섯 … ”

“Let me see … wait! The configuration menu is also in Korean!”

“당연하지 …”

“I guess we're just going to have to learn Korean.”

“무아하하하하 …”

Saturday, February 04, 2023

Notes about an overheard conversation while driving home

“Why the convoluted way home?”

“Are you driving?”

“No. I'm just curious.”

“Because you told me to go around.”

“I told you to get into the other lane to go around the bus!”

“No, you just told me to go around.”

“The bus.”

“Around. Besides, this way, I don't have to take a left turn.”

“Pththththththth.”

“Argument of last resort, I see.”

Monday, February 13, 2023

The Nile is nice this time of year

On Friday, February 3rd, I broke my glasses. I was out and someone complimented me on my shades. I pointed out that they were just clip-on shades, but I went further to show that my glasses were flexible. That's when I snapped off the left arm of my glasses at the hinge. In retrospect, I should not have done that.

But they were nineteen years old. And it was clear to Bunny that I needed new glasses anyway. As she keeps pointing out, my glasses would slowly creep down my face, but that was only to keep things in focus. It had nothing to do with my eyesight changing.

Nope.

But now I had no excuse. The next day I picked out new frames (Flexon, same manufacturer as my old ones). One of the store employees tried to fix my existing pair of glasses with tape, and all I can say about that is that it was an attempt. The employee also managed to knock off the nose pad on the left side of my glasses (sigh) so now the glasses were even less stable on my face than before. I did manage to get an appointment for an eye exam on Monday the 6th.

Monday, and I go for the exam. Things were going well until the end, when the doctor pointed out that it was time for me to get progressive lenses. Or, you know, bifocals.

No! I am not that old! I don't need bifocals! I'm still only … um … oh … mumblety-mum years old.

Man, the Nile is a nice place, isn't it?

I could expect the new glasses to be ready in seven to ten days.

Eight days of my old glasses falling off my face (and constantly adjusting them when they don't), and my new glasses are ready. With “progressive” lenses. I have up to 30 days to decide if I like them, and if I don't, I can get … sigh … bifocals.

[Picture of me with my new glasses] Pay no attention to all the white hair—we're here for the glasses

The progressive lenses are weird. Parts of my peripheral vision are blurry. If I move my head back and forth, surfaces along the bottom of my glasses undulate in an unnerving manner. Sometimes when I tilt my head, it feels like (to borrow a movie term) a zoom-in but with improper focusing. It's trippy, but without the side effects of a bad drug trip.

We'll see if I can get used to them.

Oh, and one more amusing fact about my new glasses—the lenses are so thick at the edges that the arms don't fold down all the way.

Thursday, February 16, 2023

I guess now Bunny can add “upholsterer” to her list of hobbies

A few weeks ago, the top arm coverings of my office chair basically crumbled and fell off.

[Picture of a chair with arms, but the arms are missing a cover.] Black always makes things slimmer. Only in this case, it is slimmer without the rubber covering.

The old coverings were some combination of rubber and plastic and I guess over time, they just became brittle or dried out, and fell apart. This exposed the underlying hard plastic frame underneath. It wouldn't be that bad actually, except for all the square holes, used to both lessen the amount of hard plastic required, and to give the old covers something to grip onto.

[Closeup of an uncovered arm on the office chair showing a surface with cutouts to save material, but makes it uncomfortable to use as an arm rest.] It's a combination arm rest and cheese grater!

Resting my arms on the bare arm rests is uncomfortable—it's not exactly cutting into my skin, but I can feel the square holes which is unpleasant, and left a square pattern on my arms. My idea was to take some foam and wrap some cloth type material around it and the plastic frame. But it was Bunny who made the new covers from material lying about Chez Boca.

[Picture of a new arm cover for an office chair.] Grab a strobe black light, turn up the techno, and we can have a rave!

It's basically a tube of cloth wrapping the foam, with some extra material folded back onto itself to form some flaps to go around the ends of the arm rests. Here I am demonstrating how it works with my fingers.

[Picture demonstrating the flap on the new arm covers to hold it onto the chair.] Makes for a lousy puppet. Maybe with some googly eyes?

The material has some stretch ability, which helps to keep it on the arm rests.

[Picture of a chair with arms, now with the new arm covers.] Is it not nifty?

It adds a nice bit of color to the chair, and it's a lot more comfortable than the old covering. Nice job indeed!

Thursday, February 23, 2023

A breakdown of the triple-star pointer

A few days ago I read “Lessons learnt while trying to modernize some C code” (via Lobsters) and one of the problems of C stated stood out to me: “Avoid constructs like char ***. I thought it was a joke, but people do pass around char *** and it’s insane—almost impossible to comprehend why do you need a pointer to a pointer to a pointer.” Yes, it happens, but come on! That doesn't happen often enough to complain about!

And then I found one in my own code!

Sigh.

Okay, at least I can explain why I needed a char ***. It's not insane, and it's not impossible to comprehend why. I'll start with char *. In C, that means “string” (the exceptions are just that—exceptions). We can replace char * with typedef char *cstring which gets rid of one “*”, leaving effectively cstring **.

Now, when you see char **, say in int main(int argc,char **argv), it generally has the meaning of an array of strings: int main(int argc,char *argv[]). Sometimes it could mean just a pointer to a pointer, but I'm using the “array of strings” meaning in my code. Translated using the custom type I defined above, char ** becomes cstring [] and char *** becomes cstring *[]—a pointer to an array of strings. And this idiom, when it happens, usually means the callee is going to allocate the memory for the array of strings and return it via the pointer. Which is exactly what the function I wrote does.

So when I expect a char *** here, what I'm asking for is a pointer to an array of strings (aka character pointers or character arrays). The only thing insane about this is the syntax, and maybe the semantics (pointers and arrays are near enough the same that it's dangerous) but I've been working with C long enough that I just kind of accept it.
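As a rough illustration of that idiom (this is not the function from my code; split() and split_free() are names made up for this sketch), here is a callee that allocates an array of strings and hands it back through a char *** parameter:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: split `text` on `delim` and return a freshly
 * allocated array of strings through `plist`.  Returns the number of
 * strings, or 0 on allocation failure (in which case *plist is NULL). */
size_t split(char ***plist,char const *text,char delim)
{
  assert(plist != NULL);
  assert(text  != NULL);

  /* one more string than there are delimiters */
  size_t count = 1;
  for (char const *p = text ; *p != '\0' ; p++)
    if (*p == delim)
      count++;

  char **list = calloc(count,sizeof(char *));
  if (list == NULL)
  {
    *plist = NULL;
    return 0;
  }

  char const *start = text;
  for (size_t i = 0 ; i < count ; i++)
  {
    char const *end = strchr(start,delim);
    size_t      len = (end != NULL) ? (size_t)(end - start) : strlen(start);

    list[i] = malloc(len + 1);
    if (list[i] == NULL)
    {
      /* undo the partial allocation */
      while (i-- > 0)
        free(list[i]);
      free(list);
      *plist = NULL;
      return 0;
    }

    memcpy(list[i],start,len);
    list[i][len] = '\0';
    start = (end != NULL) ? end + 1 : start + len;
  }

  *plist = list; /* this store is the whole reason for the third star */
  return count;
}

/* Release what split() allocated. */
void split_free(char **list,size_t count)
{
  for (size_t i = 0 ; i < count ; i++)
    free(list[i]);
  free(list);
}
```

The caller declares a plain char **list, passes &list, and afterwards uses list[i] like any other array of strings. The extra level of indirection only exists so the callee can set the caller's pointer.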

Now, just don't ask about char ****—that's just silly talk!

Friday, February 24, 2023

A branchless segment of code to generate a printable hexadecimal value

I was an avid fan of assembly language back in my youth and I did a lot of it. And in that time, if I needed to convert a 4-bit quantity to a hexadecimal character, I would write the obvious code:

	; x86 code
		add	al,'0'
		cmp	al,'9'	; if '0'-'9', no adjustment needed
		jbe	skip	; otherwise, we need to adjust
		add	al,7	; the resulting character by 7
				; to get 'A'-'F'
skip:

Eight bytes and a branch instruction, and not many ways I could see to improve on that, until the other day when I came across this bit of code:

	; x86 code
		add	al,90h
		daa
		adc	al,40h
		daa

Not only does this convert a 4-bit value to a hexadecimal character, but it's two bytes shorter and it's branchless!

Now, some might say this abuses the DAA instruction, but it works. And how it works is pretty clever I think. The DAA instruction exists to allow BCD arithmetic (back when it was a thing). For each 4 bits in a byte, the DAA instruction will check to see if it's in the range of 10 to 15 and if so, add 6 to that 4-bit value to bring it back into the 0 to 9 range, and propagate a carry bit (well, it's a bit more involved than that, but that will suffice for this post—you can check my MC6809 emulator for the gory details of the DAA instruction). By adding 0x90 (or 144 in decimal) to a 4-bit value then using DAA, a carry bit will be propagated if the initial value was 10 to 15; otherwise there's no carry to propagate. The ADC of 0x40 (or 64 decimal) will then add any carry of the previous two instructions into the lower four bits of the result, and the DAA will then adjust the upper 4-bits to be either 0x3 or 0x4 due to the previous addition of 0x90 (which causes the number to act like a negative number if the initial value was between 0 and 9). And because of the carry if the initial 4-bit value was between 10 and 15, you get the required adjustment of 7 needed for values of 10 through 15.

This means the result is 0x30 to 0x39 (the ASCII values of “0” to “9”) for the 4-bit values of 0 through 9, or 0x41 to 0x46 (the ASCII values of “A” to “F”) for values 10 through 15.
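To convince myself of the above, here's a small C simulation of the four instructions. It is only a sketch: the cpu__s type and the function names are made up for it, and the flag handling follows the x86 description of ADD, ADC and DAA as I understand it (carry flag, auxiliary carry flag, and the two-step adjustment). The branching version is included for comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of just the state the sequence touches. */
typedef struct
{
  uint8_t al; /* accumulator           */
  bool    cf; /* carry flag            */
  bool    af; /* auxiliary (half) carry */
} cpu__s;

/* ADD (carry_in false) or ADC (carry_in is the old carry flag) */
static void add_al(cpu__s *cpu,uint8_t imm,bool carry_in)
{
  unsigned c   = carry_in ? 1u : 0u;
  unsigned sum = cpu->al + imm + c;

  cpu->af = ((cpu->al & 0x0Fu) + (imm & 0x0Fu) + c) > 0x0Fu;
  cpu->cf = sum > 0xFFu;
  cpu->al = (uint8_t)sum;
}

/* DAA: adjust the low nibble, then the high nibble */
static void daa(cpu__s *cpu)
{
  uint8_t old_al = cpu->al;
  bool    old_cf = cpu->cf;

  cpu->cf = false;
  if (((cpu->al & 0x0Fu) > 9u) || cpu->af)
  {
    unsigned sum = cpu->al + 6u;
    cpu->cf = old_cf || (sum > 0xFFu);
    cpu->al = (uint8_t)sum;
    cpu->af = true;
  }
  else
    cpu->af = false;

  if ((old_al > 0x99u) || old_cf)
  {
    cpu->al = (uint8_t)(cpu->al + 0x60u);
    cpu->cf = true;
  }
  else
    cpu->cf = false;
}

/* add al,90h / daa / adc al,40h / daa */
char hex_digit_daa(unsigned nibble)
{
  assert(nibble < 16);
  cpu__s cpu = { .al = (uint8_t)nibble , .cf = false , .af = false };

  add_al(&cpu,0x90,false);
  daa(&cpu);
  add_al(&cpu,0x40,cpu.cf);
  daa(&cpu);
  return (char)cpu.al;
}

/* The "obvious" version with the compare and branch */
char hex_digit_branch(unsigned nibble)
{
  assert(nibble < 16);
  char c = (char)(nibble + '0');
  if (c > '9')
    c = (char)(c + 7);
  return c;
}
```

Running both functions over 0 through 15 should give the same sixteen characters, “0” to “9” followed by “A” to “F”.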

Quite ingenious really.

I found reference to what may be the origin of this sequence: the article “A Design Philosophy for Microcomputer Architectures” from the February 1977 edition of Computer (the code appears on the third page of the article), but it's unclear if the author came up with this on his own, or it was a known sequence at the time.

I just wish I had found out about it earlier.

Wednesday, March 01, 2023

I'm seriously wondering who is trolling who at this point

I have a Gmail account. I signed up early enough to get my name as an email address at Gmail. But I never use it for anything, so by default, anything that arrives there is either spam or misaddressed. I will occasionally check it, and I found two emails from one Trudy XXXXXXXXXXX. The first one:

From
Trudy XXXXXXXXXXX <XXXXXXXXXXXXXXXXXXXXXX> (an address from a Tennessee school)

To
Sean Conner <sean.conner@gmail.com>, (and other addresses to the same Tennessee school, I checked)

Subject
IMPORTANT DEVICE INFORMATION (Waiver)

Date
Mon, 20 Feb 2023 15:51:00 -0500

Parents/Guardians:

Please check your child's Power School account to see if your child has a hold due to a damaged, stolen or lost device. If your child has a damaged, lost, stolen device, you may be eligible to fill out a waiver to allow your child to get a device for no charge.

-The waiver is limited to devices only. Keyboards, cases and chargers are not covered.

If you have already filled out a waiver, or your child's device has been returned to school in good working condition, please disregard this message.

Thursday, February 23, 2023, is our Black History Program. You will be allowed to fill out the waiver after each program (morning or afternoon show).

Part of the waiver agreement requires you to have custody of your child as shown in Power School, provide a valid ID, and be prepared to watch a short 2.5-minute video at the time of completing the waiver.

Morning show is at: 9:15 am
Afternoon show is at: 2:00pm

Thank you 🙂

[School Logo]

Trudy XXXXXXXXXXXXX
School Counselor
XXXXXXXXXX Elementary
XXXXXXXXXXXXXXXXXXXXXX
Memphis, TN 38116
901-XXXXXXXX (School)
901-XXXXXXX (Fax)
XXXXXXXXXXXXXXXXXXXXXX
"Every child deserves to be a champion: an adult who will never give up on them, who understands the power of connection and insists they become the best they can possibly be"-Rita Pierson

Together, we MUST BELIEVE.

Together, we WILL ACHIEVE.

Together, we ARE REIMAGINING 901!

Does the Sean Conner who lives in Tennessee not know his own Gmail address? I always wonder about that. But regardless, I decided to reply with a bit of surrealism.

From
Sean Conner <sean.conner@gmail.com>

To
Trudy XXXXXXXXXXX <XXXXXXXXXXXXXXXXXXXXXX>

Subject
IMPORTANT DEVICE INFORMATION (Waiver)

Date
Wed, 1 Mar 2023 14:48:00 -0500

One question I have is—what do I do if I don't have a child? The device is fine, but I don't have a kid that uses it. Do I still need a waiver?

The second question I have is—why is my non-existent child enrolled in a school in Tennessee when I live in Florida?

Thank you.

I expected no reply, or maybe a reply like “Sorry to have bothered you.” What I did not expect was this reply:

From
Trudy XXXXXXXXXXX <XXXXXXXXXXXXXXXXXXXXXX>

To
Sean Conner <sean.conner@gmail.com>

Subject
RE: EXTERNAL - Re: IMPORTANT DEVICE INFORMATION (Waiver)

Date
Wed, 1 Mar 2023 14:55:00 -0500

Hello,

If you have device that belongs to XXXXXX County Schools, we will need to arrange a way to get the device back or you will need to pay for the device.

You cannot get a waiver if your child no longer attends XXXX
Do you know if you are listed as a contact for a relative of step child who attends XXXXXXXXXX Elementary? If so what are the names?

You will need to contact the parents/ guardians and ask them to remove you from their child’s contact.

I hope that I was able to assist you

Sent from Mail for Windows

What? My email was taken seriously? Am I being trolled? Who is trolling who?

Wow.

So anyway, the other email I received from Trudy (I mean, aside from the reply I received for the first email) was this one about current attendance policies:

From
Trudy XXXXXXXXXXX <XXXXXXXXXXXXXXXXXXXXXX>

To
Sean Conner <sean.conner@gmail.com>, (and other addresses to the same Tennessee school, I checked)

Subject
School Wide Attendance/Chronic Absence IMPORTANT

Date
Mon, 20 Feb 2023 17:42:00 -0500

Greetings XXXXXXXXXX Family!

Last week we had two great days of overall school attendance, however two days is just not enough. Please make sure that you are sending your child to school every day and on time. Please review the attendance policy about excused absences. Moving forward, we will adhere strictly to the attendance policy when excusing absences.

Attendance and Excuses (Policy #6014) The XXXXXX County Board of Education believes that regular attendance is a necessary requirement of all students. All students are expected to attend school on each day that school is officially in session and remain at school for the entirety of the school day. Only the following reasons will be considered for excused absences:

Illness, injury, pregnancy, homebound circumstance, or hospitalization of student. The District may require a parent conference and/or physician verification to justify absences after the accumulation of ten (10) days of absence during a school year. Notes must be date specific and will be required for subsequent absences beyond ten (10) days.

Death or serious illness within the student's immediate family.

When the student is officially representing the school in a school sponsored activity or attendance at school-endorsed activities and verified college visits.

Special and recognized religious holidays regularly observed by persons of their faith. Any student who misses a class or day of school because of the observance of a day set aside as sacred by a recognized religious denomination of which the student is a member or adherent, where such religion calls for special observances of such day, shall have the absence from that school day or class excused and shall be entitled to make up any school work missed without the imposition of any penalty because of the absence.

A court order; a subpoena; and/or a legal court summons.

Extenuating circumstances over which the student has no control as approved by the principal.

If a student's parent, custodian or other person with legal custody or control of the student is a member of the United States Armed Forces, including a member of a state National Guard or a Reserve component called to federal active duty, the student's Principal shall give the student: a. An excused absence for one (1) day when the student's parent, custodian or other person with legal custody or control of the student is deployed; b. An additional excused absence for one (1) day when the student's parent, custodian or other person with legal custody or control of the student returns from deployment; and c. Excused absences for up to ten (10) days for visitation when the student's parent, custodian or other person with legal custody or control of the student is granted rest and recuperation leave and is stationed out of the country. d. Excused absences for up to ten (10) days cumulatively within the school year for visitation during the deployment cycle of the student's parent, custodian or other person with legal custody or control of the student. Total excused absences under this section (c) and (d) shall not exceed a total of ten (10) days within the school year. The student shall provide documentation to the school as proof of the deployment of the student's parent, custodian or other person with legal custody or control of the student.

Participation in a non-school-sponsored extracurricular activity. A school principal or the principal's designee may excuse a student from school attendance to participate in a non-school-sponsored extracurricular activity, if the following conditions are met: (1) The student provides documentation to the school as proof of the student's participation in the non-school-sponsored extracurricular activity; and (2) The student's parent, custodian, or other person with legal custody or control of the student, prior to the extracurricular activity, submits to the principal or the principal's designee a written request for the excused absence. The written request shall be submitted no later than seven (7) business days prior to the student's absence.

The written request shall include:

The student's full name and personal identification number;

The student's grade;

The dates of the student's absence;

The reason for the student's absence; and

The signature of both the student and the student's parent, custodian, or other person with legal custody or control of the student. The principal or the principal's designee shall approve, in writing, the student's participation in the non-school-sponsored extracurricular activity. The principal may limit the number and duration of non-school-sponsored extracurricular activities for which excused absences may be granted to a student during the school year; however, such the principal shall excuse no more than ten (10) absences each school year for students participating in non-school-sponsored extracurricular activities. Students receiving an excused absence under this section shall have the opportunity to make up school work missed and shall not have their class grades adversely affected for lack of class attendance or class participation due to the excused absence.

A written statement within two (2) school days of the student's return to school shall be required from the parent or guardian explaining the reason for each absence. If necessary, verification is required from an official source to justify absences. All absences other than those outlined above shall be considered unexcused.

[School Logo] Trudy XXXXXXXXXXXXX
School Counselor
XXXXXXXXXX Elementary
XXXXXXXXXXXXXXXXXXXXXX
Memphis, TN 38116
901-XXXXXXXX (School)
901-XXXXXXX (Fax)
XXXXXXXXXXXXXXXXXXXXXX
"Every child deserves to be a champion: an adult who will never give up on them, who understands the power of connection and insists they become the best they can possibly be"-Rita Pierson

Together, we MUST BELIEVE.

Together, we WILL ACHIEVE.

Together, we ARE REIMAGINING 901!

One thing stood out to me—the family name given in the greeting doesn't match my family name, nor does it match any of the family names of any of the other recipients on the email. Otherwise, much like the first email, I have to wonder why I'm receiving this. So I decided to reply to this one with a bit more sarcasm:

From
Sean Conner <sean.conner@gmail.com>

To
Trudy XXXXXXXXXXX <XXXXXXXXXXXXXXXXXXXXXX>

Subject
School Wide Attendance/Chronic Absence IMPORTANT

Date
Wed, 1 Mar 2023 14:55:00 -0500

I must say, it appears that excused absences got more lenient over the years since I was in school. Back when I was in school, excused absences were only allowed with a 10 day prior notice, or appropriate documentation from a doctor, law enforcement officer, or pardon from the governor. What is it with these weak policies towards absences? This is intolerable for my non-existent child in a school three states away!

Here, I expected Trudy to clue in—I mean, “pardon from the governor?” Who ever heard of such a policy for a school? But again, not to be outdone, I got this back from Trudy:

From
Trudy XXXXXXXXXXX <XXXXXXXXXXXXXXXXXXXXXX>

To
Sean Conner <sean.conner@gmail.com>

Subject
RE: EXTERNAL - Re: School Wide Attendance/Chronic Absence IMPORTANT

Date
Wed, 1 Mar 2023 14:57:00 -0500

Thank you for your concern, please make sure that you have withdrawn your child from XXXX.

Thank you

Sent from Mail for Windows

…

I … I have no words.

What have I gotten myself into?

Update on Thursday, March 2^nd, 2023

A few more emails exchanged, and Trudy and I have straightened things out.

Monday, March 06, 2023

Another attempt at a “unit test”

Or, “What is a ‘unit test,’ part III”

The reactions to my previous post were interesting—it wasn't a “unit test.” At best, it might have been an “integration test” but because it involved actual work (i.e. interaction with the outside world via nasty nasty side effects, aka I/O) it immediately disqualified it as a “unit test.” And to be honest, I was expecting that type of reaction—it appears to me that most unit test proponents tend to avoid such “entanglements” when writing their “battle tested” code (but I'm also willing to admit that's the cynical side of me talking). There were also comments about how 100% code coverage was “unrealistic.”

Sigh.

One respondent even quoted me out of context—“… that we as programmers are not trusted to write code without tests …” and cut the rest of the sentence: “… yet we're trusted to write a ton of code untested as long as such code is testing code.” Which was my cynical take that the “unit tests” (or the code that implements “unit tests”) are, themselves, not subjected to “unit tests.” Something I kept trying to impart to my former manager, “stop taking the unit tests as gospel! I don't even trust them!” (mainly because the business logic of the project was convoluted and marketing kept using different terms from engineering, at least engineering in my department).

But when I left off, I said there was one final function that should fit as a “unit,” and thus, perfect for “unit testing.” Again, it's from my blog engine and the function in question deals with parsing a request, like “2001/10/02.2-11/03.3” (which requests all blog posts starting from the second post on October 2^nd to the third post of November 3^rd, 2001), or “2001/11/04.2” (the second post from November 4^th, 2001).

The function tumbler_new() does no I/O (that is—no disk, network or console I/O), touches no global variables, only works with the data given to it and does some convoluted parsing of the input data—if this isn't a good candidate for “unit tests” then I don't know what is.
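To give a feel for the input, here's a much reduced sketch that only handles the single-entry form “year/month/day.part”. It is not tumbler_new() (no ranges, no files, no checks against the first and last entries), and entry__s and parse_entry() are names invented for this example. It also leans on sscanf(), so it is more forgiving about signs and leading spaces than a real parser should be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical type for this sketch only. */
typedef struct
{
  int year;
  int month;
  int day;
  int part;
} entry__s;

/* Parse "YYYY/MM/DD.P"; return false on malformed or out-of-range input. */
bool parse_entry(entry__s *e,char const *req)
{
  int used = 0;

  assert(e   != NULL);
  assert(req != NULL);

  /* %n records how much of the input was consumed */
  if (sscanf(req,"%4d/%2d/%2d.%d%n",&e->year,&e->month,&e->day,&e->part,&used) != 4)
    return false;
  if (req[used] != '\0')                 return false; /* trailing junk */
  if (e->year < 0)                       return false;
  if ((e->month < 1) || (e->month > 12)) return false;
  if ((e->day   < 1) || (e->day   > 31)) return false;
  if (e->part < 1)                       return false;
  return true;
}
```

Even this toy version has no I/O and no global state, which is what makes a function like this easy to poke at from a test.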

The tests were straightforward—a bunch of failure cases:

  tap_assert(!tumbler_new(&tumbler,"foo/12/04.1",&first,&last),"non-numeric year");
  tap_assert(!tumbler_new(&tumbler,"1999/foo/04.1",&first,&last),"non-numeric month");
  tap_assert(!tumbler_new(&tumbler,"1999/12/foo.1",&first,&last),"non-numeric day");
  tap_assert(!tumbler_new(&tumbler,"1999/12/04.foo",&first,&last),"non-numeric part");
  tap_assert(!tumbler_new(&tumbler,"1998",&first,&last),"before the start year");
  tap_assert(!tumbler_new(&tumbler,"1999/11",&first,&last),"before the start month");
  tap_assert(!tumbler_new(&tumbler,"1999/12/03",&first,&last),"before the start day");
  tap_assert(!tumbler_new(&tumbler,"1999/12/04.0",&first,&last),"part number of 0");
  tap_assert(!tumbler_new(&tumbler,"2023",&first,&last),"after the end year");
  tap_assert(!tumbler_new(&tumbler,"2022/11",&first,&last),"after the end month");
  tap_assert(!tumbler_new(&tumbler,"2022/10/07",&first,&last),"after the end day");
  tap_assert(!tumbler_new(&tumbler,"2022/10/06.21",&first,&last),"after the end part");
  tap_assert(!tumbler_new(&tumbler,"1999/00/04.1",&first,&last),"month of 0");
  tap_assert(!tumbler_new(&tumbler,"1999/13/04.1",&first,&last),"month of 13");
  tap_assert(!tumbler_new(&tumbler,"1999/12/00.1",&first,&last),"day of 0");
  tap_assert(!tumbler_new(&tumbler,"1999/12/32.1",&first,&last),"day of 32");
  tap_assert(!tumbler_new(&tumbler,"1999/12/04.0",&first,&last),"part of 0");
  tap_assert(!tumbler_new(&tumbler,"1999/12/04.24",&first,&last),"part of 24");
  tap_assert(!tumbler_new(&tumbler,"2010/07/01-04/boom.jpg",&first,&last),"file with range");
  tap_assert(!tumbler_new(&tumbler,"2010/7/1-4/boom.jpg",&first,&last),"file with redirectable range");

Plus a bunch of tests that should pass:

  test("first entry","1999/12/04.1",&(tumbler__s) {
    .start    = { .year = 1999 , .month = 12 , .day = 4 , .part = 1 },
    .stop     = { .year = 1999 , .month = 12 , .day = 4 , .part = 1 },
    .ustart   = UNIT_PART,
    .ustop    = UNIT_PART,
    .segments = 0,
    .file     = false,
    .redirect = false,
    .range    = false,
    .filename = ""
  });
  
  test("some mid entry","2010/07/04.15",&(tumbler__s) {
    .start    = { .year = 2010 , .month = 7 , .day = 4 , .part = 15 },
    .stop     = { .year = 2010 , .month = 7 , .day = 4 , .part = 15 },
    .ustart   = UNIT_PART,
    .ustop    = UNIT_PART,
    .segments = 0,
    .file     = false,
    .redirect = false,
    .range    = false,
    .filename = ""
  });
  
  test("last entry","2022/10/06.20",&(tumbler__s) {
    .start    = { .year = 2022 , .month = 10 , .day = 6 , .part = 20 },
    .stop     = { .year = 2022 , .month = 10 , .day = 6 , .part = 20 },
    .ustart   = UNIT_PART,
    .ustop    = UNIT_PART,
    .segments = 0,
    .file     = false,
    .redirect = false,
    .range    = false,
    .filename = ""
  });
  
  test("requesting a file","2010/07/04/boom.jpg",&(tumbler__s) {
    .start    = { .year = 2010 , .month = 7 , .day = 4 , .part =  1 },
    .stop     = { .year = 2010 , .month = 7 , .day = 4 , .part = 23 },
    .ustart   = UNIT_DAY,
    .ustop    = UNIT_DAY,
    .segments = 0,
    .file     = true,
    .redirect = false,
    .range    = false,
    .filename = "boom.jpg",
  });

  /* ... other tests ... */

With this function checking the results:

static void test(char const *tag,char const *tum,tumbler__s const *result)
{
  tumbler__s  tumbler;
  
  assert(tag    != NULL);
  assert(tum    != NULL);
  assert(result != NULL);

  tap_plan(10,"%s: %s",tag,tum);
  tap_assert(tumbler_new(&tumbler,tum,&first,&last),"create");
  tap_assert(btm_cmp(&tumbler.start,&result->start) == 0,"start date");
  tap_assert(btm_cmp(&tumbler.stop,&result->stop) == 0,"stop date");
  tap_assert(tumbler.ustart == result->ustart,"segment of start");
  tap_assert(tumbler.ustop == result->ustop,"segment of stop");
  tap_assert(tumbler.segments == result->segments,"number of segments");
  tap_assert(tumbler.file == result->file,"file flag");
  tap_assert(tumbler.redirect == result->redirect,"redirect flag");
  tap_assert(tumbler.range == result->range,"range flag");
  tap_assert(strcmp(tumbler.filename,result->filename) == 0,"file name");
  tap_done();
}

I ended up with a total of 328 tests and of the three attempts I made, this one feels like the only one that was worth the effort—it's a moderately long function [Moderately long? It's 450 lines long! —Editor] [But it does one thing, and one thing well—it parses a request! —Sean] [450 lines! —Editor] [I'd like to see you write it then! —Sean] [… Okay, I'll shut up now. —Editor] that implements some tricky logic and deals with some weird edge cases. If I ever go back to rework this code (and I've only revised this code once in the 23 years it's been used, way back in 2015—it was a full rewrite of the function) the tests could be useful (if I'm honest with myself, and the API/structure doesn't change). ~~And from looking over the test cases, I can see that I could get rid of .segments from the structure, so there's that.~~ (Seems I was wrong: the .segments field is needed for tumbler_canonical().)

Overall, I'm still not entirely sure about this “unit test” stuff, especially since “unit” doesn't have a well defined meaning. In my opinion, I think it works best for functions that do pure processing (no interaction with the outside world) and that implement some complex logic, like parsing or business logic. Back when I was at The Enterprise, had we but one function (or entry point) that just implemented all the business logic from data gathered from the request, it would have made testing so much easier. But “unit tests” for all functions? Or modules? Or whatever the XXXX a “unit” is? No. Not for obvious code, or for code that interacts with external systems (unless required because human lives are on the line). I'm not saying no to tests entirely, but to the slavish adherence to testing for its own sake.

Or maybe, instead of having AI write code for us, have it write the test cases for us, instead of the future I'm cynically seeing—where we write the test cases for AI written code.

Discussions about this entry

Another attempt at a “unit test” | Lobsters

Tuesday, March 07, 2023

So frustrated that I have no one to scream at, which may be the point why it's so hard to get ahold of customer representatives at tech companies these days

I'm so frustrated right now.

Bunny can't receive email, and we have no idea why that is. All we get is that there have been too many attempts to log into her account and she needs to reset the password. Now, her account is with bellsouth.net, which is now owned by AT&T, but email for customers is handled by Yahoo. Trying to track down a human being to talk to is a Herculean effort these days, and even if we get ahold of someone, can they even help? Forget the left hand not knowing what the right hand is doing, it appears these days that the left hand doesn't even know it has fingers!

Going to the AT&T login page mentions something about currently.com, but going to currently.com goes to yahoo.com (which, to be fair, is handling email for bellsouth.net). Man, is this confusing. We go through the process of changing the password, only to get redirected back to the login page, which fails because the password we set seems to be incorrect, and thus, we get more and more failed logins until we're forced to change the password yet again.

And the cycle continues.

Doing a search shows that we aren't alone in this, and that this issue has been an ongoing problem for several years now with no solution in sight.

Does anybody at AT&T or Yahoo know what's going on?

To add further fuel, I can't find the current server settings in the Apple Mail application, adding to my frustration.

Eventually, I gave up and set her up with Fastmail. It's not as big as Google, and it's a pay service, so there's a decent chance that talking to a human is possible. Also, it was nearly painless to set up. I say “nearly painless” because it did take me several minutes to figure out the DNS settings I needed to give Bunny her own sub-domain in conman.org (because if she's going through the pain of changing an email address from one she's used for years, I'd rather she use a domain not in the control of the email company in the off chance we need to switch providers so she can keep the same email address, and I don't want to hand off my own email to Fastmail since I'm comfortable using mutt directly on my server). Then there was one file to download that installed information on her Mac laptop that informed the Apple Mail application of the Fastmail settings and that was that. She was good to go.

It just sucks that she has to change her email address.

Monday, March 13, 2023

“We couldn't pause TV and we couldn't just fast forward through the commercials! Gather around kids, and I shall tell more horror tales of the past …”

Over a week ago, I had to order some checks from my bank due to a new recurring expense. The last check I wrote from my checkbook was from sometime in 2015. Before that, one check from 2014, and one from 2013.

Needless to say, I haven't had a need for a checkbook in years.

So the new checkbook arrived (two of them, actually). I recall in the past, they would arrive in the mail in a box sized large enough to hold the checks. But today, they arrived in an unusual format—a flat package about 6½ × 10 × ½ inches (165 × 254 × 13 mm). Upon opening it, it looked like a book, and stuck to the inside cover were two checkbooks. I pulled them off, and beneath one of them were instructions on how to write a check (with a link to an instructional video)!

I suddenly feel old.

Notes on optimizing an O(n)+C algorithm where the C matters quite a bit

[Note: all hexadecimal values are preceded with a “$”, which is old-school. I know these days it's hip to use “0x” to designate a hexadecimal value, but I hope this is just a passing fad. Also, the “K” here stands for “kilobyte,” which is 1024 bytes. This too, is old-fashioned but I don't want to use a word that sounds like pet food. Geeze, kids these days!] [Here's an onion—put it on your belt, you geezer! —The kids] [Hey! I resemble that remark! Get off my lawn! —Sean]

I was doing a bit of retro computing over the weekend, writing 6809 code and running it on a Color Computer emulator (because the Color Computer was my first computer and the 6809 is a criminally underrated 8-bit CPU in my opinion). Part of the coding was enabling all 64K of RAM in the machine. Normally, the Color Computer only sees the lower 32K of RAM, with the upper 32K being ROM (the BASIC interpreter). To enable all 64K of RAM, all that's needed is to stuff any value into memory location $FFDF, which can be done with “POKE &HFFDF,0”. The problem with that is once the ROM goes away, so does BASIC, and the CPU starts executing Lord knows what since the RAM isn't initialized. So the actual procedure is to copy the ROM contents into RAM, which is simple enough:

	orcc	#$50	; disable interrupts
	ldx	#$8000	; start of ROM
loop	sta	$FFDE	; enable ROM mapping
	lda	,x	; read byte
	sta	$FFDF	; enable RAM
	sta	,x+	; write byte back, increment pointer
	cmpx	#$FF00	; are we done?
	bne	loop
	andcc	#$AF	; enable interrupts

We don't actually want to copy the full 32K of ROM, since the upper 256 bytes of the memory map are for I/O devices. We also disable interrupts since the default interrupt handlers are in ROM and if an interrupt happens when we have RAM mapped, there may not be an interrupt handler to handle it.

The code is straightforward, simple, and unfortunately, slow. Here's the main loop again, this time with the number of cycles and bytes each instruction takes:

			; cycles	bytes
loop1	sta	$FFDE	; 5		3
	lda	,x	; 4		2
	sta	$FFDF	; 5		3
	sta	,x+	; 6		2
	cmpx	#$FF00	; 4		3
	bne	loop1	; 3		2

The loop is 15 bytes, taking 27 cycles for each iteration. Since we copy one byte per iteration and we have 32,512 iterations, it takes 877,824 cycles to run (there are no pipelines or caches to worry about, so this will always take 877,824 cycles to run). Given the Color Computer runs at .89 MHz (yes, it's not a fast computer) this is nearly a second to copy just under 32K of ROM into RAM.

Can we do better?

Well, yes. Should we is another matter. I mean, this code is typically run once, so does it matter if it takes a second to run? Meh. It'll take longer to load the code from disk, but hey, I'm doing recreational retro programming and want to have a bit of fun.

The first and easiest optimization is to read and write 16 bits at a time instead of 8. And that's easy enough to do—the 6809 does have a 16-bit accumulator so it's an easy change:

			; cycles	bytes
loop2	sta	$FFDE	; 5		3
	ldd	,x	; 5		2
	sta	$FFDF	; 5		3
	std	,x++	; 8		2
	cmpx	#$FF00	; 4		3
	bne	loop2	; 3		2

The loop now takes 30 cycles per iteration, but it's doing 16 bits per iteration instead of 8. The code now takes 487,680 cycles, which is almost half the time, and we've cut the cycles per byte to 15 from 27, and the size of the code hasn't changed. Not bad for a simple optimization.

But doing 4 bytes per iteration should be better, right? It'll take another register (which we have) and some additional bytes, but yes, it is better:

			; cycles	bytes
loop3	sta	$FFDE	; 5		3
	ldd	,x	; 5		2
	ldu	2,x	; 6		2
	sta	$FFDF	; 5		3
	std	,x++	; 8		2
	stu	,x++	; 8		2
	cmpx	#$FF00	; 4		3
	bne	loop3	; 3		2

Each iteration now takes 44 cycles, but we're moving 4 bytes per iteration, giving us 11 cycles per byte, and it takes a total of 357,632 cycles. Again an improvement but we're starting to hit diminishing returns. Doing 6 bytes per iteration (still easy to add) takes us down to 9.833 cycles per byte, and 8 bytes per iteration takes us down to 9.24 cycles per byte, but we require the use of the S register (the system stack pointer, which needs to be saved and restored) to do so. It's not worth trying to use the last register available to us (DP) because you can't load it directly. The loop also increases in size, from 19 bytes (shown above) to 23 bytes for the 8-bytes-per-iteration version. Also, the upper address will have to change for the 6 byte version, since 6 doesn't evenly divide 32,512. It's not a show stopper, since not all the memory in the upper 32K contains useful code, so a few bytes not copied won't crash BASIC.
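As a rough sketch of that address adjustment (in C rather than 6809 assembly, and with a made-up function name, copy_end), the end address can be rounded down so that the number of bytes copied is a whole number of chunks:

```c
#include <assert.h>

/* Hypothetical helper: return the largest end address <= top such that
   (end - start) is an exact multiple of the chunk size. */
unsigned copy_end(unsigned start,unsigned top,unsigned chunk)
{
  assert(chunk > 0);
  assert(top   >= start);

  /* drop whatever doesn't fit into a whole chunk */
  return top - ((top - start) % chunk);
}
```

For 6-byte chunks between $8000 and $FF00 this gives $FEFC, so the last four bytes below $FF00 are the ones left uncopied; for 2, 4 or 8 byte chunks the end address stays at $FF00.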

But we can still do better!

Taking inspiration from “A Great Old-Timey Game-Programming Hack” we can copy 8 bytes per iteration (first, the commented code, then the code with the cycles/bytes columns):

	orcc	#$50	; disable interrupts
	sts	savesp	; we need the S register
	lds	#$FF00-8; and because we're using the stack,
			; we need to start at the top of ROM
			; and work our way down
loop4	sta	$FFDE	; enable ROM mapping
	puls	u,x,y,d	; pull 4 2-byte registers from memory (read memory)
	sta	$FFDF	; enable RAM
	pshs	u,x,y,d	; push 4 2-byte registers to memory (write memory)
	leas	-8,s	; point to next 8-byte block to transfer
	cmps	#$8000-8; are we done?
	bne	loop4
	lds	savesp	; restore S register
	andcc	#$AF	; enable interrupts

The PULS instruction pulls the listed registers off the stack, and the PSHS instruction pushes the listed registers onto the stack. The order of registers in the instructions doesn't matter; they get pushed and pulled such that it all works out.
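To see why the data lands back at the same addresses, here is a small C model of that loop. It is only a sketch, not emulator code: the two arrays stand in for the ROM and RAM banks selected by the writes to $FFDE and $FFDF, an 8-byte buffer stands in for the D, X, Y and U registers, and the names (stack_copy, SC_ROM_START and so on) are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum
{
  SC_ROM_START = 0x8000, /* first address to copy            */
  SC_ROM_END   = 0xFF00, /* one past the last address copied */
  SC_CHUNK     = 8       /* D, X, Y and U: four 2-byte regs  */
};

/* Both arrays model addresses SC_ROM_START..SC_ROM_END-1,
   so array index = address - SC_ROM_START. */
void stack_copy(unsigned char *ram,unsigned char const *rom)
{
  unsigned char regs[SC_CHUNK];
  unsigned      s = SC_ROM_END - SC_CHUNK;          /* lds  #$FF00-8 */

  assert(ram != NULL);
  assert(rom != NULL);

  do
  {
    memcpy(regs,&rom[s - SC_ROM_START],SC_CHUNK);   /* puls: read 8 bytes ... */
    s += SC_CHUNK;                                  /* ... and S moves up 8   */
    s -= SC_CHUNK;                                  /* pshs: S moves down 8 ... */
    memcpy(&ram[s - SC_ROM_START],regs,SC_CHUNK);   /* ... then write 8 bytes   */
    s -= SC_CHUNK;                                  /* leas -8,s */
  } while (s != SC_ROM_START - SC_CHUNK);           /* cmps #$8000-8 */
}
```

The pull moves S up by eight and the push moves it right back, which is why the extra LEAS is needed to step down to the next block.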

			; cycles	bytes
loop4	sta	$FFDE	; 5		3
	puls	u,x,y,d	; 13		2
	sta	$FFDF	; 5		3
	pshs	u,x,y,d	; 13		2
	leas	-8,s	; 5		2
	cmps	#$8000-8; 5		4
	bne	loop4	; 3		2

The main loop is now 18 bytes, so it's on par with the third version. Each iteration now takes 49 cycles, but given it moves 8 bytes per iteration, we get an effective rate of 6.125 cycles per byte, and a total time of 199,136 cycles. We could do nine bytes per iteration (as the PSHS and PULS instructions support the DP register) to get us down to 5.666 cycles per byte (184,235 cycles total). We could also add in the CC register for a total of 10 bytes per iteration (5.3 cycles per byte; 172,314 cycles total) but we run the risk of enabling interrupts at the wrong time, unless we disable interrupts at the hardware level (which we can do, but it's more code and more invasive of system state). You can see we'd be getting into diminishing returns again, so I'm happy with just doing 8 bytes per iteration.

Summary of results of copying 32,512 bytes from ROM to RAM
loop	cycles/iteration	cycles/byte	cycles total	bytes/iteration	code size
loop1	27	27.000	877,824	1	15
loop2	30	15.000	487,680	2	15
loop3	44	11.000	357,632	4	19
loop4	49	6.125	199,136	8	18
(theoretical)	53	5.300	172,314	10	18
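The numbers in the table all fall out of the same arithmetic: 32,512 bytes, divided by the bytes moved per iteration, times the cycles per iteration. A minimal C sketch of that calculation follows; the function names are made up for illustration, and the .89 MHz clock rate is the rounded figure used above.

```c
#include <assert.h>

/* Bytes copied: $8000 up to (but not including) $FF00, or 32,512 */
#define COPY_BYTES (0xFF00UL - 0x8000UL)

/* Total cycles for the whole copy, rounded to the nearest cycle when
   bytes_per_iter does not divide COPY_BYTES evenly. */
unsigned long total_cycles(unsigned cycles_per_iter,unsigned bytes_per_iter)
{
  assert(bytes_per_iter > 0);
  return (COPY_BYTES * cycles_per_iter + bytes_per_iter / 2) / bytes_per_iter;
}

/* Average cost of moving a single byte */
double cycles_per_byte(unsigned cycles_per_iter,unsigned bytes_per_iter)
{
  assert(bytes_per_iter > 0);
  return (double)cycles_per_iter / bytes_per_iter;
}

/* Wall-clock time for a cycle count at a clock rate given in Hz */
double seconds_at(unsigned long cycles,double hz)
{
  assert(hz > 0.0);
  return cycles / hz;
}
```

For example, total_cycles(49,8) is 199,136 and cycles_per_byte(49,8) is 6.125, matching the loop4 row, while seconds_at(877824,890000.0) comes out a little under one second for loop1.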

Is this important these days? Not really. Is it fun? Yes, it certainly beats Enterprise Agile any day of the week.

Wednesday, March 22, 2023

Preloading Lua modules, part III

I received an email from Andy Weidenbaum today, thanking me for writing a post about embedding Lua code into an executable which helped him in his project. On the plus side, it's nice that a post of mine was able to help him. On the non-plus side, I wrote that post ten years ago tomorrow!

Geeze, where does the time go?

I replied to him saying that I have since updated the code (and sent him some copies) but I think I should at least mention it here on the blog. The first major change is populating the Lua array package.preload with the C-based Lua modules (which is probably the intent of that array). The second major change was compressing the Lua-based Lua modules using zlib to save space (and the decompression time on modern systems is near negligible). I accomplish this via a custom tool and the following rule in the makefile:

BIN2C = bin2c

%.o : %.lua
	$(BIN2C) -9 -o $*.l.c -t $(NS)$(*F) $<
	$(CC) $(CFLAGS) -c -o $@ $*.l.c
	$(RM) $*.l.c

This defines an implicit rule to convert a .lua file into a .o (object) file that can be linked. The first line will read the Lua source file, compress it (via the -9 option for heaviest compression) and convert it to a C file. The C file is then compiled into an object file, and the C file is then removed as it's no longer needed.
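The bin2c tool itself isn't shown here, so the following is only a guess at the general shape of its output, inferred from the c_port70 and c_port70_size style names used later in main(). This hypothetical emit_c_array() skips the zlib compression step entirely and just formats a byte buffer as a C array plus a matching size object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a bin2c-style tool (NOT the actual bin2c):
   write `data` into `out` as C source defining `tag[]` and `tag_size`.
   Returns 0 on success, -1 if `out` is too small. */
int emit_c_array(
        char                *out,
        size_t               outsize,
        char          const *tag,
        unsigned char const *data,
        size_t               len
)
{
  size_t used;
  int    n;

  assert(out  != NULL);
  assert(tag  != NULL);
  assert(data != NULL);
  assert(len  >  0);

  n = snprintf(out,outsize,"#include <stddef.h>\n\nunsigned char const %s[] =\n{\n",tag);
  if ((n < 0) || ((size_t)n >= outsize)) return -1;
  used = (size_t)n;

  /* one byte per line keeps the sketch simple */
  for (size_t i = 0 ; i < len ; i++)
  {
    n = snprintf(out + used,outsize - used,"  0x%02X,\n",(unsigned)data[i]);
    if ((n < 0) || ((size_t)n >= outsize - used)) return -1;
    used += (size_t)n;
  }

  n = snprintf(out + used,outsize - used,"};\n\nsize_t const %s_size = %zu;\n",tag,len);
  if ((n < 0) || ((size_t)n >= outsize - used)) return -1;
  return 0;
}
```

The generated text can then be compiled like any other C file, which is all the second line of the makefile rule does.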

The third major change is that the routine now expects to initialize the entire Lua state, set up the embedded C and Lua based modules, then run the Lua “application” by name (I went from making a Kitchen Sink Lua interpreter to stand-alone apps mainly written in Lua).

There are other minor changes, but I think at this point it's best to show some code:

#include <stdlib.h>
#include <string.h>
#include <assert.h>

#include <zlib.h>

#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>

#if LUA_VERSION_NUM == 501
#  define lua_rawlen(L,idx)       lua_objlen((L),(idx))
#  define luaL_setfuncs(L,reg,up) luaI_openlib((L),NULL,(reg),(up))
#  define SEARCHERS               "loaders"
#  define lua_load(L,f,d,k,x)     (lua_load)((L),(f),(d),(k))
#else
#  define SEARCHERS               "searchers"
#endif

/**************************************************************************/

typedef struct prelua_reg
{
  char          const *const name;
  unsigned char const *const code;
  size_t        const *const size;
} prelua_reg__s;

struct zlib_data
{
  char const *name;
  z_stream    sin;
  Bytef       buffer[LUAL_BUFFERSIZE];
};

/**************************************************************************/

static char const *preloadlua_reader(lua_State *L,void *ud,size_t *size)
{
  assert(L    != NULL);
  assert(ud   != NULL);
  assert(size != NULL);
  
  struct zlib_data *data = ud;
  (void)L;
  
  data->sin.next_out  = data->buffer;
  data->sin.avail_out = sizeof(data->buffer);
  
  inflate(&data->sin,Z_SYNC_FLUSH);
  
  *size = sizeof(data->buffer) - data->sin.avail_out;
  return (char const *)data->buffer;
}

/*************************************************************************/

static int preload_lualoader(lua_State *const L)
{
  assert(L != NULL);
  
  char          const *key = luaL_checkstring(L,1);
  prelua_reg__s const *target;
  
  lua_getfield(L,LUA_REGISTRYINDEX,"org.conman:prelua");
  lua_getfield(L,-1,key);
  target = lua_touserdata(L,-1);
  
  if (target == NULL)
  {
    lua_pushfstring(L,"\n\tno precompiled module '%s'",key);
    return 1;
  }
  else
  {
    struct zlib_data data;
    
    data.sin.zalloc   = Z_NULL;
    data.sin.zfree    = Z_NULL;
    data.sin.opaque   = Z_NULL;
    data.sin.next_in  = (Byte *)target->code;
    data.sin.avail_in = *target->size;
    
    inflateInit(&data.sin);
    lua_load(L,preloadlua_reader,&data,key,NULL);
    inflateEnd(&data.sin);
    lua_pushliteral(L,":preload:");
    return 2;
  }
}

/***********************************************************************/

int exec_lua_app(
        lua_State           *L,
        char          const *modname,
        luaL_Reg      const *preload,
        prelua_reg__s const *prelua,
        int                  argc,
        char               **argv
)
{
  int preluasize;
  
  assert(L       != NULL);
  assert(modname != NULL);
  assert(preload != NULL);
  assert(prelua  != NULL);
  assert(argc    >  0);
  assert(argv    != NULL);
  
  for (preluasize = 0 ; prelua[preluasize].name != NULL ; preluasize++)
    ;
    
  lua_createtable(L,0,preluasize);
  
  for (int i = 0 ; i < preluasize ; i++)
  {
    lua_pushlightuserdata(L,(void *)&prelua[i]);
    lua_setfield(L,-2,prelua[i].name);
  }
  
  lua_setfield(L,LUA_REGISTRYINDEX,"org.conman:prelua");
  luaL_openlibs(L);
  lua_getglobal(L,"package");
  lua_getfield(L,-1,"preload");
  luaL_setfuncs(L,preload,0);
  lua_getglobal(L,"package");
  lua_getfield(L,-1,SEARCHERS);
  
  for (lua_Integer i = lua_rawlen(L,-1) + 1 ; i > 1 ; i--)
  {
    lua_rawgeti(L,-1,i - 1);
    lua_rawseti(L,-2,i);
  }
  
  lua_pushinteger(L,2);
  lua_pushcfunction(L,preload_lualoader);
  lua_settable(L,-3);
  lua_pop(L,4);
  
  lua_createtable(L,argc,0);
  for (int i = 0 ; i < argc ; i++)
  {
    lua_pushinteger(L,i);
    lua_pushstring(L,argv[i]);
    lua_settable(L,-3);
  }
  lua_setglobal(L,"arg");
  
  lua_pushcfunction(L,preload_lualoader);
  lua_pushstring(L,modname);
  lua_call(L,1,1);

  return lua_pcall(L,0,0,0);
}
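The loop over package.searchers in exec_lua_app() is the least obvious part: it shifts the existing searchers up by one slot so that preload_lualoader can go in at index 2, right after the stock preload searcher. Here is the same shuffle modeled on a plain C array; insert_second() and the searcher labels in the usage note are made up for illustration, and slot 0 is left unused so the indexes match Lua's 1-based arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Insert `item` at 1-based index 2 of list[1..*len].  The caller must
   provide room for list[*len + 1].  The final pass of the loop copies
   slot 1 into slot 2, which is harmless as slot 2 is then overwritten
   (the Lua C API version does the same). */
void insert_second(char const **list,size_t *len,char const *item)
{
  assert(list != NULL);
  assert(len  != NULL);
  assert(*len >= 1);

  for (size_t i = *len + 1 ; i > 1 ; i--)
    list[i] = list[i - 1];

  list[2] = item;
  (*len)++;
}
```

Starting from the four stock searchers (preload, Lua file, C library, all-in-one), the result is preload, embedded, Lua file, C library, all-in-one, so embedded modules are found before anything on disk is searched.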

And to show how it's used, I'll use my gopher server as an example:

int main(int argc,char *argv[])
{
  static luaL_Reg const c_preload[] =
  {
    { "lpeg"                  , luaopen_lpeg                  } ,
    { "org.conman.clock"      , luaopen_org_conman_clock      } ,
    { "org.conman.errno"      , luaopen_org_conman_errno      } ,
    { "org.conman.fsys"       , luaopen_org_conman_fsys       } ,
    { "org.conman.fsys.magic" , luaopen_org_conman_fsys_magic } ,
    { "org.conman.math"       , luaopen_org_conman_math       } ,
    { "org.conman.net"        , luaopen_org_conman_net        } ,
    { "org.conman.pollset"    , luaopen_org_conman_pollset    } ,
    { "org.conman.process"    , luaopen_org_conman_process    } ,
    { "org.conman.signal"     , luaopen_org_conman_signal     } ,
    { "org.conman.syslog"     , luaopen_org_conman_syslog     } ,
    { "port70.getuserdir"     , luaopen_port70_getuserdir     } ,
    { "port70.setugid"        , luaopen_port70_setugid        } ,
    { NULL                    , NULL                          }
  };

  static prelua_reg__s const c_prelua[] =
  {
    { "org.conman.const.exit"            , c_org_conman_const_exit            , &c_org_conman_const_exit_size            } ,
    { "org.conman.const.gopher-types"    , c_org_conman_const_gopher_types    , &c_org_conman_const_gopher_types_size    } ,
    { "org.conman.net.ios"               , c_org_conman_net_ios               , &c_org_conman_net_ios_size               } ,
    { "org.conman.nfl"                   , c_org_conman_nfl                   , &c_org_conman_nfl_size                   } ,
    { "org.conman.nfl.tcp"               , c_org_conman_nfl_tcp               , &c_org_conman_nfl_tcp_size               } ,
    { "org.conman.parsers.abnf"          , c_org_conman_parsers_abnf          , &c_org_conman_parsers_abnf_size          } ,
    { "org.conman.parsers.ascii.char"    , c_org_conman_parsers_ascii_char    , &c_org_conman_parsers_ascii_char_size    } ,
    { "org.conman.parsers.ascii.control" , c_org_conman_parsers_ascii_control , &c_org_conman_parsers_ascii_control_size } ,
    { "org.conman.parsers.ip-text"       , c_org_conman_parsers_ip_text       , &c_org_conman_parsers_ip_text_size       } ,
    { "org.conman.parsers.iso.char"      , c_org_conman_parsers_iso_char      , &c_org_conman_parsers_iso_char_size      } ,
    { "org.conman.parsers.iso.control"   , c_org_conman_parsers_iso_control   , &c_org_conman_parsers_iso_control_size   } ,
    { "org.conman.parsers.mimetype"      , c_org_conman_parsers_mimetype      , &c_org_conman_parsers_mimetype_size      } ,
    { "org.conman.parsers.url"           , c_org_conman_parsers_url           , &c_org_conman_parsers_url_size           } ,
    { "org.conman.parsers.url.gopher"    , c_org_conman_parsers_url_gopher    , &c_org_conman_parsers_url_gopher_size    } ,
    { "org.conman.parsers.utf8.char"     , c_org_conman_parsers_utf8_char     , &c_org_conman_parsers_utf8_char_size     } ,
    { "org.conman.parsers.utf8.control"  , c_org_conman_parsers_utf8_control  , &c_org_conman_parsers_utf8_control_size  } ,
    { "port70"                           , c_port70                           , &c_port70_size                           } ,
    { "port70.cgi"                       , c_port70_cgi                       , &c_port70_cgi_size                       } ,
    { "port70.handlers.content"          , c_port70_handlers_content          , &c_port70_handlers_content_size          } ,
    { "port70.handlers.file"             , c_port70_handlers_file             , &c_port70_handlers_file_size             } ,
    { "port70.handlers.filesystem"       , c_port70_handlers_filesystem       , &c_port70_handlers_filesystem_size       } ,
    { "port70.handlers.http"             , c_port70_handlers_http             , &c_port70_handlers_http_size             } ,
    { "port70.handlers.sample"           , c_port70_handlers_sample           , &c_port70_handlers_sample_size           } ,
    { "port70.handlers.url"              , c_port70_handlers_url              , &c_port70_handlers_url_size              } ,
    { "port70.handlers.userdir"          , c_port70_handlers_userdir          , &c_port70_handlers_userdir_size          } ,
    { "port70.mklink"                    , c_port70_mklink                    , &c_port70_mklink_size                    } ,
    { "port70.readfile"                  , c_port70_readfile                  , &c_port70_readfile_size                  } ,
    { "port70.safetext"                  , c_port70_safetext                  , &c_port70_safetext_size                  } ,
    { "re"                               , c_re                               , &c_re_size                               } ,
    { NULL                               , NULL                               , NULL                                     }
  };
  
  lua_State *L = luaL_newstate();
  if (L == NULL)
  {
    fprintf(stderr,"Cannot create Lua state\n");
    return EXIT_FAILURE;
  }
  
  int rc = exec_lua_app(L,"port70",c_preload,c_prelua,argc,argv);
  if (rc != LUA_OK)
    fprintf(stderr,"lua_pcall() = %s\n",lua_tostring(L,-1));
    
  lua_close(L);
  return rc == LUA_OK ? EXIT_SUCCESS : EXIT_FAILURE;
}

I haven't yet written code to automagically write the main() routine, but for a small application like port70 it wasn't that bad. There are several approaches to automating this, like overriding require() with some custom code to record loaded modules, or building the list from LuaRocks, but that's a project for another day.

Monday, May 01, 2023

The Case of the Inconsistent Consistent Chirp

Bunny and I were plagued with the most insidious inconsistently consistent chirp over the past few days here at Chez Boca. There would be this distinct chirp. Just one. And by the time you think it won't happen again, it would happen again. And then … nothing. For hours. Or maybe the rest of the day even. But sure enough, it would pick up again—a single chirp, then silence, then maybe another chirp, repeat for a few minutes then, nothing more for hours.

When it first started, I thought maybe one of the UPSes was responding to some power fluctuation, but no, they squeal quite loudly, and none of them showed any form of distress when I checked. This was more of a short chirp than a loud squeal. And by the time I was tired of looking at whatever UPS I thought it might be and turned away, there was another chirp.

It was mocking us.

Between the two of us, we had narrowed down the possible source in Chez Boca, somewhere along the west wall of the house. The only things in the area that could possibly chirp were:

a large vertical floor fan;
a UPS for the TV system;
the TV itself;
the DVR;
the DVD player;
the small network router for the DVR;
a floor lamp.

But all these devices had been there for years before this chirping had started. It was as weird as it was maddening.

I even went so far as to check the bathroom, as, from where I sit in the Computer Room, the chirp could be coming from there. The only three things in the bathroom that could possibly chirp: the lights, Bunny's electric toothbrush and a small clock.

I discounted the lights—they're the original fixtures from the 70s—no strange electronics in there, and more importantly, no speakers to speak of. I did unplug the base unit of Bunny's electric toothbrush, and had the toothbrush itself in the Computer Room. The chirp didn't go away, and it wasn't from the toothbrush. Nor was it from the small electric clock (I too, brought that in to the Computer Room and cleared it as a suspect).

Then, late Saturday night, I was in the family room along with Bunny when it happened again. We were standing far enough apart that it appeared to be the floor lamp just by simple triangulation. During an examination of the lamp, I happened to glance up, and there, above the door to the Computer Room, was a small, round, disk-shaped device stuck to the wall: a smoke detector.

Bingo!

Taking the unit down and reading the back, yes, it would chirp to indicate the battery needed changing. And neither Bunny nor I could recall when the battery in the unit was changed.

Heck, we both forgot about the unit being there at all.

Even worse, when I started telling this story to some friends at our regularly scheduled D&D game, they knew the punchline even before I finished. Sigh.

Now I just have to figure out why our ice maker is making hollow ice.

Wednesday, May 10, 2023

Proportional fonts for coding? No thank you

There's some back and forth in the Gemini community about coding with a proportional font. You can pry my monospace font from my cold dead hands.

I've been coding for nearly 40 years now, and it's always been some form of a monospace font, some pretty, like the character set for VGA on IBM PCs, and some not too pretty, like the character set on the TRS-80 Color Computer. Code in a proportional font just looks weird to me.

My first language was BASIC on the TRS-80 Color Computer, and due to limitations on the video screen and memory constraints, pretty much any non-trivial BASIC program ends up looking something like:

1445 X=FREE(PEEK(4670)):Y=FREE(P
EEK(4671)):IF X<2ORY<2 THEN PRIN
T"MESSAGE BASE FULL!":RETURN :EL
SE IFML>0THEN P$="10000000":GOTO
1450:ELSEIFPF=0THEN P$="00000000
":GOTO1450:ELSEPRINT"MESSAGE PRI
VATE (Y/N)? ";:GOSUB625
1446 IFCH$="Y"THEN P$="10000000"
:PRINT"YES":ELSEIFCH$="N"THEN P$
="00000000":PRINT"NO":ELSEGOSUB6
25:GOTO1446
1450 K=LEN(MF$)+LEN(MT$)+LEN(MS$
)+2:IFK>64THENPRINT"SUBJECT TOO 
LONG":PRINT"LIMIT TO ";64-LEN(MF
$)-LEN(MT$)-2:PRINT"TRUNICATING.
." :ELSE 1452
1451 IFLEFT$(MS$,5)="REPLY"THEN 
MS$=RIGHT$(MS$,LEN(MS$)-(K-64)) 
:ELSE MS$=LEFT$(MS$,LEN(MS$)-(K-
64)):GOTO1450
1452 GOSUB25:PRINT:PRINT:PRINTTA
B(5)"FROM: ";MF$:PRINTTAB(5)"  T
O: ";MT$:PRINTTAB(5)"SUBJ: ";MS$
1453 IFP$="10000000"THENPRINTTAB
(5)"PRIVATE MESSAGE":ELSEPRINTTA
B(5)"PUBLIC MESSAGE"
1455 IF ML=2 THEN 1465 :ELSE PRI
NT:PRINT"CORRECT (Y/N)? ";
1460 GOSUB600:K=INSTR("NnYy",CH$
):IFK>2THENPRINT"YES":GOTO1463:E
LSEIFK>0THEN1415:ELSE1460
1463 PRINT:PRINT
1465 PRINT:PRINT"ENTER MESSAGE. 
MAXIMUM OF 2000":PRINT"BYTES. MA
XIMUM OF 40 LINES.":PRINT"PRESS 
<ENTER> ON LINE BY ITSELF":PRINT
"TO EXIT.":PRINT:LE=0:EXEC&H10DA

Yes, you can pretty much get used to any type of formatting if you have to. Fortunately, you no longer have to.

The next few languages I picked up were various assembly languages, which are nearly always vertically aligned:

;--------------------------------------------------------
;	SPHEX4		Display a signed word as hex
;Entry:	D - word
;	U - buffer
;Exit:	U - U + 4 (or 5)
;--------------------------------------------------------

sphex4		tsta			; negative?
		bpl	sphex42		; nope
		stb	,-s		; save B
		ldb	#'-		; print leading minus
		stb	,u+
		ldb	,s+
		coma			; negate D
		comb
		addd	#1
sphex42		bsr	phex2		; print high byte
		tfr	b,a		; now print low byte
		bra	phex2

The decade or so of this left me with an “assembly accent” (which you can pick up on in this post). That, along with some other … quirks in formatting, makes it pretty easy to tell I've been working on the code. I've been developing my C style for over 30 years, and my opinion on “code formatters” is … well … if I didn't want opinions, I'd join a cult. More opinionated—if you have no coding style of your own, you have no soul and probably enjoy The Enterprise Agile being shoved down your throat [Tell us how you really feel! —Editor]. Or at least don't mind it.

But getting back to coding with a proportional font. The original article presents the same code fragment in a monospace font:

import 'dart:io';

/// Replaces typewriter quotes and double dashes in all '.gmi' files under
/// the specified path with their nicer unicode equivalents.
///
/// Usage: dart fix_typography.dart <root path>

void main(List<String> arguments) {
  final gmis = Directory(arguments[0])
      .listSync(recursive: true)
      .whereType<File>()
      .where((f) => f.path.endsWith('.gmi'));
  for (final gmi in gmis) {
    print('Fixing ${gmi.path}.');
    final lines = gmi.readAsLinesSync();
    var skip = false;
    for (var i = 0; i != lines.length; ++i) {
      var line = lines[i];
      if (line.startsWith('```')) {
        skip = !skip;
        continue;
      }
      if (skip) continue;
      line = line.replaceAll("'","’");
      line = line.replaceAll('--','—');
      line = line.replaceAllMapped(RegExp(r'"(\w)'), (m) => '"${m.group(1)}');
      line = line.replaceAllMapped(RegExp(r'(\w)"'), (m) => '${m.group(1)}"');
      lines[i] = line;
    }
    gmi.writeAsStringSync(lines.join('\n'));
  }
}

(Typos mine as this is transcribed from an image; also, sans syntax highlighting.)

And in a proportional font:

import 'dart:io';

/// Replaces typewriter quotes and double dashes in all '.gmi' files under
/// the specified path with their nicer unicode equivalents.
///
/// Usage: dart fix_typography.dart <root path>

void main(List<String> arguments) {
  final gmis = Directory(arguments[0])
      .listSync(recursive: true)
      .whereType<File>()
      .where((f) => f.path.endsWith('.gmi'));
  for (final gmi in gmis) {
    print('Fixing ${gmi.path}.');
    final lines = gmi.readAsLinesSync();
    var skip = false;
    for (var i = 0; i != lines.length; ++i) {
      var line = lines[i];
      if (line.startsWith('```')) {
        skip = !skip;
        continue;
      }
      if (skip) continue;
      line = line.replaceAll("'","’");
      line = line.replaceAll('--','—');
      line = line.replaceAllMapped(RegExp(r'"(\w)'), (m) => '"${m.group(1)}');
      line = line.replaceAllMapped(RegExp(r'(\w)"'), (m) => '${m.group(1)}"');
      lines[i] = line;
    }
    gmi.writeAsStringSync(lines.join('\n'));
  }
}

To me, the proportional font crushes the indentation too much for my liking, making it harder for me to “see” the structure of the code. Of course, the original image has low-contrast vertical bars showing each indenting level, but I suspect that's an IDE-specific thing to help show the structure (I'm not a fan of IDEs for various reasons). And using color for information isn't exactly nice to the color-blind. Why not italic for variables? Bold for keywords? You're already using a proportional font, you might as well use font properties for visual information, but I digress. It just looks too scrunched up for my liking.

I think it just comes down to this: it's totally alien to my way of thinking …

Friday, September 08, 2023

Welcome back!

Hello! Long time, no entries.

This isn't the longest time I've been absent here at the ol' blog (the longest stretch has been 4½ months back in late 2012), but things around Chez Boca have been interesting for the past three and a half months. The biggest thing is a medical issue that Bunny has been going through. It's not life threatening, but it is life changing for the both of us, and the doctors are still trying to figure out what happened in late April that caused the issue. I've also been forced to deal with the medical-industrial complex and the bureaucracy surrounding it (Bunny used to deal with the medical-industrial complex, having been a former Fed herself and can stomach the bureaucracy) and while I have plenty to say about it, I'll refrain lest I work myself up.

Another issue has been that my primary development system (a Linux system) has been offline for the past few months. July 3^rd we had a small power outage. It normally wouldn't be a big deal as the UPS kept the system up until I could shut it down cleanly, but when power was restored, the computer just refused to turn on. And given the situation with Bunn—

A week later and I can resume writing this entry. As I was writing, the situation with Bunny was such that I didn't feel up to mucking with the computer. It was just last week that I felt up to getting a replacement power supply. I ordered it from Amazon and it arrived the next day, not in a box, but wrapped in foam and packing tape stuffed inside an opaque plastic bag. I was not surprised in the least that the fan inside the power supply was broken. Not bad enough that a bit of cyanoacrylate glue (aka “Super Glue”) wouldn't fix it, but still, the fact that I had to do that wasn't a good sign (“Why don't you just return it?” asked Bunny. “Because I'm desperate enough to get my system up and running.”)

It was enough to get the system up and running, and as I was typing out this entry the new power supply went “POP” and that was the end of that.

Sigh.

I opted to return it for a replacement. It arrived yesterday, wrapped in foam and packing tape stuffed inside an opaque plastic bag, but this time the fan was fine, and it's now been running for over 24 hours without incident.

Saturday, September 09, 2023

How common is it for people to not know their own email address?

I'm still receiving all sorts of email from other Sean Conners to my sean.conner@gmail.com address, and I'm seriously wondering how? Do these people not know their email address? Currently:

I own a condo in Austin, Texas;
I have a doctor's appointment in Lake Wales, FL;
I've subscribed to receive information about a blood-clotting prescription I have;
and I still have a young child in elementary school.

I was able to call and stop the emails about the condo, doctor's appointment and medication, but for some reason, the administrators of the school in Tennessee can't remove my email address unless they get permission from the parent of the actual child, and they won't tell me the name of the parent who thinks I need their child's school notices.

Sigh.

To make matters worse, in one case, the doctor's appointment case, the name of the patient wasn't even “Sean Conner!”

What?

How? Just … how?

Wednesday, September 13, 2023

Dear Walmart … seriously? That's what you keep under lock and key?

You keep alcohol wipes under lock and key?

Are they that valuable?

You do realize that we repealed the 18^th Amendment, right?

Sigh.

Thursday, September 14, 2023

Stirring and shaking may be boring, but the future this brings will affect you in the future

This all works fine for the purposes of the telephone system. I mean, at least for a long time, it did. But have you noticed what's up with email lately? It seems that, given an open communications system, people will inevitably develop something called a "cryptocurrency" and badly want to make sure that you get in on something called an "ICO." The general term for this phenomenon is "spam," and the fact that it is only one letter away from "scam" is meaningful as the line between mere unsolicited advertising and outright crime is often razor thin.

In the email system, this problem has been elegantly solved by a system of ad-hoc, inconsistent, often-wrong heuristic classifiers glued to a trainwreck of different cryptographic attestation and policy metadata schemes that still haven't solved the problem. It is, perhaps, no surprise that the phone system is taking a generally similar approach.

2023-08-07 STIRred AND SHAKEN

The whole STIR/SHAKEN thing first crossed my path a few years ago at The Enterprise. At the time, I wasn't sure what the difficulty was in stopping spam/robo calls, and I assumed that the Oligarchic Cell Phone Companies were complicit with said calls because it made them money. The actual story, covered in the above article, is much more complicated and nuanced than my own cynical take on it (worth reading, even if it's a bit long). By the time I left The Enterprise, we were starting to support it with our offering (which was “Caller Name ID”: given a phone number, map that back to a name), along with a process that was attempting to classify the originating side of the call as legit or not if the call wasn't attested (that was being done at another department within The Enterprise). If you use a certain Oligarchic Cell Phone Company, and see the name “Potential SPAM” as the caller name, you are using code I worked on.

Tuesday, September 19, 2023

How long until I receive some really damaging information about another Sean Conner out there?

Yet another email for Sean Conner. This time, I own a 2015 Acura RDX with XXXXXX miles (XXXXXXX kilometers for those Imperially challenged) on the odometer and Tennessee license plate XXXXXXX, having just received service from Budget Brakes in XXXXXXXXX, Tennessee. That's only 875 miles (1,410 km) from Chez Boca (and for reference, the closest Budget Brakes to me is in Pensacola, Florida, 630 miles (1,010 km) away).

I'm … just speechless … that this keeps happening! Don't people know their own email address? Do companies just assume customers have firstnamelastname@gmail.com? Why does this keep happening?

Friday, September 22, 2023

Failures in customer facing user interfaces

I was at a local fast food establishment getting lunch when I found myself behind a gentleman attempting to work the self-serve soda fountain. It had a touch screen where you navigated down a series of menus to your selection of tasty beverage to be dispensed from the single nozzle below the screen. The gentleman just didn't know how to use the device. He would select a soft drink category, the screen would then show a bunch of selections with round buttons on the screen, with one button being slightly above the rest, and larger. He would press other buttons, each one would jump up slightly and enlarge, the previous one would jump down and shrink, but the gentleman never clued in that the one button that was larger was “selected” and that he should then place his cup below the nozzle and press the physical dispense button.

This went on for several minutes before he turned to me and lamented that the machine wasn't working. I then pointed out to him how the machine worked. He thanked me, got his preferred drink dispensed into his cup and left.

I'm not sure what to make of this. Obviously, the makers of the soda dispensing machine thought about the UI but the fact that the gentleman in front of me couldn't figure it out shows that it wasn't as intuitive as the makers wanted it to be. I, knowing how the computer sausage is made, and having used various UIs over the decades, knew how to navigate the machine despite not knowing Spanish (the currently selected language with no obvious way that I saw to change it).

It's not an easy problem to solve. I have had problems using the self-checkout lanes at the grocery store when an item I have doesn't have a bar code on it, like fresh produce, or the bar code that is on the item doesn't scan for some reason. The interface itself may make it obvious that one can search for the item, but what to type? I recently had issues with a green pepper, and the solution was to look up “pepper” and select “green pepper” from the list, not to search for “green pepper.”

User studies? What are those?

I'm not optimistic that this will improve over time.

Discussions about this entry

RE: Failures in consumer facing user interfaces

Monday, September 25, 2023

The scene that always plays out at Chez Boca

“You know today is the day you get your hair cut.”

“You didn't have to go to all the trouble of setting up an appointment for me.”

“It was my pleasure.”

“But it hasn't been that long since my last haircut.”

“It's been over two months.”

“Yes, it hasn't been that long.”

“We're going now.”

“Do I have to? Hey! Ouch! Stop pulling my ear! Ow! Ow! Ow! Ow! Okay! I'll come.”

“Don't forget your keys.”

“You mean I have to drive as well?”

And with that, Bunny and I went to my hair cut appointment.

An unexpected nostalgic hit at the local barber surgeon shop

Bunny and I arrive at the barber shop to await my ~~doom~~ haircut. At the back of the shop I see something I haven't seen in years—nay, decades! A video arcade machine!

I approach the cabinet and it's clear it's not an original machine at all, but a modern recreation with dozens of games available to play. Not as cool as an original machine, but still, a cool thing to find in a commercial establishment in 2023! The last time I saw a video arcade machine in a commercial, non-arcade setting was … the mid to late 80s?

While awaiting my turn in the chair, I managed to play a few games like Galaga, Qix, and Burger Time. Thirty-five years later and I still can't play these games. But it seems no one can play these games (or perhaps, the cabinet itself was only very recently installed) as I was able to get the top score on the games I played.

I'm not sure if it's just nostalgia, but I swear there were more diverse video games back in the 80s than today. Not only did you have the swarming alien invaders genre, but the maze genre (Pacman), side scrollers (like Jungle King), platformer genre (like Donkey Kong or the above Burger Time), vertical scroller genre (like Spy Hunter or Crazy Climber), or the isometric genre (like Marble Madness or Zaxxon), and don't forget the vector games and the very crisp graphics. I'm probably forgetting a few other genres. These days, it seems it's all first-person shooters.

My time with the games was cut short as it was my turn in the barber chair.

Afterwards, as I was showing my barber how to play Qix (I don't think he was born yet when Qix came out), Bunny asked if she should get one for me for Christmas. I told her no, because it was just too addictive to have at home.

Changing the historical record of my blog

Twenty-one years ago I was worried about losing the historical presentation of my blog both because it was template driven, and through the use of CSS. Changes that affect everything at once certainly appeared quite Orwellian to me, although I might be in a very small minority in worrying about this.

And yet, I've tweaked the CSS quite a bit since I wrote that. I figure I'm not changing the content, so it's okay, right?

It was over a year ago when I noticed that a lot of my earlier entries had the initial paragraph shifted over to the left, due to a change in the template file I made around 2003. The old template had an initial <P> tag so I didn't have to type it, and the new one removed said tag. That left maybe a thousand posts (give or take) that needed fixing. I started doing the job manually at first, then gave up at the sheer number of posts to fix. Again, it was not changing the content but fixing the presentation. And it bothered me that there were posts that weren't formatted correctly.

About a week or two ago, I realized that the markup I used for foreign words:

<span lang="de" title="My hovercraft is full of eels">Mein Luftkissenfahrzeug ist voller Aale</span>

is probably not semantically sound HTML. I even wrote about that issue twenty years ago, and now realize it should be:

<i lang="de" title="My hovercraft is full of eels">Mein Luftkissenfahrzeug ist voller Aale</i>

Around the same time, I read up on the “proper” use of <BLOCKQUOTE> and that the attribution should appear outside the blockquote, not inside as I've been doing for years, even though I was doing The Right Thing™ when I first started blogging, but changed for some reason I've long forgotten.

And then several days ago, I noticed the sample BASIC code was incorrect and it was bugging me—the keyword THEN would always show up as THENNOT. How that happened is a topic for another post, but in the meantime, I decided to fix the issue without mentioning it. The change didn't change the intended meaning of the post, it was fixing incorrect output, not saying we were always at war with Eastasia.

After that, I decided to go back and fix the “formatting” issues in the blog. I have code that will read entries and parse the HTML I use into an AST (or should it be a DOM, even though I'm using Lua, not Javascript?) which I use to generate the Gopher and Gemini versions. To fix the initial paragraph issue, all I needed to do was identify the entries that didn't start with a <P> tag and just prefix the raw text with said tag.

To update the HTML for foreign words, it was enough to identify entries with <SPAN LANG="language"> and with some sed magic, switch it to read <I LANG="language"> (and fix the corresponding closing tags). It's just fixing the semantics of the HTML, not changing the past, right?

The fix for the <BLOCKQUOTE> issue wasn't quite so easy—I still had over 700 entries that needed to be fixed, so I ended up writing code that would spit out the parsed HTML back into HTML. It would have been easy to output it as:


<p>I've been following the various Linux <abbr title="Initial Public Offerin
g">IPO</abbr>s and today I see that <a class="external" href="http://www.val
inux.com/">VA Linux Systems</a> had their <a class="external" href="http://d
ailynews.yahoo.com/h/nm/19991209/bs/markets_valinux_1.html">IPO today.</a>. 
 Briefly, it IPOed (can you verb a TLA?  Can you verb the word “verb?” Whate
ver … ) at US$30 and opened at US$299.  Inbloodysane.</p><p><a class="extern
al" href="http://www.andover.net/">Andover.Net</a> wasn't nearly as inbloody
sane.</p>

one long line—the browsers don't care, but I do if I ever have to go back and edit this. Instead, I want the output to still be editable:

<p>I've been following the various Linux <abbr title="Initial Public
Offering">IPO</abbr>s and today I see that <a class="external"
href="http://www.valinux.com/">VA Linux Systems</a> had their <a
class="external"
href="http://dailynews.yahoo.com/h/nm/19991209/bs/markets_valinux_1.html">IPO
today.</a>. Briefly, it IPOed (can you verb a TLA? Can you verb the word
“verb?” Whatever … ) at US$30 and opened at US$299. Inbloodysane.</p>

<p><a class="external" href="http://www.andover.net/">Andover.Net</a> wasn't
nearly as inbloodysane.</p>

That meant handling not only <P> but all the block level tags in HTML, <BLOCKQUOTE>, <TABLE>, <DL> (which I use for emails and screenplay dialog), <UL>, <OL>, and <PRE>. Now that I have that working, I can identify the citation paragraphs for blockquotes, and move them to the appropriate location.

I'm about to do that, yet I'm still a bit hesitant. Yes, it's just fixing the semantic presentation, but now that I have the code to read and write HTML, future mass changes are easy to do.

I'm probably thinking too much on this.

I think.

Tuesday, September 26, 2023

To err is human—to really mess up takes a computer

I ran the code to fix the BLOCKQUOTE issue and as it turns out, there were 54 entries that needed further fixing due to the fix I just applied. I wanted the HTML files to still be editable, and as such, I wrapped the contents of the block-level elements to fit within 80 columns, and the function I used did not take into account where it was safe to break on an HTML tag. So for future reference, I'll have to write a custom word-wrapping function to take into account HTML. As it was, I fixed all 54 entries by hand; some were trivial, some required going into the backups.

Ah, the wonders of automation. A human can mess up, a computer can mess up a lot. Quickly.
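
For what it's worth, an HTML-aware wrapping function could be sketched along these lines. This is only a minimal illustration, not the code I'll end up with: the name html_wrap() is made up, and it rests on the conservative assumption that the only safe place to break is a space outside of a tag (so a long tag will overshoot the width rather than be split), and that a literal '<' only ever opens a tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*-----------------------------------------------------
; Copy `in` to `out`, turning some spaces into newlines
; so lines fit within `width` columns where possible.
; ASSUMPTION: only a space outside of <...> is a safe
; place to break, so a long tag may overshoot the width.
; Returns false if `out` is too small to hold the text.
;------------------------------------------------------*/

bool html_wrap(char *out,size_t outsz,char const *in,size_t width)
{
  size_t len;
  size_t col     = 0;     /* characters on the current line  */
  size_t brk     = 0;     /* index of the last safe space    */
  bool   has_brk = false; /* is `brk` valid for this line?   */
  bool   in_tag  = false; /* are we between '<' and '>'?     */
  
  assert(out != NULL);
  assert(in  != NULL);
  
  len = strlen(in);
  if (len >= outsz)
    return false;
  
  for (size_t i = 0 ; i < len ; i++)
  {
    char c = in[i];
    out[i] = c;
    
    if (c == '\n')        /* existing line break, start over */
    {
      col     = 0;
      has_brk = false;
      continue;
    }
    
    if (c == '<')
      in_tag = true;
    else if (c == '>')
      in_tag = false;
    
    col++;
    if ((c == ' ') && !in_tag)
    {
      brk     = i;
      has_brk = true;
    }
    
    if ((col > width) && has_brk)
    {
      out[brk] = '\n';    /* break at the last safe space    */
      col      = i - brk; /* characters already on new line  */
      has_brk  = false;
    }
  }
  
  out[len] = '\0';
  return true;
}
```

With a width of 10, the input `aaa bbb <a href="x y">ccc</a> ddd` comes out as three lines, with the whole anchor element on the middle line: too long, but intact.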

The other thing I learned is that the entity &apos; is not defined for HTML 4 (it is for HTML 5, and XML). This is important because I'm still using HTML 4 for my blog. Why not HTML 5? Because I'm not fond of the “living standard” (read: changes whenever, meaning an ever-constant churn of updating HTML to maintain the standard du jour) and the step-by-step parsing rules instead of a concise syntax. It also doesn't help that whenever I see WHATWG, I read it as “What working group?”

Wednesday, September 27, 2023

A classic blunder, like getting involved in a land war in Asia

The first time I included some BASIC code, I typed in the sample directly from a magazine (like we used to do back in the 1980s). The second (and most recent) time I included BASIC code, it was extracted from a disk image downloaded from the Intarwebs (using code I wrote) and then decoded into ASCII, using code I wrote, based off a text file I also found on the Intarwebs. I didn't notice when I posted the code because it was a wall of text 32 characters wide (the width of the text screen on a Color Computer). It was only months later when I finally noticed all the THENNOTs littering the code.

There was nothing wrong with the actual file, but I did locate the bug in my code:

char const *const c_tokens[] =
{
  "FOR",
  "GO",
  /* ... */
  "SUB",
  "THEN"
  "NOT",
  "STEP",
  "OFF",
  /* ... */
  "DSKO$",
  "DOS"
};

If you look closely, you'll see there's a missing comma after the THEN token, and in C, two literal strings separated by whitespace are concatenated into a single string. Thus, all the THENNOTs I was seeing. And a bunch of incorrect code because most of the BASIC keywords were then off-by-one (a classic mistake of C programming).
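
The effect is easy to reproduce in isolation. The following is a minimal sketch using a made-up four-entry table (not the real token list), plus one possible guard: a compile-time check on the number of entries. The names bad_tokens, good_tokens, COUNT and EXPECTED_TOKENS are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical: how many tokens the table is supposed to have. */
enum { EXPECTED_TOKENS = 4 };

/* Same mistake as above: no comma after "THEN". */
char const *const bad_tokens[] =
{
  "SUB",
  "THEN"      /* merges with the next literal into "THENNOT" */
  "NOT",
  "STEP"
};

char const *const good_tokens[] =
{
  "SUB",
  "THEN",
  "NOT",
  "STEP"
};

/*-----------------------------------------------------
; The compiler refuses to build if an entry goes missing.
; The same check against bad_tokens[] would fail, since
; that table only has three entries.
;------------------------------------------------------*/

_Static_assert(
        COUNT(good_tokens) == EXPECTED_TOKENS,
        "token table has the wrong number of entries"
);
```

Here COUNT(bad_tokens) is 3, bad_tokens[1] is "THENNOT", and "STEP" has slid down into index 2, which is the off-by-one in miniature.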

Friday, September 29, 2023

YAML config file? Pain? Try Lua

Several articles about using YAML for configuration have been making the rounds, yet rarely do I see Lua being mentioned as an alternative for configuration files.

Yes, it's a language, but it started out life as a configuration format. It's also small for a language, easy to embed, and easy to sandbox for the paranoid. Here's an example:

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#include <lua.h>
#include <lauxlib.h>

lua_State *gL;

bool config_read(char const *conf)
{
  int rc;
  
  assert(conf != NULL);
  
  /*---------------------------------------------------
  ; Create the Lua state, which includes NO predefined
  ; functions or values.  This is literally an empty
  ; slate.  
  ;----------------------------------------------------*/
  
  gL = luaL_newstate();
  if (gL == NULL)
  {
    fprintf(stderr,"cannot create Lua state");
    return false;
  }
  
  /*-----------------------------------------------------
  ; For the truly paranoid about sandboxing, enable the
  ; following code, which removes the string library,
  ; which some people find problematic to leave un-sand-
  ; boxed. But in my opinion, if you are worried about
  ; such attacks in a configuration file, you have bigger
  ; security issues to worry about than this.
  ;------------------------------------------------------*/
  
#ifdef PARANOID
  lua_pushliteral(gL,"x");
  lua_pushnil(gL);
  lua_setmetatable(gL,-2);
  lua_pop(gL,1);
#endif
  
  /*-----------------------------------------------------
  ; Lua 5.2+ can restrict scripts to being text only,
  ; to avoid a potential problem with loading pre-compiled
  ; Lua scripts that may have malformed Lua VM code that
  ; could possibly lead to an exploit, but again, if you
  ; have to worry about that, you have bigger security
  ; issues to worry about.  But in any case, here I'm
  ; restricting the file to "text" only.
  ;------------------------------------------------------*/
  
  rc = luaL_loadfilex(gL,conf,"t");
  if (rc != LUA_OK)
  {
    fprintf(stderr,"Lua error: (%d) %s",rc,lua_tostring(gL,-1));
    return false;
  }
  
  rc = lua_pcall(gL,0,0,0);
  if (rc != LUA_OK)
  {
    fprintf(stderr,"Lua error: (%d) %s",rc,lua_tostring(gL,-1));
    return false;
  }
  
  /*--------------------------------------------
  ; the Lua state gL contains our configuration,
  ; we can now query it for values
  ;---------------------------------------------*/
  
  /* ... */
  return true;
}

Yes, it's all too possible for someone to write:

(function() while true do end end)()

in the configuration and block the process with 100% CPU utilization, but as I stated in the code example, if that's a worry, you have bigger security issues to worry about.

Another nice benefit of using Lua is string management. If you are only using Lua for the configuration file, and once read, don't execute any more Lua code, then there's no need to duplicate the strings for your codebase—just keep using the strings directly from Lua. As long as you close the Lua state at the end of the program, they'll be cleaned up for you. And speaking of strings, you'll also have Lua's “long strings:”

long_string = [[
This is a long Lua string that can
span several lines.  Escapes like '\n' don't work in this,
but then again,
you don't really need the '\n' here because they're part of the 
string.
]]

long_string_2 = [=[
And if you want to embed a literal ']]' in a long string,
you can, no problems here.
]=]

I use Lua for the configuration file for my blogging engine (to see how I pull the configuration from Lua) which looks like:

name        = "A Blog Grows in Cyberspace"
description = "A place where I talk about stuff in cyberspace."
class       = "blog, rants, random stuff, programming"
basedir     = "."
webdir      = "htdocs"
lockfile    = ".modblog.lock"
url         = "http://www.example.com/blog/"
adtag       = "programming"
conversion  = "html"
prehook     = "./prehook_script"
posthook    = "./posthook_script"

author =
{
  name   = "Joe Blog" ,
  email  = "joe@example.com",
}

templates =
{
  {
    template = "html",
    output   = webdir .. "/index.html",
    items    = "7d",
    reverse  = true,
    posthook = "posthook_template_script"
  },
  {
    template = "atom",
    output   = webdir .. "/index.atom",
    items    = 15,
    reverse  = true,
  },
}

But if you think it'll be too complicated to instruct devops as to when to use a comma and when not to, you can always include semicolons at the end of each line:

name        = "A Blog Grows in Cyberspace";
description = "A place where I talk about stuff in cyberspace.";
class       = "blog, rants, random stuff, programming";
basedir     = ".";
webdir      = "htdocs";
lockfile    = ".modblog.lock";
url         = "http://www.example.com/blog/";
adtag       = "programming";
conversion  = "html";
prehook     = "./prehook_script";
posthook    = "./posthook_script";

author =
{
  name   = "Joe Blog";
  email  = "joe@example.com";
};

templates =
{
  {
    template = "html";
    output   = webdir .. "/index.html";
    items    = "7d";
    reverse  = true;
    posthook = "posthook_template_script";
  };
  {
    template = "atom";
    output   = webdir .. "/index.atom";
    items    = 15;
    reverse  = true;
  };
};

to simplify the configuration instructions (“just add a semicolon to the end of each line … ”). One other benefit—comments. That's one of the biggest complaints about using JSON as a configuration file format—a lack of comments.

Also, you don't even have to mention you are using Lua as a configuration file—most likely, no one will really notice anyway. I used Lua to configure “Project: Sippy-Cup” and “Project: Cleese” at The Enterprise and no one said anything about the format. It was also used for “Project: Bradenburg” (written by another team member) and there were no issues with that either.

Discussions about this entry

Saturday, October 07, 2023

Mowing Da Lawn

I've taken over mowing the lawn over the past few months, and I've learned a thing or two:

The electric mower is much nicer than the gas mower. Both are self-propelled, but the speed control is nicer on the electric than the gas. Also, the electric mower is easier to start.
One bad thing about the electric mower—the grass accumulates inside, and when it gets too thick, the mower stops. I then have to tip it over and scoop out the cut grass. The gas mower does a much better job of mulching the grass.
Also, the electric mower has a safety mechanism where two latches that lock down the handle length (it can slide in and out, shortening or lengthening the handle) must be in the locked position, or the cutting blade won't start (just learned this today). Interesting.
Over the past few months, I've learned that I can finish the entire yard (front, sides and back) with the two batteries we have, but only if the lawn isn't terribly overgrown. And by “overgrown” I mean “longer than two weeks” (we've been having a lot of rain lately).
Moving the cars out of the driveway makes mowing the front lawn easier, as I can just simply go back and forth, crossing the driveway. I wish I could do that for the entire lawn, but alas, there are some trees, a fence, and oh yes, the house, in the way.
Chez Boca is on a … I don't want to say a hill, because get real, we're in Florida where we give Kansas a run for its money for “Flattest State in America.” But there is an incline you can feel when walking towards the house. I suspect that when it rains, the water seeps into the ground, and because the ground here is nothing but sand, the water slowly seeps down the incline towards the street. I say this, because there's a two-foot strip of grass along the road that is gorgeous, but three weeks makes it a bit too long to cut with the electric mower without having to stop several times to clean it out.
The north side of Chez Boca is the most annoying section to mow. It's too narrow—there's a fence on one side, the house on the other. There's also a large tree on the east end, and a shed on the west end, so there's no long stretches to mow. I now do that section after the front yard.
The north-west side of Chez Boca is the second most annoying section to mow. It's next to the house. There are two small trees in the way (I mean, I like trees, but mowing around them, especially given they're very low to the ground, is annoying) and I've smashed several sprinkler heads a few times. I do that after doing the north side, as I'm not completely wrecked yet.
I do the back yard last, when I'm completely wrecked from doing the north and north-west portions—like the front yard, I can do long stretches of back-and-forth mowing which helps when I'm exhausted from mowing (when I first took over mowing, I would end up with doing the north-west then north sections last, and that nearly killed me each time).
After mowing the front yard, I move the cars back into the driveway, instead of after mowing the entire lawn. It's easier to do it then than when I'm about to pass out.

I'm having a hard time seeing why Bunny would give up mowing the lawn. It's so much fun!

Monday, October 09, 2023

The Temptation

I'm still receiving emails for some other Sean Conner. This is about the fourth or fifth email I've received from XXXXXXXXXXXXXXXX, and I've already contacted them twice about the emails. They said they would remove my email. And yet:

From
XXXXXXXXXXXXXXXX <donotreply@XXXXXXXXX>

To
seanconner@gmail.com

Subject
Upcoming Appointment Reminder 10/11/2023 8:30am

Date
Tue, 10 Oct 2023 00:31:05 +0000

Hi XXXXXXXXX,

Your upcoming appointment with Dr XXXXXXXXXXXXXXXX at XXXXXXXXXXXXXXXX at XXXXXXXXXXXXXXXXXXXXXXXXXXXX Lake City, FL 32024 is on Wednesday, 10/11/2023 at 8:30am.

Please confirm your appointment so we can update our records.

Please take a minute to fill out or confirm your information through the online intake form:

https://XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

If you need to cancel or reschedule your appointment, please call us directly.

Thank you,

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXX
XXXXXXXXXXXXXX

It's to my email address, but not my name. And of course the reply goes to a blackhole email account.

At this point, I'm seriously considering if I should just call and cancel XXXXXXXXX's appointment and just savor the confusion and anger that will happen at 8:30am on Wednesday, with the hope that someone will figure out the email address is wrong. While the confusion and anger would happen, I'm beginning to seriously doubt they would track it down to the email address. Even Bunny, BUNNY! agrees that I should just call and cancel the appointment, especially after I've called them twice about this. I mean, I thought I was being a BOFH for thinking of doing this, but I would have never thought of Bunny as a BOFH.

Cool!

So why don't I just ignore this? Why go to the bother of notifying parties they have a wrong email address? Because I feel that notifying is the proper thing to do—these people are leaking private information to strangers, and apparently, they either don't know or care about it. If they don't care about it, well, then it's on them, but if they don't know, I'm sure that being notified would be helpful to them. That's why I do it.

But this … after contacting them twice? Yeah, I'm cancelling that appointment. And maybe as I keep cancelling them, they'll get a clue-by-four and fix the issue.

Or have me arrested.

I'm giving it 50/50 odds …

Update on Tuesday, October 10^th, 2023

I ended up not cancelling the appointment.

Tuesday, October 10, 2023

Get thee behind me, Satan

I wussed out.

I did not cancel the appointment.

Over the past few months of dealing with Bunny's medical issues, the wait for appointments was excruciating and in light of that, I just couldn't cancel someone else's appointment. Also, I thought about it, and it could have been a transcription error on the part of the doctor's office. My name can be spelled “Sean,” “Shawn,” “Shaun,” or “Shon.” My last name can be spelled “Conner” or “Connor.” That's eight combinations—more if you accept “Konner” or “Konnor” (it's not out of the realm of possibility—I had an uncle whose last name was “Kollins,” not “Collins”).

I did call and this time, it was escalated to the head nurse at the doctor's office. It turns out it was a transcription error, so there's at least one explanation for receiving some other Sean Conner's email (and for the record, the head nurse could neither confirm nor deny the patient's name was “Sean Conner” even though that was clearly the case).

Tuesday, October 17, 2023

“Tell us how you really feel about Agile …”

With God as my witness, the next son of a bitch to mention Agile is going to get hurled into the ground so hard that I'm going to publish a seismology paper in Nature with the data.

Via Flutterby, I Will XXXXXXX Haymaker You If You Mention Agile Again

I can relate. Not only did I have recurring meetings every day, but I had two recurring meetings every day! Also, if Ron Jeffries says to abandon Agile, you know it's a toxic word …

Discussions about this entry

Agile is Fine, Until You Look Back

Wednesday, October 18, 2023

When setting up to do the thing takes longer than doing the thing

While en route to the weekly lunch with some former cow-orkers, my car notified me that the front left tire was low on pressure. 'Tis not a problem, I thought, as we have an air compressor and tire attachment at Chez Boca.

Once back at Chez Boca, I began to set up to inflate the tires with said air compressor and tire attachment. The tire attachment, easy to obtain. The air compressor? Not so easy, as it was nestled in the middle of the garage among various wood working and gardening tools. I ended up having to hoist this 50 pound (23kg for those unwise in the ways of Imperial measurements) tool up and over some obstacles.

Then, power. The power cord on the air compressor is pretty short, and the nearest outlet that I could see was in the middle of the garage, nestled in the middle among various wood working and gardening tools (it's a long, horizontally mounted power strip along a table). All I had to do was find an extension cord.

And lo, next to the garbage can was a nice sized spool containing an extension cord. One end was visible, the other end, not so visible. In the end, I had to unspool the entire cord to find the other end—perhaps 50′ (15m for those of you not living in the U.S., Liberia or Myanmar) spilled out across the floor.

Then a couple of minutes to get the power, air hose and tire attachment hooked up, and I was ready to inflate some tires. I think I spent about fifteen minutes total setting up.

Two minutes inflating tires.

Then another five minutes putting the tire attachment away, coiling up the air hose, respooling the extension cord, and hoisting the air compressor back into its place in the garage.

Twenty minutes to do a two minute job.

Sigh.

Monday, October 23, 2023

One of the rarest gas stations in the United States

One of the YouTube channels I watch is Phil Edwards, and on October 10^th, I managed to catch his request for help—he wanted people local to a dozen cities to provide some phone video. I noticed that one of the cities mentioned was Lake Worth Beach, Florida. I once lived in Lake Worth, Florida—could it be near there?

Turns out, Lake Worth is Lake Worth Beach, having changed its name in 2019 in order to rebrand itself. And while Lake Worth Beach (and I don't think I'll ever get used to that name) is a bit north of Chez Boca, it isn't that far to drive. I felt it would be interesting to see what project Phil Edwards has in mind, so I signed up.

The project involved video of a gas station, and not just any gas station, but one of a dozen stations still branded with Standard Oil (which are now owned by Chevron Corporation).

On the 12^th, I drove to Lake Worth Beach and videoed the only Standard gas station in Florida. As stations go, it wasn't that special. It didn't have a distinctive architectural style, nor did it look all that old. It looked like every other Chevron gas station, except for the “Standard” label that most people probably just ignore. Heck, I lived just a few miles down the road from this station and I never knew it was a Standard gas station, that's how unremarkable it is.

I'm only mentioning this now as Phil's video on Standard gas stations is now up on YouTube. The footage I supplied can be seen at the 2:35 mark (all four seconds of it).

So I guess that means I can add “videographer” to my résumé now.

Woot!

I'm getting some serious “Darmok and Jalad at Tanagra” vibes from this card magic book

“The book I ordered for you came in,” said Bunny, walking into the Computer Room and handing me a slim volume—Scott Kahn's Kahnjuring: Deceptive Practices With Playing Cards. She had ordered it about a week prior after we saw him on Penn & Teller's “Fool Us” (spoiler: he failed to fool them, but it's still a very cool card swap with transparent cards). “How soon until I see some tricks?”

I took the book and quickly read through the first trick. And … well …

When ready to perform, start with a convincing full deck false shuffle. Since this routine requires a table, I usually will use a Push Through Shuffle followed by an Up The Ladder False Cut. However, I have also used Bob King's variation of the Erdnase Blind Overhand Shuffle that was published in Darwin Ortiz's The Annotated Erdnase, 1991. The spectator may even give the deck a straight cut, if desired, before proceeding.

Ribbon Spread the cards across the performing surface and ask the spectator to touch a card of their choosing. Outjog the selected card for half its length …

Kahnjuring, page 17

“Um … not any time soon,” I said. I just recently learned about the Push Through Shuffle, but the Up The Ladder False Cut? Erdnase Blind Overhand Shuffle?

I don't think this book is an introduction to card magic.

It's still neat, though, and the second trick in the book is the trick Scott Kahn did on “Fool Us,” so it's nice to learn how that particular trick is done, even if I didn't understand all the jargon.

Wednesday, October 25, 2023

A small warning about UDP based protocols

The Gemini protocol has inspired others to implement “simple” protocols, like Mercury (alternate link), Spartan (alternate link) and Nex (alternate link). But there's another protocol being designed that has me worried—Guppy (alternate link), which is based on UDP instead of TCP.

Yes, UDP is simpler than TCP. Yes, you can get results with just one exchange of packets. But the downside of UDP is that you will be exploited for amplification attacks! I found this out the hard way a few years ago and shut down my UDP QOTD service. Any UDP-based protocol where a small packet to the server results in a large packet from the server will be exploited with a constant barrage of forged packets. That's one reason for the TCP three-way handshake.

Also, the Guppy protocol spec states, “it's an experiment in designing a protocol simpler than Gopher and Spartan, which provides a similar feature set but with faster transfer speeds (for small documents) and using a much simpler software stack,” but there's a downside—you can easily over-saturate a link with data, which is another reason UDP is popular for amplification attacks. Congestion control is one reason why TCP exists (some say it's the only reason and the other benefits, like a reliable, stream-oriented connection, are a side effect of the design).
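
To put a number on “small packet in, large packet out,” here is a minimal sketch of the arithmetic, along with one possible guard a UDP server could apply before replying to an address it hasn't verified. The function names and the idea of a configurable limit are my own assumptions for illustration, not something taken from the Guppy spec (QUIC, as a point of comparison, limits itself to three times the data received until the address is validated).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*-----------------------------------------------------
; Bytes sent back for every byte received.  A forged
; request with a factor of 50 turns 1 Mbit/s of the
; attacker's traffic into 50 Mbit/s aimed at the victim.
;------------------------------------------------------*/

double amplification_factor(size_t request_len,size_t reply_len)
{
  assert(request_len > 0);
  return (double)reply_len / (double)request_len;
}

/*-----------------------------------------------------
; Hypothetical policy: only answer an unverified address
; if the reply is no larger than max_factor times the
; request.  Anything bigger needs a round trip first to
; prove the source address isn't forged.
;------------------------------------------------------*/

bool reply_allowed(size_t request_len,size_t reply_len,size_t max_factor)
{
  return (request_len > 0) && (reply_len <= request_len * max_factor);
}
```

A one-byte request that triggers a 512-byte quote has an amplification factor of 512, and reply_allowed(1,512,3) says no; a 100-byte request getting a 300-byte reply squeaks by with a limit of 3.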

My intent here isn't to discourage experimentation. I like the fact that people are experimenting with this stuff. But I do want to pass along some painful experiences I had when playing around with UDP on the open Internet.

Discussions about this entry

Sunday, October 29, 2023

Inflation has seriously affected the Nigerian scams

Normally, I would ignore these Nigerian scam emails but this one—this one is gold:

From
Federal Emergency <info@un.com>

To
undisclosed-recipients:;

Subject
Attention Dear

Date
Wed, 25 Oct 2023 01:46:19 -0700

FEMA HEADQUARTERS

REGION II - NJ, NY, PR, VI 26 Federal Plaza, Room 1337 New York, NY
10278-0002
REGION III - DE, DC, MD, PA, VA, WV 615 Chestnut Street 6th Floor
Philadelphia, PA 19106

Congratulations to the owner of this email account, Please If you received this email in your spam folder it could be due to your Internet Service Provider, (ISP). move it to your inbox before responding, The Federal Emergency Management Agency (FEMA) has added an event wide notice for 4504DR-KS (4504DR) 2023, Event Wide Notice: FEMA Established Deadlines for COVID-19 Funding,

This is to intimate the owner of this email account with very important information, The United Nations and Federal Emergency Management Agency (FEMA) has made available a total sum of US$102,000,000,000,000 to world governments for distribution to Twelve Million successful company and business emails users. You're qualified and chosen for the (UNCC & FEMA) covid19 outbreak compensation payment of $8,500,000.00 USD only. Kindly contact the Secretary-general of the United Nations for the claim of more details. Contact Mr. Antonio Guterres through below email for the claim of your funds. $8,500,000.00,

CONTACT Mr. Antonio J. Guterres
EMAIL: <uncc2020office@gmail.com>
phone: +1-(409)-571-2111
kindly text him for an urgent response.

Body: FEMA established deadlines for the Public Assistance for COVID-19 events to assist states, tribal nations, localities, territories, and eligible private nonprofits in pandemic response and recovery. 30th December 2023 is the deadline for Applicants to submit their Request for Public Assistance for the COVID-19 pandemic incident. You can apply for reimbursement funding from FEMA for recovery efforts. 30th December 2023 is also the end of the 100% federal cost share for eligible COVID-19 response and recovery work. Any work that Applicants conduct or complete before 30th December, 2023 or after will be funded at 90% federal cost share,

yours faithfully, Miss. Deanne Criswell,

I think this is a record—one hundred trillion dollars in compensation. Yes, inflation is currently on the rise, but I don't think the United States has one hundred trillion dollars. Is the author perhaps confusing us with Zimbabwe? I'm happy to report that the math in the email works out, so some effort was given here. But aside from the decent grammar, the math working and the monetary amount, nothing else about this is really noteworthy.

Adventures in Utext

There is one point on the ASCII ↔︎ JS spectrum that I haven’t seen, and it’s one that, as I use Unicode in more complex ways on Gwern.net and have learned how many obscure features or characters Unicode has, I increasingly think has been neglected: only UTF-8 text rendered by a monospace font. Not ASCII, not a weird subset of SGML, not troff, not raw terminal codes, not bitmaps encoded in ASCII—just UTF-8. This document format does only what pure Unicode text can do—but does everything that pure Unicode can do, which turns out to be a lot. What if we take Unicode literally, but not seriously?

Your typical plain text output strips all formatting. At the most ambitious, it might have a Unicode superscript or fraction. But we can do so much more!

Utext: Rich Unicode Documents · Gwern.net

That was an interesting read (your mileage may vary).

To generate the gopher and Gemini versions of my blog, I parse the HTML and generate either plain text (for gopher) or Gemtext for Gemini. And I'm still not entirely happy with the output. For emphasized text, I would translate that to “*emphasized*”, which is … okay, I guess? And for ~~deleted~~ text—that was harder to deal with, and I ended up with “[DELETED-deleted-DELETED]” text.

There's no excuse for that.

But after reading about Utext, and Unicode's COMBINING SHORT STROKE OVERLAY and COMBINING LOW LINE I thought I might try using those for some typographical niceties that you don't normally get with plain text. And that's when I learned that not all virtual terminals support all of Unicode all that well. And wrapping text is … not that trivial anymore.
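To make the combining-character idea concrete, here is a minimal sketch in C. It is not the code that generates the gopher or Gemini output (that code isn't shown here); the function name utf8_overlay and its buffer handling are made up for illustration, and it assumes the input is valid UTF-8. It appends a combining mark after every code point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 encodings of the two combining characters mentioned above. */
#define COMBINING_LOW_LINE             "\xCC\xB2"  /* U+0332 */
#define COMBINING_SHORT_STROKE_OVERLAY "\xCC\xB5"  /* U+0335 */

/*
 * Hypothetical helper: copy src to dst, appending the combining mark
 * after every code point.  Assumes src is valid UTF-8.  Returns the
 * number of bytes written (not counting the NUL), or 0 if dst is too
 * small to hold the result.
 */
size_t utf8_overlay(char *dst, size_t dstsize, const char *src, const char *mark)
{
    assert(dst != NULL && src != NULL && mark != NULL);

    size_t marklen = strlen(mark);
    size_t out     = 0;

    if (dstsize == 0) return 0;

    while (*src != '\0') {
        /* copy the lead byte, then any continuation bytes (10xxxxxx) */
        do {
            if (out + 1 >= dstsize) return 0;
            dst[out++] = *src++;
        } while (((unsigned char)*src & 0xC0) == 0x80);

        /* the mark follows the character it decorates */
        if (out + marklen >= dstsize) return 0;
        memcpy(&dst[out], mark, marklen);
        out += marklen;
    }

    dst[out] = '\0';
    return out;
}
```

Passing "deleted" with COMBINING_SHORT_STROKE_OVERLAY yields 21 bytes for what should display as seven columns, which hints at why wrapping gets harder: byte counts and display widths no longer line up.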

Ah well. For now, it seems to be working, but it remains to be seen if I like the results.

Update on Friday, December 8^th, 2023

I reverted this change due to issues.

A most persistent spam, part VIII

I received an email from Kevin thanking me for my post about Aleksandr and how he was able to stop the spam. But Kevin had some issues with the proposed solution and how it didn't work directly for him. On close inspection of the post in question, I did find an issue with the regular expression presented in the post—it wasn't correct. The problem stemmed from my use of a bespoke markup language I created (that I should talk about at some point) where a character that should have been escaped, wasn't, causing the text to be misleading.

I fixed the issue and thanked Kevin for bringing the problem to my attention, even if it was in a roundabout way.

And to think, three years later and “Aleksandr” is still spamming people for some obscure reason.

Tuesday, October 31, 2023

“And in other news, water is wet, and the Pope is Catholic … ”

About Kevin's email …

I was hesitant to talk about the problem I had in replying, because it's well known that the Big Players (Google and Microsoft in particular) don't really care about smaller email servers, making it difficult to self-host email.

Yet … the bounce message I received contained the following:

Diagnostic-Code: X-Postfix; host
    hotmail-com.olc.protection.outlook.com[104.47.66.33] said: 550 5.7.1
    Unfortunately, messages from [71.19.142.20] weren't sent. Please contact
    your Internet service provider since part of their network is on our block
    list (S3140). You can also refer your provider to
    http://mail.live.com/mail/troubleshooting.aspx#errors.
    [MW2NAM12FT105.eop-nam12.prod.protection.outlook.com
    2023-10-29T06:59:01.371Z 08DBD7D7D4735644] (in reply to MAIL FROM command)

That link? Absolutely useless. To address the issue, I would have to sign in to my “Microsoft Account” or pay for services like this company that “ensures” email delivery (which I'm reading as “pay to play”). And of course I'm a company, because who would be so silly as to run their own email server? Sheesh!

Why Microsoft couldn't just send a link to their Office 365 Anti-Spam IP Delist Portal in the first place (which took entirely too long to find and didn't appear on the link they did send), I don't know—I guess that could make it too easy to “game” or something.

Mainly, I'm writing this for my future self to save some time when this happens again.

Update about an hour later …

The bounce came from outlook.com, but Kevin's email address is from hotmail.com, and it's hotmail.com that has a block on my IP, not outlook.com.

I found that out because I followed the instructions on the “Office 365 Anti-Spam Delist Portal” and it said “Oh! You aren't blocked! Try this link!” with “this link” asking me to log into my “Microsoft Account.”

Seriously, Microsoft? XXXX you.

Discussions about this entry

Lazy Reading for 2023/11/05 – DragonFly BSD Digest

The difference in penalties in AD&D1 and D&D5

Sunday was our gaming group's “Hallowe'en One Shot” (which is now at least a “duo shot” as we didn't finish) and for some reason, I got to thinking about the penalty differences between AD&D and D&D5.

Everyone in our group started out playing AD&D (or the original Dungeons and Dragons) and in that system, if you are trying to hit something you can't see, you subtract 4 from your (20-sided) die roll (d20) when trying to hit it. But in D&D5, you roll two 20-sided dice (2d20) and take the lower value (called “disadvantage”). I was curious as to the actual difference between the two. I did a bit of programming and I got the following graph:

[A graph of AC (x-axis) and chance of hitting (y-axis) with various penalties, bonuses, and just plain hits]

Along the X-axis is AC. In AD&D, it goes from 10 (basically nothing) to -10 (nigh impossible to hit) while in D&D5, it goes from 10 (basically nothing) to 30 (nigh impossible to hit), so the range is the same. So the X-axis is AC, going from 2 to 20. In both systems, rolling a 1 is an automatic miss, so I'm not bothering with even listing an AC of 1. The Y-axis is the probability of hitting said AC, from 1 (always a hit) to 0 (always a miss).

The red line (the one cutting diagonally across the middle) is just the result of rolling a d20 and is pretty much what one would expect, a straight line. The light-green line (the lower diagonal line) is the AD&D penalty of subtracting four from a d20 (d20-4). Again, it's a straight line but giving a lower chance of hitting.

What I find fascinating is the blue line (the lower curved line). This is the D&D5 “disadvantage” roll. What's interesting about this is that at lower and higher ACs, it's better than a -4 penalty, but between ACs of 7 to 15, it's worse!

When I saw that, I just had to do the plot with a bonus. The purple line (the upper diagonal line) represents a d20+4, and the curved dark green line (the upper curved line) is the D&D5 “advantage” roll—where you roll 2d20 and take the higher. It's the opposite of “disadvantage”—you do worse at lower and higher ACs, but better between ACs of 7 and 15.
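The numbers behind those curves can be reproduced with a brute-force count. What follows is a minimal sketch in C, not the program that produced the graph. It assumes the value on the X-axis is simply the number needed on the die, and it ignores any special handling of natural 1s and 20s; the names roll_mode and hits_out_of_400 are made up for illustration.

```c
#include <assert.h>

enum roll_mode { PLAIN, MINUS_4, PLUS_4, DISADVANTAGE, ADVANTAGE };

/*
 * Count how many of the 400 equally likely (first die, second die)
 * outcomes meet or beat the target number.  Modes that use a single
 * die simply ignore the second one.  Simplifying assumption: no
 * special treatment of a natural 1 or a natural 20.
 */
int hits_out_of_400(enum roll_mode mode, int target)
{
    int hits = 0;

    assert(target >= 2 && target <= 20);

    for (int a = 1; a <= 20; a++) {
        for (int b = 1; b <= 20; b++) {
            int result;

            switch (mode) {
            case MINUS_4:      result = a - 4;            break;
            case PLUS_4:       result = a + 4;            break;
            case DISADVANTAGE: result = (a < b) ? a : b;  break;
            case ADVANTAGE:    result = (a > b) ? a : b;  break;
            default:           result = a;                break;
            }

            if (result >= target) hits++;
        }
    }

    return hits;
}
```

Under those assumptions the crossover points agree with the graph: at a target of 6, disadvantage hits 225 times in 400 against 220 for d20-4, but at 7 it drops to 196 against 200, and it stays behind through 15 (36 against 40) before pulling ahead again at 16 (25 against 20).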

Weird!

Discussions about this entry

Lazy Reading for 2023/11/05 – DragonFly BSD Digest

What happened to Hallowe'en?

It is now well past the time for any kids to be out trick-or-treating and we still haven't had one kid show up to Chez Boca to threaten us with tricks. Not one! Are all the kids now trick-or-treating at parking lots or malls? Is trick-or-treating passé now? Too many razor blades found in candy corn? Too much candy corn in general? What?

Sigh.

Well, all these M&Ms aren't going to eat themselves …

Thursday, November 09, 2023

The Day a Beatle Died

Growing up, I heard the rumors that Paul McCartney died in the 60s and was replaced with a look-alike to keep the ~~money rolling in~~ band together. But I did not realize that it was on this day in 1966 that he died. Supposedly. Maybe. If you believe.

Who knows? We can't ask John or George. Perhaps we can ask William Campbell. Or maybe Bill Shepherd. Or was it Billy Shears?

Wait! Maybe Ringo knows …

“A live, nine hundred foot what?”

And speaking of musical oddities …

This is something that I think only two or three of my readers may appreciate (if that many)—a recording of MC 900 Ft. Jesus performing live at Good Records in Dallas, Texas. The sound quality could be better, but for someone who quit the industry in 2001, it was surprising to see him perform in 2017. And it's a shame he stopped recording, because his music is very unique—something like a cross between rap and jazz. Jazzrap? Rapjazz? Something like that.

Friday, November 17, 2023

Has it really been 45 years since this Star Wars Special that George Lucas disavowed aired for the first, and so far, only time?

Yikes! I'm getting old. I remember watching this when it first aired, and, except for the cartoon segment, didn't find it all that interesting. It certainly didn't help that it centered on Chewbacca's family (an hour of grunts and growls) and no subtitles whatsoever. That's a very odd choice, but it's not like it's the only odd thing in the special—you have a Wookiee getting off on soft-core porn (seriously!), Bea Arthur singing, Harvey Korman attempting comedy, Carrie Fisher attempting to sing to the main Star Wars theme (who knew it has lyrics?) while wasted, Mark Hamill wearing makeup, and Harrison Ford wanting to be anywhere else than this special.

You would think this would be entertaining, but no, it's not. You can attempt to watch it, but I'd recommend watching an angry review of it (or any number of other reviews of it) as it's more entertaining than the actual special itself.

Saturday, November 18, 2023

The Temptation, Part II

I did not wuss out this time.

Yes, yet more email to another Sean Conner that arrived in my Gmail account. This time, informing me I should change “my” password to “my” Instagram account.

Oh? Really?

And not just one, but multiple emails about this.

Okay, time for some tough love. It took a few attempts, but I finally changed “my” password to “my” Instagram account. Sorry the other Sean Conner, but not really. You should have known what your email address was. Is that so hard? (I know the answer—apparently yes, and I'm still receiving emails about my non-existent child in Memphis)

WTF Instagram? Seriously, WTF?

So I'm trying to update the profile on “my Instagram account” when I see this: “Editing your links is only available on mobile. Visit the Instagram app and edit your profile to change the websites in your bio.”

Really?

I can edit my bio, and my gender, but I can't update a link? Because I'm on a desktop computer?

I figure, maybe I can hack the form to enable that field? I toggle the Web Developer Tools in Firefox and immediately see this:

 .d8888b.  888                       888    
d88P  Y88b 888                       888    
Y88b.      888                       888    This is a browser feature intended for 
 "Y888b.   888888  .d88b.  88888b.   888    developers. If someone told you to copy-paste 
    "Y88b. 888    d88""88b 888 "88b  888    something here to enable an Instagram 
      "888 888    888  888 888  888  Y8P    feature or "hack" someone's account, 
Y88b  d88P Y88b.  Y88..88P 888 d88P         it is a scam and will give them access 
 "Y8888P"   "Y888  "Y88P"  88888P"   888    to your Instagram account.
                           888              
                           888              
                           888              

See https://www.facebook.com/selfxss for more information.

Heh. But in the meantime …

… and nope. The link field has no name, and even if I remove the disabled attribute, it won't let me type in it (and as an aside, the “gender” field isn't even an HTML field element, but the word “Male” wrapped in four <DIV>s and a <SPAN>, each with fifteen classes attached to them—what the hell?). I'm so far removed from “modern web development” that I probably can't hack this without significant effort that I'm too lazy to do.

Sheesh.

Sunday, November 19, 2023

Ah, so that's the definition of a unit test

Originally, the term “unit” in “unit test” referred not to the system under test but to the test itself. This implies that the test can be executed as one unit and does not rely on other tests running upfront (see here and here).

Via The big TDD misunderstanding (2022) | Hacker News, The Big TDD Misunderstanding. 💡Originally, the term “unit” in “unit… | by Oliver Wolf | Mediu (also via Lobste.rs)

I think I finally have the answer to my question, “what is a ‘unit test?’” and … wow! And to think I was doing that all along at The Enterprise.

When I wrote the regression tests, I set each test up to be independent—each test got its own unique test data in the various “databases” we were using (we weren't really using a database, but custom binary data files based off a periodic database dump) and in theory, we could have run the tests in random order. In fact, during my last year there, I almost added that feature to the regression test, but by then, I was so burned out with The Process™ that I just never bothered. That's a shame, because I think it would have been an interesting form of test to perform.

Friday, November 24, 2023

A Motorola 6809 assembler—there are many like it, but this is mine

I think it's time I start talking about some of the software I write, and I might as well start with my latest project that I've been having way too much fun writing, a 6809 assembler written in C.

Yes, I could use an existing 6809 assembler, but most of the ones available as source seem to be based off one written in 1993 by L. C. Benschop. And the code quality there is … of its time … which I think is the most charitable thing I can say about it. Here's the code to convert text to a decimal number:

short scandecimal()
{
 char c;
 short t=0;
 c=*srcptr++;
 while(isdigit(c)) {   
  t=t*10+c-'0';
  c=*srcptr++;
 }
 srcptr--;
 return t;
}

Lots of globals, lots of “magic” numbers (at least they're described in comments), and vwl mprd variable names. It's not a pleasant code base to work in.

Besides, it's something I've been wanting to do since college. So why not?

So I have a standard two-pass assembler with a few features I haven't seen in other 6809 assemblers. And that's what I'll be describing here. The first feature is small, but decidedly nice—the ability to have underscores (“_”) in numeric literals. It's more useful for binary literals, such as %10_00_01_11 or %000_01001_0_100_0010 but it can be used for decimal, octal or hexadecimal numbers as well.
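As an illustration of how little code the underscore feature needs, here is a minimal sketch in C. It is not taken from the assembler; the function name parse_binary, its return convention, and the 16-digit limit are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse a binary literal such as "%1010_0101",
 * skipping any underscores among the digits.  Stores the result in
 * *value and returns true on success; returns false for a malformed
 * literal or one with more than 16 binary digits.
 */
bool parse_binary(const char *text, unsigned *value)
{
    unsigned result = 0;
    int      digits = 0;

    assert(text != NULL && value != NULL);

    if (*text++ != '%') return false;

    for ( ; *text != '\0'; text++) {
        if (*text == '_') continue;             /* purely a visual separator */
        if (*text != '0' && *text != '1') return false;
        if (++digits > 16) return false;        /* nothing on a 6809 is wider */
        result = (result << 1) | (unsigned)(*text - '0');
    }

    if (digits == 0) return false;
    *value = result;
    return true;
}
```

With this sketch, %10_00_01_11 parses to $87 and %000_01001_0_100_0010 to $0942; the underscores carry no meaning beyond grouping the bits for the reader.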

Another simple feature is the ability to generate a dependency list for make. Since I support the inclusion of multiple assembly files, it makes sense to support this feature as well. I'm not trying to make an assembler that works on the 6809 system (I think it's way too small a system for that), but an assembler that makes it nice to write code for a 6809 system.

I also have local labels that work similarly to NASM. As an example:

clear_bytes	clra
.loop		sta	,x+
		decb
		bne	.loop
		rts

clear_words	stb	,-s
		clra
		clrb
.loop		std	,x++
		dec	,s
		bne	.loop
		rts

Internally, the assembler will merge the local labels with the previous non-local label, and thus, we get the labels clear_bytes, clear_bytes.loop, clear_words and clear_words.loop. I find it makes for cleaner code. What is easier to understand, this?

;********************************************************************
;	Music Synthesizer
;Entry:	$3FF0	Freq delay count
;	$3FF1	Envelope table address
;	$3FF3	Envelope delay count
;	$3FF5	Volume, 1 to 255
; NOTE:	from _TRS_80 Color Computer Assembly Language Programming_,
;	page 252
;********************************************************************

		org	$3F00

mussyn		lda	$FF01		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF01
		lda	$FF03		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF03
		lda	$FF23		; get PIA
		ora	#8		; set 6-bit sound enable
		sta	$FF23
		ldu	#$3FF0		; point to block
		ldx	1,u		; get envelope address
		stx	envptr		; save in envptr
		ldx	3,u		; get envelope delay
mus005		lda	[envptr]	; get value
		beq	mus090		; if 0, done
		ldb	5,u		; get volume
		mul			; adjust volume
		anda	#$FC		; reset RS-232-C (?)
		sta	$FF20		; set on
		ldb	,u		; get frequency delay count
mus010		leax	-1,x		; decrement envelope count
		bne	mus020		; go if not 0
		ldy	envptr		; increment envelope ptr
		leay	1,y
		sty	envptr
		ldx	3,u		; get envelope delay
mus020		decb			; decrement frequency count
		bne	mus010		; go if not 0
		lda	[envptr]	; DUMMY
		brn	*+2		; DUMMY
		ldb	5,u		; DUMMY
		mul			; DUMMY
		clr	$FF20		; set off
		ldb	,u		; get frequency delay
mus030		leax	-1,x		; decrement envelope count
		bne	mus040		; go if not 0
		ldy	envptr		; increment envelope ptr
		leay	1,y
		sty	envptr
		ldx	3,u		; get envelope delay
mus040		decb			; decrement frequency count
		bne	mus030		; go if not 0
		bra	mus005		; keep on playing
mus090		rts
envptr		fdb	0

		end	mussyn

Or this?

;********************************************************************
;	Music Synthesizer
;Entry:	$3FF0	Freq delay count
;	$3FF1	Envelope table address
;	$3FF3	Envelope delay count
;	$3FF5	Volume, 1 to 255
; NOTE:	from _TRS_80 Color Computer Assembly Language Programming_,
;	page 252
;********************************************************************

		org	$3F00

mussyn		lda	$FF01		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF01
		lda	$FF03		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF03
		lda	$FF23		; get PIA
		ora	#8		; set 6-bit sound enable
		sta	$FF23
		ldu	#$3FF0		; point to block
		ldx	1,u		; get envelope address
		stx	.envptr		; save in envptr
		ldx	3,u		; get envelope delay
.next_byte	lda	[.envptr]	; get value
		beq	.exit		; if 0, done
		ldb	5,u		; get volume
		mul			; adjust volume
		anda	#$FC		; reset RS-232-C (?)
		sta	$FF20		; set on
		ldb	,u		; get frequency delay count
.sound_on	leax	-1,x		; decrement envelope count
		bne	.check_freq_on	; go if not 0
		ldy	.envptr		; increment envelope ptr
		leay	1,y
		sty	.envptr
		ldx	3,u		; get envelope delay
.check_freq_on	decb			; decrement frequency count
		bne	.sound_on	; go if not 0
		lda	[.envptr]	; DUMMY
		brn	*+2		; DUMMY
		ldb	5,u		; DUMMY
		mul			; DUMMY
		clr	$FF20		; set off
		ldb	,u		; get frequency delay
.sound_off	leax	-1,x		; decrement envelope count
		bne	.check_freq_off	; go if not 0
		ldy	.envptr		; increment envelope ptr
		leay	1,y
		sty	.envptr
		ldx	3,u		; get envelope delay
.check_freq_off	decb			; decrement frequency count
		bne	.sound_off	; go if not 0
		bra	.next_byte	; keep on playing
.exit		rts
.envptr		fdb	0

		end	mussyn

It helps that I allow 63 characters for a label, which is way more than any 6809 assembler I've ever used.
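The merging of local labels with the previous non-local label, along with the 63-character limit, can be sketched in a few lines of C. This is an illustration only, not the assembler's own code; the name full_label and the convention of returning false on truncation are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LABEL 63

/*
 * Hypothetical helper: build the full name of a label.  A label that
 * starts with '.' is local and gets the most recent non-local label
 * prepended; anything else is used as-is.  The result is truncated
 * to MAX_LABEL characters.  Returns false if truncation happened
 * (the spot where an assembler might issue a warning).
 */
bool full_label(char dest[MAX_LABEL + 1], const char *global, const char *label)
{
    assert(dest != NULL && global != NULL && label != NULL);

    bool   local = (label[0] == '.');
    size_t want  = strlen(label) + (local ? strlen(global) : 0);

    dest[0] = '\0';
    if (local)
        strncat(dest, global, MAX_LABEL);
    strncat(dest, label, MAX_LABEL - strlen(dest));

    return want <= MAX_LABEL;
}
```

Given a non-local label of clear_bytes, the local label .loop becomes clear_bytes.loop, while a non-local label such as clear_words passes through untouched and would become the new prefix for whatever local labels follow it.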

The last feature I have is warnings. Given the following code:

.start		lda	<<b16,x
		ldb	#$FF12
		std	foobar
		lda	b5,u
		ldb	b8,s
		tfr	a,x
		lbsr	a_really_long_label_that_exceeds_the_internal_limit_its_quite_long

		sta	[<<b5,y]
		bra	another_long_label_that_is_good

a_really_long_label_that_exceeds_the_internal_limit_its_quite_long
		rts

another_long_label_that_is_good
		clra
.but_this_makes_it_too_long_to_use
		decb
		bne	.but_this_makes_it_too_long_to_use

		bra	next8
next8		lbra	next1
next16		brn	next8b
next8b		lbrn	next16b
next16b		rts

foobar		equ	$20
b16		equ	$8080
b5		equ	3
b8		equ	25

The assembler will generate the following warnings (yes, this code is used to test all the warnings in the assembler):

warn.asm:1: warning: W0010: missing initial label
warn.asm:6: warning: W0008: ext/tfr mixed sized registers
warn.asm:7: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:12: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:17: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:19: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:1: warning: W0003: 16-bit value truncated to 5 bits
warn.asm:2: warning: W0004: 16-bit value truncated to 8 bits
warn.asm:3: warning: W0005: address could be 8-bits, maybe use '<'?
warn.asm:4: warning: W0006: offset could be 5-bits, maybe use '<<'?
warn.asm:5: warning: W0007: offset could be 8-bits, maybe use '<'?
warn.asm:7: warning: W0009: offset could be 8-bits, maybe use short branch?
warn.asm:9: warning: W0011: 5-bit offset upped to 8 bits for indirect mode
warn.asm:21: warning: W0012: branch to next location, maybe remove?
warn.asm:22: warning: W0012: branch to next location, maybe remove?
warn.asm:1: warning: W0002: symbol '.start' defined but not used

So, in order of appearance:

W0010: What happens if you give a local label sans a non-local label? Well, I decided to allow it, but at least warn about it. The resulting label is just .start but it could be hard to reference. I could see making this an error, but for now, it's just a warning.
W0008: This is the only warning about undefined behavior. The 6809 doesn't specify what happens when you transfer (or exchange) an 8-bit register with a 16-bit register (or vice versa). The CPU just keeps running, but the results are just that—undefined. Again, this could be an error, but for now, I'm letting it slide as a warning.
W0001: Internally, the assembler just truncates labels to 63 characters, but otherwise, it just keeps going.
W0003: This is related to the nature of a two-pass assembler and forward references. Here, I'm forcing the given index to a 5-bit index (which doesn't take an additional byte of space, unlike an 8-bit (one additional byte) or a 16-bit (two additional bytes) offset), but the assembler has to assume it's okay on pass one. By the time pass two comes around, b16 is defined but its value exceeds 5 bits (which is -16 to 15 for the record). This warning is just letting the user know the value doesn't fit into 5 bits.
W0004: Pretty much the same as W0003 except for an 8-bit value.
W0005: Again, due to the nature of a two-pass assembler. This time, no hint is given to the size of the label, and on pass one, the assembler assumes the worst—a 16-bit value. Only on pass two does it have enough information to know it could be an 8-bit address, but it can't use an 8-bit address as it would throw all the other addresses off (ask me how I know).
W0006: Similar to W0005, but for an offset that can fit in 5-bits.
W0007: Similar to W0006 but for an 8-bit value.
W0009: This time, the assembler has determined that the target instruction falls within the range of an 8-bit relative branch instruction, but was given a 16-bit relative branch instruction. This can happen because of code refactorings that shrink the distance between the branch instruction and the target.
W0011: One of the features of the 6809 is its support of indirect indexing. Instead of the index having the data directly, the index contains the address of the data (in C parlance, LDA ,X is A = *X and LDA [,X] is A = **X). The 6809 doesn't support this mode for 5-bit offsets, but it does for 8-bit and 16-bit offsets. This is just a warning that you can't use a 5-bit offset for this. I'm on the fence about keeping or removing this, and I'm keeping it for now.
W0012: This detects when you branch to the following instruction, except if the instruction is BRN which is “branch never” (or the long branch version LBRN). The 6809 is unique among 8-bit CPUs in having such an instruction. And despite its apparent uselessness (why would you have a branch that is never taken?) it is useful to pad out timing loops when talking to hardware.
W0002: The label wasn't referenced by any other code. And if the label is not referenced, why have the label in the first place? It could also mean an unused variable whose removal could save some space.
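The size-related warnings (W0003 through W0007) all come down to comparing the size picked on pass one with the size the value turns out to need on pass two. Here is a minimal sketch of that comparison in C. It is not the assembler's code; the names offset_bits and check_offset and the -1/0/+1 return convention are assumptions for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: the smallest indexed-mode offset size (in
 * bits) that can hold the given offset.  5-bit offsets cover
 * -16..15 and 8-bit offsets cover -128..127; everything else needs
 * 16 bits.
 */
int offset_bits(long offset)
{
    assert(offset >= -32768 && offset <= 65535);

    if (offset >= -16  && offset <= 15)  return 5;
    if (offset >= -128 && offset <= 127) return 8;
    return 16;
}

/*
 * Pass two: the size was fixed on pass one (chosen_bits) and the
 * real value is finally known.  Returns -1 if the value does not
 * fit in the chosen size (compare W0003 and W0004), +1 if a smaller
 * size would have been enough (compare W0006 and W0007), and 0 if
 * the choice was just right.
 */
int check_offset(int chosen_bits, long offset)
{
    int needed = offset_bits(offset);

    if (needed > chosen_bits) return -1;
    if (needed < chosen_bits) return  1;
    return 0;
}
```

Plugging in the values from the test file: a forced 5-bit offset of $8080 gives -1 (the value gets truncated), while an unhinted, assumed 16-bit offset of 3 or 25 gives +1 (a 5-bit or 8-bit offset would have done).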

As you can see, most of the warnings are about code sequences that could be shorter, and I'm not aware of any assembler that gives such warnings. I could be wrong, but of the 6809 assemblers I've used, I haven't seen anything like this.

I also have a way to suppress a given warning (they're all enabled by default—I'm opinionated about this, and you're stuck with my opinion if you want to use this assembler).

So that's it about the unique features I have in my assembler. I don't expect many people to use this, but I don't care, I'm having fun developing it. And that's what counts.

Monday, November 27, 2023

Unit testing on an 8-bit CPU

I've been using my assembler to write silly little programs for the Color Computer (simulated—easier than setting up my Color Computer, the first computer I owned as a kid). It took me entirely too long to locate a bug in a maze-drawing program I've been writing, and I want to do a deep dive on this.

The program is simple, it just draws a maze, following a few simple rules:

1. pick a random direction (up, right, left or down);
2. attempt to draw a maze segment (using color 1) in the given direction;
3. if we're boxed in (no direction to go in)—
   1. if we're at the starting location, we're done;
   2. backtrack along a segment of the path we've drawn (using color 2);
   3. if we are no longer boxed in, go back to step 1;
   4. otherwise, go back to step 3.
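The rules above can be modelled in C to show the intended behavior. This is a sketch under stated assumptions, not a translation of the 6809 program: the 15-by-15 grid, the cells sitting on even coordinates with one connecting pixel between them, the use of rand(), and every name below are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIZE      15    /* pixels per side; cells sit on even coordinates */
#define EMPTY     0
#define EXPLORE   1     /* color 1: the path currently being drawn */
#define BACKTRACK 2     /* color 2: path we have backed out of */

static unsigned char screen[SIZE][SIZE];
static const int dx[4] = {  0, 1, 0, -1 };
static const int dy[4] = { -1, 0, 1,  0 };

/* Color of a pixel, or -1 if the location is off screen. */
static int pixel(int x, int y)
{
    if (x < 0 || y < 0 || x >= SIZE || y >= SIZE) return -1;
    return screen[y][x];
}

/* Boxed in: no neighbouring cell (two pixels away) is still empty. */
static int boxed_in(int x, int y)
{
    for (int d = 0; d < 4; d++)
        if (pixel(x + 2 * dx[d], y + 2 * dy[d]) == EMPTY) return 0;
    return 1;
}

void draw_maze(void)
{
    int x = 0, y = 0;

    memset(screen, EMPTY, sizeof screen);
    screen[y][x] = EXPLORE;

    for (;;) {
        if (!boxed_in(x, y)) {
            int d = rand() % 4;                                  /* step 1 */
            if (pixel(x + 2 * dx[d], y + 2 * dy[d]) == EMPTY) {  /* step 2 */
                screen[y + dy[d]][x + dx[d]] = EXPLORE;
                x += 2 * dx[d];
                y += 2 * dy[d];
                screen[y][x] = EXPLORE;
            }
            continue;
        }

        /* step 3: boxed in */
        screen[y][x] = BACKTRACK;
        if (x == 0 && y == 0) return;   /* back at the start: done */

        /* follow the one color 1 segment that leads back */
        int d = 0;
        while (d < 4 && pixel(x + dx[d], y + dy[d]) != EXPLORE) d++;
        assert(d < 4);
        screen[y + dy[d]][x + dx[d]] = BACKTRACK;
        x += 2 * dx[d];
        y += 2 * dy[d];
    }
}

/* Number of pixels currently set to the given color. */
int count_pixels(int color)
{
    int n = 0;

    for (int y = 0; y < SIZE; y++)
        for (int x = 0; x < SIZE; x++)
            if (screen[y][x] == color) n++;
    return n;
}
```

Whatever the random sequence, a correct run of this model ends with nothing left in color 1: all 64 cells and the 63 segments joining them have been backtracked over. A run that gets stuck never reaches that state, which makes the pixel counts a handy invariant to check.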

Nothing hard about it, yet the program kept getting stuck.

It starts in the upper left, meanders a bit (in blue), backtracks a bit (in red) until it gets stuck. I would then stare at the code until blood formed on my brow, simplify the code if I could, and try again only to watch it get stuck, again.

The issue is that I have no debugger on this system. I don't have two screens upon which to show debugging output. I have no way of single-stepping through the code. I don't even have a log to go through. Debugging consists of running the code, then thinking real hard. And yes, it was a stupid mistake that took all of two lines of code to fix.

Now, the question I want to ask is—would I have saved time if I did “unit testing?” Not just testing, which I was doing all along, but the typical style of “unit testing” where you test a “unit” of code (getting back to that question again). Looking over the code (and that version has the bug—see if you can find it; you'll also need these two files), the most isolated “unit” is random (the last subroutine in the code).

;***********************************************************************
;	RANDOM		Generate a random number
;Entry:	none
;Exit:	B - random number (1 - 255)
;***********************************************************************

random		ldb	lfsr
		andb	#1
		negb
		andb	#$B4
		stb	,-s		; lsb = -(lfsr & 1) & taps
		ldb	lfsr
		lsrb			; lfsr >>= 1
		eorb	,s+		; lfsr ^=  lsb
		stb	lfsr
		rts

It implements a linear-feedback shift register with a period of 255 (in that it will return each value in the range of 1‥255 once in some randomesque order before repeating, which is Good Enough™ for this code). So what if we want to test that the code is correct?
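For readers who don't speak 6809, the routine above can be modelled in C, following the comments in the listing line for line. The period check alongside it is only a sketch of what a test on a friendlier system might look like; the names lfsr_next and lfsr_period are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TAPS 0xB4

/* One step of the LFSR, mirroring the comments in the 6809 routine. */
uint8_t lfsr_next(uint8_t lfsr)
{
    uint8_t lsb = (uint8_t)(-(lfsr & 1)) & TAPS;  /* lsb = -(lfsr & 1) & taps */

    lfsr >>= 1;                                   /* lfsr >>= 1  */
    lfsr  ^= lsb;                                 /* lfsr ^= lsb */
    return lfsr;
}

/*
 * Starting from seed, count the steps until the state comes back
 * around to the seed.  Returns 0 if zero shows up or a value repeats
 * before that happens.
 */
int lfsr_period(uint8_t seed)
{
    unsigned char seen[256] = { 0 };
    uint8_t       state     = seed;
    int           steps     = 0;

    assert(seed != 0);          /* zero is the degenerate, stuck state */

    do {
        state = lfsr_next(state);
        if (state == 0 || seen[state]) return 0;
        seen[state] = 1;
        steps++;
    } while (state != seed);

    return steps;
}
```

Calling lfsr_period(1) returns 255, and lfsr_next(0) returns 0, which is why the state must never start out as zero.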

And it's here I came to a realization—“unit testing” really depends upon the language and tooling around it. Modern orthodoxy holds “unit testing über alles” and therefore, modern languages and tooling are created to support “unit testing über alles” (and woe betide those who question it). I think my struggle with “unit testing” is that the environments I find myself in don't lend themselves very well to “unit testing.” Even when I worked at The Enterprise, we were using C (C99 at best) and C++ (C++98, maybe C++03?) which take a lot of work upfront to support “unit testing” well, and not much of that work was done, so there was a decided lack of a “unit testing” framework. And here, there's definitely not a “unit testing” framework. Any “unit testing” I have to do involves writing yet more code. So let's write yet more code and test this sucker.

CHROUT		equ	$A002		; BASIC character output routine
lfsr		equ	$F6		; unused location in direct page

		org	$4000

start		ldx	#result_array	; point to memory
		clra			; storing 0 to memory
		clrb			; 256 bytes to clear

.clrarray	sta	,x+		; clear memory
		decb			; decrement count
		bne	.clrarray	; keep going until count = 0

		ldx	#result_array	; point to array
		lda	#255		; cycle length

checkrnd	bsr	random		; get random number in B
		tst	b,x		; have we seen this number?
		bne	.failed		; if so, we have failed
		inc	b,x		; set flag for this number
		deca			; are we done with the cycle
		bne	checkrnd	; if not, keep going

		ldx	#msg.success	; SUCCESS!
		bsr	puts
		rts

	;---------------------------------------------------
	; Store count (register A) and random # (register B) 
	; so we can use PEEK in BASIC to see where we failed
	;---------------------------------------------------

.failed		std	lfsr+1
		ldx	#msg.failed	; failed message
		bsr	puts		; display it
		rts			; return to BASIC

puts.10		jsr	[CHROUT]	; print character
puts		lda	,x+		; get character
		bne	puts.10		; if not NUL, print it
		rts			; return to BASIC

;******************************************************************

random		ldb	lfsr
		andb	#1
		negb
		andb	#$B4
		stb	,-s		; lsb = -(lfsr & 1) & taps
		ldb	lfsr
		lsrb			; lfsr >>= 1
		eorb	,s+		; lfsr ^=  lsb
		stb	lfsr
		rts

;*******************************************************************

msg.success	asciiz	'SUCCESS!\r'
msg.failed	asciiz	'FAILED!\r'
result_array	equ	*
		end	start

The tooling here doesn't support linking 6809 code, and I'd rather not have to keep each routine in its own file since the program is so simple, and editing is easier if everything is in one file (no IDE here—and yes, I have thoughts about testing and IDEs but that's beyond the scope for this post). So I have to copy the routine to the test program.

This was something I kept trying to tell my manager at The Enterprise—the test program itself might be buggy (he personally treated the output as gospel—sigh). And the “unit testing” proponents seem to hem and haw about testing the testing code, implying that one does not simply test the testing code. But if programmers aren't trusted to write code and must test, then why are they trusted to write testing code without testing?

It might seem I digress, but I'm not. There are four bugs in the above test. The code I'm testing, random? It was fine. And I wasn't intending to write a buggy test harness, it just happened as I was writing it. Bug one—I forgot to test that random didn't return 0 (that's a degenerate case with LFSRs). Second bug—I forgot to initialize the LFSR state with a non-zero value, so random would return nothing but zero. The third bug was thinking I had used the wrong condition when branching to the failure case, but no, I had it right the first time (the fact that I changed it, and then changed it back, is the bug). The actual bug that caused this was the fourth bug, but I have to digress a bit to explain it.

The 6809 has an extensive indexing addressing mode for an 8-bit CPU. One of the modes allows one to use an accumulator register (A, B or D) as an offset to the index register. I used the B register, which contains the random number, as an offset into a 256-element array to track the return values, thus the use of b,x in the code above. What I forgot about in the moment of writing the code is that the “accumulator,index-register” indexing mode sign extends the accumulator. And the first value from random is, due to the LFSR I'm using, if treated as signed, a negative value—it would fail on the very first attempt.

Sigh.

This is why I panicked and thought I botched the conditional branch.
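A small Python model may make the fourth bug easier to see. This is only a sketch: it mirrors the instructions in random and the effective-address calculation of b,x, and it assumes a seed of 1 and a made-up array address (ARRAY), neither of which comes from the original test program.

```python
TAPS = 0xB4  # tap mask, from the `andb #$B4` instruction


def lfsr_step(state):
    """One step of the 8-bit LFSR, mirroring the `random` routine."""
    lsb = -(state & 1) & TAPS            # andb #1 / negb / andb #$B4
    return ((state >> 1) ^ lsb) & 0xFF   # lsrb, then eorb


def signed8(value):
    """Interpret an 8-bit value as two's complement, as `b,x` does with B."""
    return value - 256 if value & 0x80 else value


def indexed_address(x, b):
    """Effective address of `b,x`: X plus the sign-extended B (16 bits)."""
    return (x + signed8(b)) & 0xFFFF


# Walk the sequence from a seed of 1 (an assumed starting value).
seen = []
state = 1
for _ in range(255):
    state = lfsr_step(state)
    seen.append(state)

ARRAY = 0x4034  # hypothetical base address of the 256-byte result array
first = seen[0]
print(f"first value: ${first:02X} (signed {signed8(first)})")
print(f"b,x with X = array       -> ${indexed_address(ARRAY, first):04X}")
print(f"b,x with X = array + 128 -> ${indexed_address(ARRAY + 128, first):04X}")
```

With X pointing at the start of the array, the first value lands below the array instead of inside it. Biasing X by 128 keeps every possible B inside the 256 bytes, which is one way around the sign extension.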

Now, all of that just to test the most isolated of subroutines in the program.

But had I continued, would any form of “unit testing” have been beneficial? There's the subroutine point_addr—which converts an X,Y position into a byte address in the frame buffer, and the pixel in said byte. I could have done an exhaustive test of all 4,096 points, but again, that's code I would have to write (in 6809 Assembly code) and unfortunately, test, to have any confidence in it. And working up the chain, there's getpixel and setpixel. Testing those would require a bit of thought—let's see … getpixel returns the color of the pixel at the given X,Y location on the screen, and assuming point_addr is working, it would only take four tests (one per pixel in the byte) but at this point, would I even trust myself to write the test code?

In fact, would “unit testing” have saved me any time?

Given that I would have to write the testing framework, no, I don't think I would have saved time. Perhaps if I thought the issue through before diving into changing the code, I would have solved this earlier.

And the clues were there. I did discover pretty early on that the bug was in the backtracking code. The top level code is pretty easy to visually inspect:

backtrack	lda	#BACKTRACK
		sta	color

.loop		ldd	xpos		; check to see if we're back
		cmpd	xstart		; at the starting point,
		beq	done		; and if so, we're done

		ldd	xpos		; can we backtrack NORTH?
		decb
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.check_east
		lbsr	move_north.now	; if so, move NORTH and see if
		bra	.probe		; we have to keep backtracking

.check_east	ldd	xpos		; east ...
		inca
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.check_west
		lbsr	move_east.now
		bra	.probe

.check_west	ldd	xpos		; yada yada ...
		deca
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.check_south
		lbsr	move_west.now
		bra	.probe

.check_south	ldd	xpos
		incb
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.probe
		lbsr	move_south.now

.probe		bsr	boxed_in	; can we stop backtracking?
		bne	explore		; if so, go back to exploring
		bra	.loop		; else backtrack some more

The thing to keep in mind here is that the D register is a 16-bit register where the upper 8 bits are the A register, and the lower 8 bits are the B register, and that the 6809 is big-endian. So when we do ldd xpos we are loading the A register with the X coordinate, and B with the Y coordinate. And the move_*.now subroutines work, or else we wouldn't get the starting maze segments at all. So it's clear that setpixel works fine.
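As a rough illustration of the register layout, here is a Python sketch of what ldd xpos does and how the four probes adjust the coordinates. The address XPOS and the sample coordinates are made up, and the sketch assumes the Y coordinate is stored in the byte right after xpos.

```python
def ldd(memory, address):
    """Model `ldd address`: the first byte goes to A, the second to B."""
    a = memory[address]        # big-endian: high half of D
    b = memory[address + 1]    # low half of D
    return a, b


def d_register(a, b):
    """Combine A and B into the 16-bit D register."""
    return (a << 8) | b


XPOS = 0x0080                  # hypothetical address of xpos
memory = bytearray(0x100)
memory[XPOS] = 5               # X coordinate
memory[XPOS + 1] = 9           # Y coordinate (assumed to follow xpos)

a, b = ldd(memory, XPOS)
neighbours = {
    "north": (a, (b - 1) & 0xFF),   # decb
    "east":  ((a + 1) & 0xFF, b),   # inca
    "west":  ((a - 1) & 0xFF, b),   # deca
    "south": (a, (b + 1) & 0xFF),   # incb
}
print(f"D=${d_register(a, b):04X}", neighbours)
```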

The code is getting stuck trying to backtrack along already drawn segments, and it does that by calling getpixel, therefore, it seems prudent to check getpixel. And sure enough, that is where the bug resides.

;*************************************************************************
;	GETPIXEL	Get the color of a given pixel
;Entry:	A - x pos
;	B - y pos
;Exit:	X - video address
;	A - 0
;	B - color
;*************************************************************************

getpixel	bsr	point_addr	; get video address
		comb			; reverse mask (since we're reading
		stb	,-s		; the screen, not writing it)
		ldb	,x		; get video data
		andb	,s+		; mask off the pixel
.rotate		lsrb			; shift color bits
		deca
		bne	.rotate
.done		rts			; return color in B

;*************************************************************************
;	POINT_ADDR		calculate the address of a pixel
;Entry:	A - xpos
;	B - ypos
;Exit:	X - video address
;	A - shift value
;	B - mask
;*************************************************************************


point_addr.bits	fcb	%00111111,%11001111,%11110011,%11111100 ; masks
		fcb	6,4,2,0	; bit shift counts

I've included a bit of point_addr to give some context. point_addr returns the number of shifts required to move the color value into place, and one of those shift values is 0. But getpixel doesn't check to see if it's 0 before decrementing it. And thus, getpixel will return the wrong value for the last pixel in any byte. The fix is simple:

		tsta
		beq	.done

just before the .rotate label fixes the bug (two instructions, three bytes and here's a link to the fixed code).
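The effect of the missing check can be modelled in a few lines of Python. This is a sketch rather than a translation of the real routine: it takes the screen byte and the pixel index (0 to 3) directly instead of going through point_addr, and the `fixed` flag is just a convenience for comparing the two versions.

```python
MASKS = [0b00111111, 0b11001111, 0b11110011, 0b11111100]  # point_addr.bits
SHIFTS = [6, 4, 2, 0]                                     # bit shift counts


def getpixel(byte, pixel, fixed):
    """Model getpixel's mask-and-shift for one pixel of a screen byte."""
    a = SHIFTS[pixel]            # A - shift value from point_addr
    b = ~MASKS[pixel] & 0xFF     # comb: reverse the mask
    b &= byte                    # andb: mask off the pixel
    if fixed and a == 0:         # tsta / beq .done
        return b
    while True:                  # .rotate
        b >>= 1                  # lsrb
        a = (a - 1) & 0xFF       # deca (0 wraps around to 255)
        if a == 0:               # bne .rotate
            return b


for fixed in (False, True):
    colors = [getpixel(0b11_11_11_11, pixel, fixed) for pixel in range(4)]
    print("fixed" if fixed else "buggy", colors)
```

Without the check, a shift count of 0 wraps to 255 on the first DECA, so the loop runs 256 times and shifts every bit out of B. The last pixel in each byte then always reads back as color 0.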

Yes, I freely admit that a “unit test” of this subroutine would have shown the bug. But how much time would I have spent writing the test code to begin with? The only reason it took me as long as it did to find was because the reference code I was using was quite convoluted, and I spent time simplifying the code as I went along (which is worthy of doing anyway).

What I wish the “unit testing” proponents would realize is that easy testing depends upon the language and tooling involved in the project, and what a “unit” is truly depends upon the language. I suspect that “unit test” proponents also find “unit testing” easier to deal with than “integration testing” or even “end-to-end testing,” thus why we get “unit tests über alles” shouted from the roof tops.

A hard DNS problem

While I'm in a testing mood, I came across this post:

The job was for a position involving the day-to-day operation of DNS servers and firewalls involving DNS and client systems involving DHCP and DNS systems, and was a senior role so the ideal candidate would probably be able to say a little more than "magic" as to how DNS works.

…

A hard DNS problem: the owner of the zonefile added a new record that you requested. The serial number of the zone did increment (and there is no funny invalid wrap-around of the serial number going on (bonus points for knowing that)). The new record is not visible. What went wrong? This is probably too hard a question for a job interview, though you might explain all this and then ask how they would go about debugging the problem.

Magic

The first paragraph sets up the context, and at the end, presents a DNS problem. I worked with DNS before, and this doesn't seem that hard a question.

So, I've noticed an issue with a record I wanted added to a zone file, say a TXT RR for foo.example.com that reads “I have a red pencil.” I'm also assuming I've done a check from an outside network and didn't see the record, and that looking up the SOA RR (also from an outside network) showed the new serial number. My first step would be to query the authoritative name servers (typically two to four, could be more) and see if the record is there. If the record does show up, then it's a propagation issue, maybe related to caching or TTL issues. If not, then my guess is that the owner of the zonefile messed up adding the record, possibly by adding a spurious “.” to the end of the domain name—the owner effectively added

foo.	IN	TXT	"I have a red pencil."

This might not even show up as an error in any log files, but this is as if the record was never added. But if the record was indeed added correctly, as in:

foo	IN	TXT	"I have a red pencil."

Then the next place to look is at how the change was made—did the owner update the record in some editor that then failed to properly update DNS? That would be the next place to look, and the results from that should indicate where to look next.
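The trailing-dot mistake is easy to show with a sketch of how an owner name in a zone file gets expanded. This is a simplified model of the usual master-file rule (a name ending in a dot is absolute, anything else has the origin appended); the function name qualify is made up for illustration.

```python
def qualify(owner, origin):
    """Expand a zone-file owner name relative to the zone origin."""
    if owner == "@":
        return origin               # "@" stands for the origin itself
    if owner.endswith("."):
        return owner                # trailing dot: already absolute
    return f"{owner}.{origin}"      # relative name: origin is appended


ORIGIN = "example.com."
for owner in ("foo", "foo."):
    print(f"{owner!r:7} -> {qualify(owner, ORIGIN)}")
```

The version with the spurious dot ends up as the name foo. at the top of the DNS tree, which is outside the zone, so a query for foo.example.com never finds it.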

Why yes, I have done my fair share of DNS troubleshooting … why do you ask?

Update on Tuesday, November 28^th, 2023

Some more information came back.

Tuesday, November 28, 2023

Still a hard DNS problem

The zone file is entirely correct as far as syntax goes and was updated with the new record without error. The new record does not appear in queries about it, but does appear in the new zone file even on the secondary servers.

Re: A hard DNS problem

Ah, there's more information about the problem. I did mention that “[i]f the record does show up, then it's a propagation issue, maybe related to caching or TTL issues.” But to be fair, there could be a few other issues. I don't think it's an issue of the zone file was updated but the DNS servers weren't restarted—I don't get that from the wording, and there's a quick test for that anyway—check the serial number by requesting the SOA RR.
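When comparing serial numbers from SOA queries, the wrap-around mentioned in the quoted question matters, since zone serials are compared using the serial number arithmetic of RFC 1982 and not as plain integers. A sketch of that comparison, with made-up example serials:

```python
SERIAL_BITS = 32
HALF = 1 << (SERIAL_BITS - 1)


def serial_gt(s1, s2):
    """True if serial s1 is newer than s2 under RFC 1982 arithmetic."""
    return (s1 < s2 and s2 - s1 > HALF) or (s1 > s2 and s1 - s2 < HALF)


print(serial_gt(2023112802, 2023112801))   # an ordinary increment
print(serial_gt(5, 0xFFFFFFF0))            # newer, despite having wrapped
print(serial_gt(0x80000000, 0))            # exactly half the range apart
```

Two serials exactly 2^31 apart have no defined ordering, which is why the last comparison is false in both directions.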

Another issue to check is what the root DNS servers think the authoritative DNS servers for the zone are. A quick check of whois could provide that information, or even a query outside the network for the NS RR for the domain. If they don't match the expected list of DNS servers, then either the domain expired, was transferred, or someone else in the organization updated the NS records for the domain.

But if the NS RRs are correct, and I can see the proper serial number from an SOA query from outside the network, but not the new record … I don't know. I might try to use a few different locations outside the network to do queries from, just to make sure it's not the DNS server I'm using for queries, but if they all exhibit the behavior … I doubt it'll be an unsupported RR type, perhaps something to do with DNSSEC? Which is beyond my paygrade …

I would like to know what the actual issue is—I can see it either being something very trivial and I'll kick myself for not seeing, or it's something that I've not had experience dealing with at all.

Wednesday, November 29, 2023

Yes, that is a hard DNS problem

… Instead of starting with a new zone and copying over some necessary entries from the old zone (what I would have done), someone had simply(?) aliased the new zone over to the old one. Then, when it eventually became necessary to change the new zone (these things take time, and memories can become lost, like rings at the bottom of a river) the records would not take as the whole zone was still aliased to the old one.

I cannot reproduce the (reported) issue with nsd, as nsd fails the zone with a "DNAME at foo.example.org. has data below it" error. However, they were not using nsd; probably their name server allowed a mix of DNAME and thus shadowed-by-the-alias records …

The Long Tail of DNS Record Types

As I last wrote, “I can see it either being something very trivial and I'll kick myself for not seeing, or it's something that I've not had experience dealing with at all,” and it does appear to be something I've not had experience dealing with at all.

The DNAME RR redirects an entire subtree of the domain name space to another domain, mainly for address-to-name mappings, but also for aliases. I recall doing a form of name delegation using a non-kosher method back in the late 1990s and early 2000s (back when I was wearing a “sysadmin” hat) involving NS RRs but not with DNAME. DNAME didn't exist when I started with delegations, thus, no experience with it.
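A rough sketch of why records under a DNAME never show up: per RFC 6672, a query for any name below the DNAME owner has the owner suffix replaced with the target before resolution continues. The function below models only that substitution (no case folding, no length checks), and the target bar.example.org. is invented for the example.

```python
def dname_rewrite(qname, owner, target):
    """Apply a DNAME at `owner` pointing to `target` to a query name.

    Returns the rewritten name, or None if the DNAME does not apply
    (the owner name itself and unrelated names are left alone).
    """
    suffix = "." + owner
    if not qname.endswith(suffix):
        return None
    prefix = qname[: -len(suffix)]
    return f"{prefix}.{target}"


OWNER = "foo.example.org."
TARGET = "bar.example.org."    # hypothetical old zone
print(dname_rewrite("new.foo.example.org.", OWNER, TARGET))
print(dname_rewrite("foo.example.org.", OWNER, TARGET))
```

A record added at new.foo.example.org. sits in the zone file, but queries for it are answered from new.bar.example.org. instead, which matches the reported symptom.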

And yes, that would be a hard DNS problem if you never encountered it before.

Unit testing from inside an assembler

Plug plug: I've written an assembler[0] for the 6502 (with full LSP and debugging support). It also supports the concept of unit tests whereby your program gets assembled and every test individually gets assembled and run, whereby you can add certain asserts to check for CPU register states and things like that.

[0] See https://mos.datatra.sh/guide/unit-testing.html

Plug plug: I've written an assembler[0] for the 6502 (with full LSP and debuggin... | Hacker News

This comment (from the Orange Site about a previous post) grabbed my attention. I'm fascinated by the feature, and I think that's because the test is run in the assembler! (As a side note—I think they missed an opportunity by not using TRON to enable tracing.) I'm thinking I might try to add a feature to my assembler, as I've already written a 6809 emulator as a library.

If I already had this feature (and riffing off the sample), how might this look? What are some of the issues that might come up? I marked up the random function as I might have done during testing:

;***********************************************************************
;	RANDOM		Generate a random number
;Entry:	none
;Exit:	B - random number (1 - 255)
;***********************************************************************

random		ldb	lfsr
		andb	#1
		negb
		andb	#$B4
		stb	,-s		; lsb = -(lfsr & 1) & taps
		ldb	lfsr
		lsrb			; lfsr >>= 1
		eorb	,s+		; lfsr ^=  lsb
		stb	lfsr
		rts

	; --------------------

	.test	"random"
	.tron
		ldx	#.result_array + 128
	.troff
		lda	#1
		sta	lfsr
		lda	#255
.loop		bsr	random
	.assert	cpu.B <> 0 , "degenerate LFSR"
	.tron
		tst	b,x
	.troff
	.assert	cpu.CC.z <> 1
		inc	b,x
		deca
		bne	.loop
		rts
.result_array	rmb	256

	.endtest

First off, I would have the tracing always print results—that way I can follow the flow to help see the issue. One open question—would that be a command line option? Or as I have it here—a pseudo operation? Second, how would I return from the code? The sample I'm going off uses BRK (the 6502 software interrupt instruction). I suppose I could use SWI but I would also want to fill unused memory with that instruction in case the code goes off into the weeds, so I would need a way to detect the difference. I don't want to just use .endtest to end the code sequence, as I might also want to include variables, like I did here.

Another example, this time the function that had the bug in it:

;*************************************************************************
;	GETPIXEL	Get the color of a given pixel
;Entry:	A - x pos
;	B - y pos
;Exit:	X - video address
;	A - 0
;	B - color
;*************************************************************************

getpixel	bsr	point_addr	; get video address
	.tron
		comb			; reverse mask (since we're reading
		stb	,-s		; the screen, not writing it)
		ldb	,x		; get video data
		andb	,s+		; mask off the pixel
		tsta			; any shift?
		beq	.done
.rotate		lsrb			; shift color bits
		deca
		bne	.rotate
	.troff
.done		rts			; return color in B

	.test	"getpixel"
		ldd	#.screen
		std	ECB.beggrp
		lda	#0		; X
		ldb	#0		; Y
		bsr	getpixel
	.assert cpu.X = #.screen
	.assert	cpu.B = 3
		lda	#1
		ldb	#0
		bsr	getpixel
	.assert cpu.X = #.screen
	.assert	cpu.B = 3
		lda	#2
		ldb	#0
		bsr	getpixel
	.assert	cpu.X = #.screen
	.assert	cpu.B = 3
		lda	#3
		ldb	#0
		bsr	getpixel
	.assert cpu.X = #.screen
	.assert	cpu.B = 3
		rts
.screen		fcb	%11_11_11_11	; our four pixels
	.endtest

More questions: should I be able to trace non-test code? Probably, as that could help with debugging issues. Also, the function being tested is calling another function which just happens to be a forward reference, which tells me that calling the tests should happen on pass two of the assembler. And that brings up further questions—what about code like this?

INTCNV		equ	$B3ED
GIVABF		equ	$B4F4

		org	$7000
checksum	jsr	INTCNV		; get parameter from BASIC
		tfr	d,y		; it should point to a string variable
		ldx	2,y		; get address
		lda	,y		; get length
		clrb			; clear checksum and Carry bit		
.sum		adcb	,x+		; add
		deca
		bne	.sum
		comb			; 1s complement
		clra			; return 0-255 result
		jmp	GIVABF		; return result to BASIC

	.test	"checksum"
		ldd	#.tmpstr	; our "string"
		jsr	GIVABF		; give address to BASIC
		bsr	checksum
		jsr	INTCNV		; get our result from BASIC
	.assert	cpu.D = 138		; if I did my math right
		rts

.tmpstr		fcb	5
		fcb	0
		fdb	.text
		fcb	0
.text		fcc	/HELLO/
	.endtest

The two routines INTCNV and GIVABF are ROM routines (from the Color Computer BASIC system) so we don't have the code for the emulator, and therefore, this code can't be tested as is. I suppose it could be rewritten such that it can be tested (and use more memory, which could be an issue) but this does show the limitation of this technique.
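The arithmetic inside checksum can at least be checked off the target. Here is a Python sketch of just the summing loop, leaving out the BASIC ROM calls. It does not model a zero-length string, which the assembly version would handle differently.

```python
def checksum(data):
    """Model the checksum loop: 8-bit add-with-carry, then complement."""
    b = 0
    carry = 0                       # clrb clears both B and the carry flag
    for byte in data:
        total = b + byte + carry    # adcb ,x+
        b = total & 0xFF
        carry = total >> 8          # deca and bne leave the carry alone
    return ~b & 0xFF                # comb


print(checksum(b"HELLO"))
```

Because ADCB folds the carry from one addition into the next, the model gives 138 for HELLO, where a plain sum modulo 256 followed by the complement would give 139.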

I suppose one fix would be conditional assembly:

	.iftest
INTCNV		ldd	.value
		rts
.value		fdb	0
GIVABF		std	INTCNV.value
		rts
	.else
INTCNV		equ	$B3ED
GIVABF		equ	$B4F4
	.endif

but personally, I'm not a fan of conditional code, though I shouldn't discount this as a solution.

Another issue is labels. I've been using local labels for the testing code, thinking that there would be a unique non-local label for each test (generated by the assembler) to avoid naming conflicts (naming is hard). I need to think on how I want to handle this.

It's an interesting idea though …

Friday, Debtember 01, 2023

Unit testing from inside an assembler, part II

I started working on unit tests from inside the assembler. I'm not sure how MOS does it (as I don't read Rust) so I'm making this up as I go along. I'm using the following file as a test case for the work:

lfsr		equ	$F6

		org	$4000
start		bsr	random
		rts

the.byte	fcb	$55
the.word	fdb	$AAAA

;***********************************************
;	RANDOM		Generate a random number
;Entry:	none
;Exit:	B - random number (1 - 255)
;***********************************************

random		ldb	lfsr
		andb	#1
		negb
		andb	#$B4
		stb	,-s
		ldb	lfsr
		lsrb
		eorb	,s+
		stb	lfsr
		rts

	.test	"random"
		ldx	#.result_array
		clra
		clrb
.setmem		sta	,x+
		decb
		bne	.setmem
		ldx	#.result_array + 128
		lda	#1
		sta	lfsr
		lda	#255
	.tron
.loop		bsr	random
	.assert	/B <> 0 , "degenerate LFSR"
		tst	b,x
	.assert	/CC.z <> 1 , "non-repeating"
	.troff
		inc	b,x
		deca
		bne	.loop
	.assert @the.byte == $55 && @@the.word == $AAAA , "tis a silly test"
		rts
.result_array	rmb	256

	.endtst

		nop

;***********************************************

		end	start

I've made the “unit test” … thing, a backend (like I have for binary and Color Computer-specific output as backends) because it's less intrusive on the code and I wasn't sure where to assemble the test code (within the memory space of the 6809). By making this a specific backend, it should be apparent that this is not for the final version of the code.

So far, I have it such that all the non-test backends don't see the code at all:

                         | FILE test.asm
                       1 | 
                       2 | lfsr            equ     $F6
                       3 | 
                       4 |                 org     $4000
4000: 8D    04         5 | start           bsr     random
4002: 39               6 |                 rts
                       7 | 
4003: 55               8 | the.byte        fcb     $55
4004: AAAA             9 | the.word        fdb     $AAAA
                      10 | 
                      11 | ;***********************************************
                      12 | ;       RANDOM          Generate a random number
                      13 | ;Entry: none
                      14 | ;Exit:  B - random number (1 - 255)
                      15 | ;***********************************************
                      16 | 
4006: D6    F6        17 | random          ldb     lfsr
4008: C4    01        18 |                 andb    #1
400A: 50              19 |                 negb
400B: C4    B4        20 |                 andb    #$B4
400D: E7    E2        21 |                 stb     ,-s
400F: D6    F6        22 |                 ldb     lfsr
4011: 54              23 |                 lsrb
4012: E8    E0        24 |                 eorb    ,s+
4014: D7    F6        25 |                 stb     lfsr
4016: 39              26 |                 rts
                      27 | 
                      28 |         .test   "random"
                      29 |                 ldx     #.result_array
                      30 |                 clra
                      31 |                 clrb
                      32 | .setmem         sta     ,x+
                      33 |                 decb
                      34 |                 bne     .setmem
                      35 |                 ldx     #.result_array + 128
                      36 |                 lda     #1
                      37 |                 sta     lfsr
                      38 |                 lda     #255
                      39 |         .tron
                      40 | .loop           bsr     random
                      41 |         .assert /B <> 0 , "degenerate LFSR"
                      42 |                 tst     b,x
                      43 |         .assert /CC.z <> 1 , "non-repeating"
                      44 |         .troff
                      45 |                 inc     b,x
                      46 |                 deca
                      47 |                 bne     .loop
                      48 |         .assert @the.byte == $55 && @@the.word == $AAAA , "tis a silly test"
                      49 |                 rts
                      50 | .result_array   rmb     256
                      51 | 
                      52 |         .endtst
                      52 |         .endtst
                      53 | 
4017: 12              54 |                 nop
                      55 | 
                      56 | ;***********************************************
                      57 | 
                      58 |                 end     start

    2 | equate      00F6     3 lfsr
   17 | address     4006     1 random
    5 | address     4000     1 start

Ignore that line 52 shows up twice here—that's a bug that I'll work on (my initial fix removed the duplicate line, but line 51 didn't show up—it's not a show-stopping bug which is why it's going on the “fix it later” list). Also, the labels the.byte and the.word don't show up on the symbol list at the end due to a “feature” where labels that aren't referenced aren't printed (that was to remove unused equates from the symbol list). So for the non-test backends, the actual testcase isn't part of the build.

The other added directives, like .tron, .troff and .assert are also ignored by the other backends if the directives appear outside a “unit test.”

With the .test backend though, all the directives are recognized and most of them work, although I'm still working on .assert (see below).

One issue—when to run the actual tests. Right now, the code is run when the .endtst directive is hit, as running the code as it's assembled won't work well I think, especially with branches and calls to other routines, and it would be a nightmare to get correct. It's easier if all the code exists in “memory,” but one issue I've noticed is that any code further down in the file can't be used. I'll have to move the execution of tests to after the assembly pass is done.

The .tron and .troff directives work, dumping out the instructions between them as the code is run:

... lots of lines cut
PC=402A X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=09 B=D2 CC=-f-i---c | 402A 8D   DA     - BSR   4006               ; ----- backwards 
PC=402C X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=09 B=69 CC=-f-i---- | 402C 6D   85     - TST   B,X                ; -aa0- 411D = 00
PC=402A X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=08 B=69 CC=-f-i---- | 402A 8D   DA     - BSR   4006               ; ----- backwards 
PC=402C X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=08 B=80 CC=-f-in--c | 402C 6D   85     - TST   B,X                ; -aa0- 4034 = 00
... more lines cut

Another issue is dealing with the .assert directive. I have to save the test somehow since the assembler can't do the check when it parses the .assert because not all the code for the test has been assembled yet. I could store the text to the test expression and then evaluate it at run time, but as this code shows, that would mean re-interpreting the text many, many times. No, the solution I came up with is a mini-Forth-like language for evaluating the test expression.

Yup, I'm embedding a mini-Forth interpreter in a 6809 assembler written in C.

A classic blunder I'm sure, like getting involved in a land war in Asia, or going against a Sicilian when death is on the line, but I'm not sure of any other way. The mini-Forth is very small though, only 41 words are defined, but it's enough for my needs. The first .assert expression translates to:

VM_CPUB		( push contents of the B register onto the stack )
VM_LIT 0	( push a literal 0 onto the stack )
VM_NE		( compare the two, leaving a flag on the stack )
VM_EXIT		( exit the VM )

The second one to:

VM_CPUCCz	( push the CC zero flag )
VM_LIT 0	( push a literal 0 )
VM_NE		( compare the two, leave flag on stack )
VM_EXIT		( exit the VM )

And the last one to:

VM_LIT 0x4003	( push the literal 0x4003 )
VM_AT8		( fetch the byte from the 6809 memory buffer )
VM_LIT 0x55	( push the literal 0x55 )
VM_EQ		( compare the two, leave flag on stack )
VM_LIT 0x4004	( push the literal 0x4004 )
VM_AT16		( fetch two bytes from the 6809 memory buffer )
VM_LIT 0xAAAA	( push the literal 0xAAAA )
VM_EQ		( compare the two, leave flag on stack )
VM_LAND		( AND the two results, leaving flag on stack )
VM_EXIT		( exit the VM )

This works, and it was easy to implement the VM. Now all I have to do is parse the expression to assemble the VM code (right now the addresses and VM functions are hard coded into the assembler just to prove it works).
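To give a feel for how little is needed, here is a sketch of such a stack VM in Python (the real one is in C and has more words). Only the word names come from the listings above. The Machine class, the tuple encoding of the program and the way flags are represented are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Just enough emulated 6809 state for the assertions to inspect."""
    b: int = 0
    memory: bytearray = field(default_factory=lambda: bytearray(0x10000))


def run(program, machine):
    """Run a list of (word, operand) pairs; return the final flag."""
    stack = []
    for word, operand in program:
        if word == "VM_LIT":
            stack.append(operand)
        elif word == "VM_CPUB":
            stack.append(machine.b)
        elif word == "VM_AT8":
            addr = stack.pop()
            stack.append(machine.memory[addr])
        elif word == "VM_AT16":
            addr = stack.pop()
            stack.append(machine.memory[addr] << 8 | machine.memory[addr + 1])
        elif word in ("VM_EQ", "VM_NE"):
            right, left = stack.pop(), stack.pop()
            equal = left == right
            stack.append(equal if word == "VM_EQ" else not equal)
        elif word == "VM_LAND":
            right, left = stack.pop(), stack.pop()
            stack.append(bool(left) and bool(right))
        elif word == "VM_EXIT":
            return bool(stack.pop())
        else:
            raise ValueError(f"unknown word {word}")
    raise RuntimeError("program ran off the end without VM_EXIT")


first = [("VM_CPUB", None), ("VM_LIT", 0), ("VM_NE", None), ("VM_EXIT", None)]
last = [
    ("VM_LIT", 0x4003), ("VM_AT8", None), ("VM_LIT", 0x55), ("VM_EQ", None),
    ("VM_LIT", 0x4004), ("VM_AT16", None), ("VM_LIT", 0xAAAA), ("VM_EQ", None),
    ("VM_LAND", None), ("VM_EXIT", None),
]

machine = Machine(b=0x69)
machine.memory[0x4003:0x4006] = bytes([0x55, 0xAA, 0xAA])
print(run(first, machine), run(last, machine))
```

Since the expression is compiled once into a flat list, running it on every pass through the loop costs a handful of stack operations and no re-parsing of text.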

This feature is proving to be an interesting problem.

Monday, Debtember 04, 2023

I've been blogging for 757,382,400 seconds

Wow! It's been a full 365 days since my blog went under the lock! And it's been a full 8,766 days since I started blogging. Here's to another 8,766 days!

The Gopher Situation

Over the past few days, I've been battling a pernicious bug in my gopher server wherein it becomes CPU bound and causes other issues on the server. I then have to go in and kill the gopher process (and one time I couldn't even do that—I had to have the virtual server restarted). I initially attributed this to an over-aggressive bot crawling my site and blocked it with iptables but even that didn't solve the issue.

The problem is—nothing to my knowledge has changed on my virtual server, nor the server that it is running under, nor the network it's on. My Gemini server gets way more traffic than my gopher site and it's fine, and the only difference between the two is—the Gemini server uses TLS but otherwise, is nearly identical to the gopher server.

It's very odd.

The other day I added some code (in a branch, not in the main line version) to log memory usage, number of threads (technically, Lua coroutines), number of running threads, number of waiting threads, and number of active sockets. And since adding that, the gopher server has been running fine, but just now I do see a potential problem—the number of threads is two higher than the number of actual connections, which “shouldn't” happen.

Woot! I now have a lead on the problem!

But I do wonder what recently caused the issue? The code hasn't changed since April, and now I'm wondering if my Gemini server has a similar issue, since the code bases are similar in nature.

Tuesday, Debtember 05, 2023

Notes on an overheard conversation at 4:15 am

SLAM!

“Hey! I heard the screen door slam! Could you check the front porch?”

“Who would be here at … 4:00 am‽”

“Probably an Amazon delivery. I ordered a white noise machine last night.”

“Seriously?”

“Just go look!”

“Hmm … whoa! Lovely! They left the package right on the door step. Sorry about stepping on it. And yes, it's from Amazon.”

“It arrived! My white noise machine!”

“When did you say you ordered it?”

“9:00 pm. They said it would arrive shortly.”

“That's insane!”

“That's Amazon.”

Wednesday, Debtember 06, 2023

Unit testing from inside an assembler, part III

I'm done with the “unit testing” backend for my 6809 assembler. The mini-Forth engine is working out fine, although the number of words increased from 41 to 47 to support some conveniences (like indexing and string comparison). It took some work to support, but the number of assertions one can make in the code is extensive. For example, a test case for this bit of code (which I do need to discuss, but that's a post for another time) looks like this:

test		sts	[$3333,x]
.next		pshs	pc,u,y,x,dp,b,a,cc

	.test	"STS"

		ldx	#.results
		ldy	#test
		jsr	init

	.assert	/x        = .results , "X=results"
	.assert	/y        = .next    , "Y=next"
	.assert	@@/0,x    = .address
	.assert	@@/2,x    = .opcode
	.assert	@@/4,x    = .operand
	.assert	@@/6,x    = .topcode
	.assert	@@/8,x    = .toperand
	.assert @.nowrite = $12         , "overwrite"
	.assert	@/-47,s   = $01		, "stack mod?"
	.assert	.address  = "0800"z     , "hex address"
	.assert	.opcode   = "10EF"z     , "hex opcode"
	.assert	.operand  = "993333"z   , "hex operand"
	.assert .topcode  = "STS"z      , "decoded opcode"
	.assert	.toperand = "[3333,X]"z , "decoded operand"

		rts

.results	fdb	.address
		fdb	.opcode
		fdb	.operand
		fdb	.topcode
		fdb	.toperand
.address	rmb	5
.opcode		rmb	5
.operand	rmb	7
.topcode	rmb	9
.toperand	rmb	19
.nowrite	nop

	.endtst

The code being tested is a 6809 disassembler written in 6809 assembly code (I wrote that a few years back—any testing now is academic at this point). The .TEST directive takes an optional string as the name of the test. If one isn't given, it will use the last non-local label seen in the source code as the name of the test. The first two lines:

	.assert	/x        = .results , "X=results"
	.assert	/y        = .next    , "Y=next"

assert that the X register points to .results and the Y register points to .next. I use the leading slash to denote a register instead of a label. One can use register names for labels and it's mostly unambiguous as the register is typically part of the mnemonic itself. The only exception is for the A, B and D registers, and then, only in the index addressing mode, as you can use the A, B or D register for an offset. But in the context of the .ASSERT directive it makes it easier to parse the intent if I use '/' to designate a register. Each register, and each bit in the condition code register (like /cc.z for the zero-flag) can be used. The bit after the comma, “X=results”, will be printed if the check fails:

test-disasm.asm:7: warning: W0015: STS:13 X=results: test failed:

(there can be text after the “test failed” bit, thus the colon).

The next few lines:

	.assert	@@/0,x    = .address
	.assert	@@/2,x    = .opcode
	.assert	@@/4,x    = .operand
	.assert	@@/6,x    = .topcode
	.assert	@@/8,x    = .toperand

assert the contents of memory pointed to by X. The double “@” fetches 16 bits from the address following, and in the first line, this is the address in the X register. The second line retrieves the 16 bits from the address two bytes past where the X register points to. You could write these lines as:

	.assert	@@(/x + 2) = .opcode

but a little syntactic sugar never hurts, and it mimics the native method of using the index registers. This was possibly the hardest bit of code to write, as the index addressing mode of the 6809, while great from an assembly programmer's perspective, is a nightmare from an assembler-implementer's perspective. Even here, where it's simplified, it was a pain to get right, but I think it was worth it.
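The equivalence between the two spellings can be sketched in Python. The helper names, the register values and the table contents below are invented for illustration; only the big-endian 16-bit fetch and the fill value of 1 follow the 6809 and the description in this post.

```python
def fetch8(memory, address):
    """`@addr`: one byte from emulated memory."""
    return memory[address & 0xFFFF]


def fetch16(memory, address):
    """`@@addr`: two bytes, big-endian, from emulated memory."""
    high = memory[address & 0xFFFF]
    low = memory[(address + 1) & 0xFFFF]
    return high << 8 | low


def indexed(register, offset):
    """`/offset,reg`: register value plus a signed constant offset."""
    return (register + offset) & 0xFFFF


memory = bytearray([1]) * 0x10000        # untouched bytes hold the value 1
x = 0x0900                               # hypothetical value of X
memory[x:x + 4] = bytes([0x09, 0x0A, 0x09, 0x0F])   # two made-up pointers
s = 0x8000                               # hypothetical value of S

print(hex(fetch16(memory, indexed(x, 2))))    # like @@/2,x
print(hex(fetch16(memory, x + 2)))            # like @@(/x + 2)
print(fetch8(memory, indexed(s, -47)))        # like @/-47,s
```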

The next two lines:

	.assert @.nowrite = $12         , "overwrite"
	.assert	@/-47,s   = $01		, "stack mod?"

check that the given addresses, nowrite and a byte down in the system stack, contain certain 8-bit values. Each byte of the memory in the virtual 6809 system is filled with the value 1 (it can be changed on the command line), so here, each untouched byte will contain a 1. I picked that value since it's an illegal opcode, which the emulator will trap.

The final few lines:

	.assert	.address  = "0800"z     , "hex address"
	.assert	.opcode   = "10EF"z     , "hex opcode"
	.assert	.operand  = "993333"z   , "hex operand"
	.assert .topcode  = "STS"z      , "decoded opcode"
	.assert	.toperand = "[3333,X]"z , "decoded operand"

do indeed do a string compare. And therein lies a tale. Again, this is a form of syntactic sugar:

	.assert	@.address=$30 && @(.address+1)=$38 && @(.address+2)=$30 && @(.address+3)=$30 && @(.address+4)=0

This was the second hardest bit to support, is a bit fragile, and, if I'm honest, a hack. The string literal has to be on the right hand side of the conditional, and worse, there's no easy way to enforce this in the assembler (so I currently don't). On top of that, the second string has to be a literal string; you can't compare two different memory regions from the 6809 VM. There's also a limit of only one string literal per .ASSERT directive, again, because supporting more than one would vastly complicate the already somewhat complicated code (this “unit test” backend is already 30% of the entire assembler).

To keep from having to add a ton of code for the conditional checks to support two different primitive types, or to keep from having to create a duplicate set of string conditionals, I cheated (or came up with a brilliant hack—take your pick). The code generated is:

VM_LIT .address
VM_SCMP
VM_EQ
VM_EXIT

That VM_SCMP is hiding things—it knows which string literal to use (as it's part of the VM program and there's only space for one string literal per .ASSERT directive) but it also leaves two values on the stack: -1,0 if the result is less than, 1,0 if the result is greater than, and 0,0 if the result is equal. This way, the conditional operators can work as is.
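The trick is easier to see in a toy model. The following Python sketch is only an illustration of the idea, not the assembler's VM: the run function, the tuple-based program format and passing the literal as an argument are all assumptions. The one thing taken from the description above is that the string compare pushes two values (its result and a 0), so that an ordinary equality operator can finish the job.

```python
def run(program, memory, literal):
    """Run a tiny stack program; 'literal' is the one string the program owns."""
    stack = []
    for op, *args in program:
        if op == "VM_LIT":
            stack.append(args[0])
        elif op == "VM_SCMP":
            addr = stack.pop()
            region = bytes(memory[addr:addr + len(literal)])
            # Push -1, 0 or 1 for less/equal/greater, followed by a 0.
            stack.append((region > literal) - (region < literal))
            stack.append(0)
        elif op == "VM_EQ":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(int(lhs == rhs))
        elif op == "VM_EXIT":
            return bool(stack.pop())
        else:
            raise ValueError(f"unknown op {op!r}")
    raise ValueError("program ended without VM_EXIT")


# The same four-instruction shape as the generated code shown above,
# with 0x0100 standing in for wherever .address happens to live.
PROGRAM = [("VM_LIT", 0x0100), ("VM_SCMP",), ("VM_EQ",), ("VM_EXIT",)]
```

Because the compare leaves “result, 0” behind, the equality operator passes exactly when the result is 0, and the other relational operators would work the same way without knowing anything about strings.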

Oh, those “z”s on the end of each string literal? Well, the assembler supports several methods of storing string data in memory. There's the standard C NUL terminated strings; the OS-9 method of setting bit 7 of the last character of the string, and the sometimes used method where the first character of the string is actually the length. I originally had separate non-standard directives to support these methods, so when I wanted to support string-comparisons, I needed a way to support these methods. Then it hit me—the use of a suffix on the string—“Z” for the NUL terminated one (“Z” stands for “zero”), “H” for the bit 7 set (“H” for “high-bit”) and “C” for counted strings. And if I'm using the suffixes for the “unit test” backend, why not in general? So I replaced the .ASCIIZ and .ASCIIH directives (I was contemplating adding counted strings but I never got around to adding .ASCIIC) with just .ASCII and the use of a suffix (no suffix, string is left as-is).
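As a rough illustration of the three storage methods, here is a small Python sketch of what each suffix does to the bytes. The encode function is hypothetical, and accepting the suffix in either case is an assumption (the examples above use a lowercase “z”); the counted form assumes a single length byte.

```python
def encode(text, suffix=""):
    """Return the bytes a string literal with the given suffix would occupy."""
    data = text.encode("ascii")
    suffix = suffix.upper()
    if suffix == "":
        return data                      # no suffix: string is left as-is
    if suffix == "Z":
        return data + b"\x00"            # NUL terminated
    if suffix == "H":
        if not data:
            raise ValueError("high-bit strings need at least one character")
        return data[:-1] + bytes([data[-1] | 0x80])  # bit 7 set on last char
    if suffix == "C":
        if len(data) > 255:
            raise ValueError("counted strings hold at most 255 characters")
        return bytes([len(data)]) + data  # leading length byte
    raise ValueError(f"unknown suffix {suffix!r}")
```

So "STS"z occupies four bytes ($53 $54 $53 $00), "STS"h three ($53 $54 $D3), and "STS"c four ($03 $53 $54 $53).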

So, back on track. The expressions can get quite involved. Some examples:

 .assert /b             = -(@lfsr & 1) & $B4
 .assert @tvalue        = $10*3+(1<<3)+2*2+(7-5)+1
 .assert @@(tvalue + 1) = $10+3+1<<3+2*2+7-5+1
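For anyone puzzling over the first two expressions, here is how they work out, as a Python sketch. The function names are made up, and reading $B4 as a set of LFSR feedback taps is a guess based on the label name. The third expression is left alone, since its value depends on the assembler's operator precedence, which isn't spelled out here.

```python
def tap_mask(lfsr, taps=0xB4):
    """Return taps when bit 0 of lfsr is set, otherwise 0.

    Negating a 1 gives all bits set, so the AND keeps every tap bit;
    negating a 0 gives 0, so the AND clears everything.
    """
    return -(lfsr & 1) & taps


def expected_tvalue():
    """The fully parenthesized constant expression from the second example."""
    return 0x10 * 3 + (1 << 3) + 2 * 2 + (7 - 5) + 1
```

The first assert therefore expects the B register to hold either $B4 or $00, depending on the low bit of lfsr, and the second expects the byte at tvalue to be 63 ($3F).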

You are also not limited to using the .ASSERT, .TRON and .TROFF directives inside a .TEST directive. You can put them anywhere in the codebase, and if that code is executed as part of a “unit test”, they'll trigger (and if you aren't using the “unit test” backend, they're ignored outright).

There are other changes too: each backend will parse its own command line options, I added some new warnings (such as a warning for self-modifying code), and the memory of the virtual 6809 can have various protections (read-only, write-only, execute-only, trace) set from the command line for further testing.

Now I just need to update the README.txt file and release the code.

Friday, Debtember 08, 2023

The Gopher Situation, part II, Unicode Boogaloo

The lead I thought I had was a red herring. I thought it may have had something to do with reporting errors back to the client as seen from the logs:

Dec 04 21:44:38	daemon	info	71.19.142.20	gopher	maxco=1 runco=0 toq=0 sc=1 mem=3012577
Dec 04 21:46:23	daemon	err	71.19.142.20	gopher	stat("/home/spc/gopher/share/MGLNDD_71.19.142.20_70") = No such file or directory
Dec 04 21:46:23	daemon	info	71.19.142.20	gopher	remote=XXXXXXXXXXXXXXX status=false request="MGLNDD_71.19.142.20_70" bytes=68
Dec 04 21:49:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=2903218
Dec 04 21:50:52	daemon	info	71.19.142.20	gopher	remote=XXXXXXXXXXXXXXX status=true request="CONNECT api64.ipify.org:443 HTTP/1.1" bytes=562
Dec 04 21:54:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3035838

Notice how maxco (total number of coroutines) increments and stays that way after a failed request. And the evidence was pretty convincing too:

Dec 04 22:19:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3411531
Dec 04 22:23:44	daemon	info	71.19.142.20	gopher	remote=XXXXXXXXXXXXXXX status=true request="Phlog:2010/03/08" bytes=189
Dec 04 22:24:38	daemon	info	71.19.142.20	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3185119
Dec 04 22:24:39	daemon	info	71.19.142.20	gopher	remote=XXXXXXXXXXXXXXX status=false request="\3\0\0/*?\0\0\0\0\0Cookie: mstshash=Administr" bytes=82
Dec 04 22:25:57	daemon	info	71.19.142.20	gopher	remote=XXXXXXXXXXXXXXX status=true request="Phlog:2006/11/19.1" bytes=1028
Dec 04 22:29:38	daemon	info	71.19.142.20	gopher	maxco=3 runco=0 toq=0 sc=1 mem=3242133
Dec 04 22:33:31	daemon	info	71.19.142.20	gopher	remote=XXXXXXXXXXXXXXX status=true request="Phlog:" bytes=978
Dec 04 22:34:38	daemon	info	71.19.142.20	gopher	maxco=3 runco=0 toq=0 sc=1 mem=3207881
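Spotting this by eye gets tedious, so here is a minimal Python sketch of the kind of scan involved: it remembers the requests logged between two status lines and reports them whenever maxco has gone up. The suspects function and both regular expressions are assumptions based only on the log lines shown above, not part of the server.

```python
import re

MAXCO = re.compile(r"\bmaxco=(\d+)")
REQUEST = re.compile(r'status=(true|false) request="(.*)" bytes=')


def suspects(lines):
    """Return (old_maxco, new_maxco, requests_in_between) for each increase."""
    previous = None
    seen = []
    found = []
    for line in lines:
        sample = MAXCO.search(line)
        if sample:
            current = int(sample.group(1))
            if previous is not None and current > previous:
                found.append((previous, current, list(seen)))
            previous = current
            seen = []
            continue
        request = REQUEST.search(line)
        if request:
            seen.append((request.group(1), request.group(2)))
    return found
```

Run over the first excerpt, this flags the failed MGLNDD request as the only thing logged before maxco went from 1 to 2. Run over the second, it returns one failed and one successful request, which is a hint that a failed request is not the whole story.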

Error reporting with the gopher protocol is clearly an afterthought. The official RFC has two occurrences of the word “error” in it, and one of them is redundant. I did read somewhere (that's difficult to find now) that perhaps gopher should simply close the connection upon an error instead of sending an “error” to the client, so I thought I would try that. Instead of sending:

3Selector not foundHTfooHTgopher.conman.orgHT70CRLF

I would just close the connection.
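For reference, the line above is an ordinary gopher menu item with the type character “3” (error): a type, a display string, then selector, host and port separated by tabs (HT) and ended with CRLF. A small Python sketch of building such a line follows; the gopher_item helper is hypothetical, not the server's code.

```python
def gopher_item(item_type, display, selector, host, port):
    """Build one gopher menu line: type+display, selector, host, port."""
    if len(item_type) != 1:
        raise ValueError("item type is a single character")
    fields = (display, selector, host, str(port))
    for field in fields:
        # Tabs and line endings are the field separators, so reject them.
        if any(ch in field for ch in "\t\r\n"):
            raise ValueError("fields may not contain tabs or line endings")
    return item_type + "\t".join(fields) + "\r\n"
```

Calling gopher_item("3", "Selector not found", "foo", "gopher.conman.org", 70) produces the line shown above, with real tab and CRLF characters in place of the HT and CRLF placeholders.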

That didn't work. The gopher server was still getting stuck. Attaching gdb to the stuck process didn't show anything, as the Lua executable I was using didn't have debugging symbols. So then I recompiled Lua and the modules used that were written in C to include debugging information and restarted the server, yet again.

So now I think I found the root issue. Attaching gdb this time showed the server was stuck in LPEG. Even better, I could see the text it was trying to parse and well … previously I said, “[t]he code hasn't changed since April.” That's not quite true. The server code hadn't changed since April, but an extension had! Back in late October I modified the code that renders my blog on gopher to use Unicode combining characters to do some typographical tricks, and it seems that the code used to wrap the text just … wasn't up to par (Unicode is hard! Let's go to Mars!).
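To show why combining characters make wrapping tricky, here is a minimal Python sketch of a word wrapper that doesn't count them toward the line width. It is not the Lua/LPEG code in question and doesn't reproduce the hang. It is also a simplification: it only skips characters with a nonzero canonical combining class and ignores wide characters, zero-width joiners and the like.

```python
import unicodedata


def display_width(text):
    """Count characters that take up a column (combining marks don't)."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


def wrap(text, width):
    """Greedy word wrap measured in display columns, not code points."""
    lines = []
    current = []
    current_width = 0
    for word in text.split():
        w = display_width(word)
        if current and current_width + 1 + w > width:
            lines.append(" ".join(current))
            current = []
            current_width = 0
        if current:
            current_width += 1  # the space before this word
        current.append(word)
        current_width += w
    if current:
        lines.append(" ".join(current))
    return lines
```

A word underlined with U+0332 after every letter has twice as many code points as columns, so a wrapper that goes by string length breaks lines far too early (or worse, depending on how it's written).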

I also noticed that my Gemini server had finally crashed—hard. And I changed that too, to use Unicode typographical tricks. So out it comes!

Let's see if this was the problem.

Update on Tuesday, December 19^th, 2023

It totally was the problem.

Monday, Debtember 11, 2023

Some thoughts on unit testing from inside an assembler

I've been writing some new 6809 assembly code as well as going back to some existing projects, trying out the “unit test” feature from my 6809 assembler. I will admit that running “unit tests” from the assembler is wonderful! It cuts debugging time since the feedback loop goes from “edit code, assemble, load into emulator, run, edit code” to “edit code, assemble, edit code,” which makes it more likely I'll use the feature. Also nice is that when I'm done with the testing, I change the backend and the testing code is no longer part of the program. Yes, the tests still reside in the source code, but they're ignored if not required.

The issue I've always had with “testing über alles” is that it doesn't take the language or tooling into account, and it's the tooling and language support that can make or break “unit tests” (whatever a “unit test” is). Personally, I like it that I can write tests near the code to be tested and have the assembler run them for me (and it seems like the only modern language to get this right is Rust). Having “unit tests” in a separate file, or having to go through several hoops to run the tests is, for me, just too much friction to use unless forced.

As an aside, I'm amazed that IDEs haven't made writing “unit tests” easier, or just write them entirely as they already have information about each function—what they take and what they return. I mean, they already support refactoring, how hard can it be to support automatic “unit tests?” Or is this a thing I'm missing out on because I don't use IDEs?

Tuesday, Debtember 19, 2023

The Gopher Situation, part III, The Search For Uptime

It's been over two weeks and the gopher server has been up and running for all that time. Yup, it was Unicode. Or rather, my inability to wrap Unicode properly.

A bit of background on compilers exploiting signed overflow

Why do compilers even bother with exploiting undefinedness signed overflow? And what are those mysterious cases where it helps?

A lot of people (myself included) are against transforms that aggressively exploit undefined behavior, but I think it's useful to know what compiler writers are accomplishing by this.

TL;DR: C doesn't work very well if int!=register width, but (for backwards compat) int is 32-bit on all major 64-bit targets, and this causes quite hairy problems for code generation and optimization in some fairly common cases. The signed overflow UB exploitation is an attempt to work around this.

Via Comment on “Bug in my code from compiler optimization [video] | Hacker News”, A bit of background on compilers exploiting signed overflow

A cautionary tale about compiler writers exploiting undefined behavior. I don't have much to add here, other than to spread a bit of awareness of why this happens.
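For the curious, the TL;DR can be illustrated with a little arithmetic. The Python sketch below simulates a 32-bit int array index on a 64-bit machine. The function names and the 4-byte element size are made up for the example, and this is a simplified picture of the issue, not a description of what any particular compiler does.

```python
INT_MAX = 2**31 - 1


def wrap32(value):
    """Reduce a value to signed 32-bit two's complement."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def next_address_wrapping(base, i, size=4):
    """Address of a[i + 1] if the 32-bit index wraps, then is sign-extended."""
    return base + wrap32(i + 1) * size


def next_address_wide(base, i, size=4):
    """Address of a[i + 1] if the index just lives in a 64-bit register."""
    return base + (i + 1) * size
```

The two agree for every index except INT_MAX, where the wrapping version lands far below the array. If overflow has to wrap, the compiler must preserve that one case (typically by sign-extending the index each time around the loop); if overflow is undefined, it can assume the case never happens and use the wide version.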

Timing code from inside an assembler

Back in March, I wrote about some 6809 optimizations where I counted CPU cycles by hand. I came across that code the other day and thought to myself, my 6809 emulator counts cycles, and I've embedded it into my 6809 assembler—how hard could it be to time code in addition to testing it?

Turns out—not terribly hard. I added an option to the .TRON directive to count cycles instead of printing code execution and have the .TROFF directive print the cycle count (indirectly, since the code isn't run until the end of the second pass of the assembler). Then I wrote up a few tests:

	.test	"ROM-RAMx1-byte"
		ldx	#$8000
	.tron	timing
r2r1		sta	$FFDE
		lda	,x
		sta	$FFDF
		sta	,x+
		cmpx	#$FF00
		bne	r2r1
	.troff
		rts
	.endtst

;*****************************************************************

	.test	"ROM-RAMx2-byte"
		ldx	#$8000
	.tron	timing
r2r2		sta	$FFDE
		ldd	,x
		sta	$FFDF
		std	,x++
		cmpx	#$FF00
		bne	r2r2
	.troff
		rts
	.endtst

;*****************************************************************

	.test	"ROM-RAMx4-byte"
		ldx	#$8000
	.tron	timing
r2r4		sta	$FFDE
		ldd	,x
		ldu	2,x
		sta	$FFDF
		std	,x++
		stu	,x++
		cmpx	#$FF00
		bne	r2r4
	.troff
		rts
	.endtst

;*****************************************************************

	.test	"ROM-RAMx8-byte"
savesp		equ	$0100
		orcc	#$50
		sts	savesp
		lds	#$FF00 - 8
	.tron	timing
r2r8		sta	$FFDE
		puls	u,x,y,d
		sta	$FFDF
		pshs	u,x,y,d
		leas	-8,s
		cmps	#$8000 - 8
		bne	r2r8
	.troff
		lds	savesp
		andcc	#$AF
		rts

	.endtst

And upon running it:

GenericUnixPrompt% a09 -ftest r2r.asm
ROM-RAMx1-byte:13: cycles=877824
ROM-RAMx2-byte:28: cycles=487680
ROM-RAMx4-byte:45: cycles=357632
ROM-RAMx8-byte:64: cycles=199136

The results match what I calculated by hand, so that's good. It also found a bug in the emulator—I had the wrong cycle count for one of the instructions. It's a bit scary how easy it has become to test 6809 assembly code now that I can do much of it when assembling the code.
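As a sanity check on those totals, a few lines of Python recover the cost of one trip around each loop. The only inputs are the numbers printed above and the size of the region being copied ($8000 up to $FF00); the per_pass function and the RESULTS table are just scaffolding for the arithmetic.

```python
REGION = 0xFF00 - 0x8000  # bytes covered by each loop (32512)

# bytes handled per pass -> total cycles reported by the assembler
RESULTS = {1: 877824, 2: 487680, 4: 357632, 8: 199136}


def per_pass(bytes_per_pass, total_cycles, region=REGION):
    """Return (number of passes, cycles per pass) for one test."""
    passes, leftover = divmod(region, bytes_per_pass)
    if leftover:
        raise ValueError("region is not a whole number of passes")
    cycles, leftover = divmod(total_cycles, passes)
    if leftover:
        raise ValueError("total is not a whole number of cycles per pass")
    return passes, cycles


if __name__ == "__main__":
    for size, total in RESULTS.items():
        passes, cycles = per_pass(size, total)
        print(f"x{size}: {passes} passes, {cycles} cycles each, "
              f"{cycles / size} cycles per byte")
```

Every total divides evenly: 27, 30, 44 and 49 cycles per pass, or 27, 15, 11 and 6.125 cycles per byte, so the eight-byte stack version moves data more than four times faster than the byte-at-a-time loop.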
