Saturday, January 01, 2022
Ramming in the New Year
HAPPY NEW YEAR!
January 1st is the other fireworks happy day, but this year it's been relatively quiet. There were some fireworks, but they were far off in the distance and not across the street.
There was, however, something new—a neighbor (and not the one who usually does fireworks) bringing in the New Year with a ram's horn.
Monday, January 03, 2022
Back to the saltmine, where all my passwords have expired
It's back to the saltmine (and in this case, “saltmine” is the name of the Corporate issued laptop, not to be confused with the Corporate Overlords' managed laptop, named “Satan”). I check my email only to find half a dozen emails from last week (which nearly everyone at The Corporation had off, including me) saying my password for the Corporate network was expiring and I should change it. I also found half a dozen emails from last week (which nearly everyone at The Overlords' Corporation had off, including me) saying my password for the Corporate Overlords' network was expiring and I should change it (yes, there are two different networks for hysterical reasons). And of course, the two different networks have different password rotation lengths that are timed such that they both expire during vacations. And yet, no matter how many times I point out NIST Special Publication 800-63b, section 5.1.1.2, which states: “Verifiers SHOULD NOT require memorized secrets to be changed arbitrarily (e.g., periodically),” these stupid password expirations keep happening. I guess I'll have to wait for another few CSOs to rotate through the office before we can finally stop the “Password Changing Dance.”
Sigh.
“What idiot did this? Oh … said idiot is me.”
Not only is “Project: Bradenburg” being built from git, but it seems operations has finally gotten “Project: Seymore” build servers working with git.
I decided to check it out and … what the … ?
What did they do to the code?
Submodules?
Outrageous!
I poked a bit deeper, and it turns out, yeah, that idiot was me! I had completely forgotten the eldritch horrors I unleashed nearly two years ago. Oh, is 2022 going to be a fun year …
Tuesday, January 04, 2022
Fixing bugs by cleaning code
I figured out an approach to “Project: Bradenburg.” I'm dumping the C++ code as it was never used in production and as someone who is more comfortable with C than C++, I think that's the best choice right now. To that end, I've made a checklist of the items that need addressing right now—and they're all related to cleaning up the code.
I've normalized the whitespace in all the source files. Tab characters were introduced by the previous developer, who used a different tab stop setting than I'm used to, so the code formatting is just way off.
I removed the #pragma declarations and fixed all the resulting warnings. And the warnings I don't necessarily agree with, like -Wformat-truncation, I can suppress in the makefile (well, GNUmakefile if you want to be pedantic):
src/app/stormgr.o : override CFLAGS += -Wno-format-truncation
This disables the warning
src/app/stormgr.c:194:3: note: ‘snprintf’ output between 2 and 4097 bytes into a destination of size 4096
from code like this:
snprintf(filename, sizeof(filename),"%s/", GetGlobalConfig()->spool_dir);
It's nice I get the warning, but in this case, if we get a filename of 4097 characters long we have other issues to worry about.
And that's pretty much the only warning I have to suppress. The other warnings were all issues that needed to be addressed, and in one case, an actual bug was fixed.
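For the cases where truncation would matter, the usual alternative to silencing the warning is to check what snprintf() returns. A minimal sketch, assuming a made-up helper name (build_spool_prefix() is not from the project):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not from the project): build "<dir>/" into dst and
 * report whether the whole string fit. */
static bool build_spool_prefix(char *dst, size_t size, const char *dir)
{
  assert(dst != NULL && size > 0 && dir != NULL);

  /* snprintf() returns the length it wanted to write, excluding the NUL;
   * a value >= size means the output was truncated. */
  int len = snprintf(dst, size, "%s/", dir);
  return len >= 0 && (size_t)len < size;
}
```

As I understand the GCC documentation, the lowest level of -Wformat-truncation is aimed at calls whose return value goes unused, so a check like this may also quiet the warning, though that depends on the compiler version and warning level.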
Right now I'm cleaning up what is, to me, a pointless abstraction dealing with syslog(). The previous developer wrote a wrapper around syslog() called SysLog(). It would be one thing if there was some reason to wrap syslog(), like we would be running on Windows (we won't) or needed to redirect output to a file (we don't). But no, all the wrapper does is:
void SysLog(int priority, const char *fmt, ...)
{
  va_list marker;

  if (fSyslogOpened)
  {
    va_start(marker, fmt);
    vsyslog(priority, fmt, marker);
    va_end(marker);
  }
}
And the funny thing—syslog() already handles the case when openlog() hasn't been called. So there's no reason for this wrapper whatsoever in this code base.
What makes this even more special is how the developer called SysLog():
SysLog(LOG_NOTICE, BRADENBURG5024, dbuf);
And elsewhere in the code in some header file:
#define BRADENBURG5024 "BRADENBURG5024: OUTBIND connection accepted from %s"
I … um … I … erm … yeah. I can make sense of this, in the “we might want to translate the log messages to another language” argument. But we don't sell this product—it's for internal use. And there are other ways to go about doing this rather than separate the format string from its use. Nothing like finding nine instances of this:
SysLog(LOG_EMERG, BRADENBURG0001);
where BRADENBURG0001 is defined as:
#define BRADENBURG0001 "BRADENBURG0001: %s"
For those unwise in the ways of C programming, this is calling a function with effectively a missing parameter and the compiler can't warn about it because the format string (which informs the called function about what parameters to expect) exists in a different file.
So just by cleaning up the code and removing pointless abstractions I'm fixing bugs.
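One way to get the compiler back on your side with a printf-style wrapper is the format attribute, a GCC and Clang extension. A minimal sketch, assuming a hypothetical wrapper named format_log() that writes into a buffer rather than calling syslog(), and an illustrative EXAMPLE0001 define (neither is from the project):

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper, not the project's SysLog(). The attribute is a
 * GCC/Clang extension: argument 3 is a printf-style format string and the
 * values it consumes start at argument 4. */
static int format_log(char *dst, size_t size, const char *fmt, ...)
  __attribute__((format(printf, 3, 4)));

static int format_log(char *dst, size_t size, const char *fmt, ...)
{
  va_list args;
  int     len;

  assert(dst != NULL && size > 0 && fmt != NULL);
  va_start(args, fmt);
  len = vsnprintf(dst, size, fmt, args);
  va_end(args);
  return len;
}

/* Illustrative message in the style of the defines above. */
#define EXAMPLE0001 "EXAMPLE0001: %s"
```

With the attribute in place and -Wall enabled, a call such as format_log(buf, sizeof(buf), EXAMPLE0001) with no string argument should draw a format diagnostic, while format_log(buf, sizeof(buf), EXAMPLE0001, "disk full") compiles cleanly.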
Wednesday, January 05, 2022
Meeting my new manager before training my new manager
I finally met my new manager! It's been … what? 3½ months? … since it was announced. I decided to ask a VP of the Corporate Overlords who was my actual manager, M1 (who was promoted) or M2 (who is to replace the promoted manager). The VP said M2, and that since I have yet to meet him, I should invite him to the next department meeting. Why it should be up to me to invite M2 to our daily meeting and not M1 is apparently beyond my pay grade, but I invited him.
And it turns out the situation is still … complicated. M1 is still my “manager” for some things for the foreseeable future, and M2 is my “manager” for the other things not handled by M1 for the foreseeable future. But overall, the meeting went well. We shall see how things go.
Friday, January 07, 2022
“You can't fix the bug until you have filled out form 27B/6.”
I swear, I just cannot adjust to the new development process. Last month, as I was starting work on “Project: Bradenburg,” I ended up using GCC 11, and just on a whim, I decided to try GCC 11 on “Project: Lumbergh” and that's when I found an issue—“Project: Lumbergh” crashed immediately. It took about ten minutes to track down the issue—undefined behavior! Effectively, “Project: Lumbergh” was taking a pointer to a variable in an inner scope and using it outside said scope (don't worry, there's an example of what I'm talking about below). I duly recorded a bug in Jira, but did not check in the patch as the other developer, CZ, was doing a bunch of work on “Project: Lumbergh” and I wanted to run the bug by him first just so he knew about it. In retrospect, I think I should have just checked in the patch and asked forgiveness, because the ensuing “process” nearly killed me.
First off, the bug basically languished due to the holiday season and everybody taking vacation and what not. When I brought up the bug on Monday, CZ demanded a test case to show it failing. Problem—there was no reason to create a specific test case, because any test would have shown the crash. Second problem—it's undefined behavior—it only shows (as far as I know) when compiled with GCC 11. We don't use GCC 11 in production, and the GCC version we use in production produces code that runs, which is why the bug wasn't found any earlier.
CZ tried using valgrind to locate the issue and couldn't. Of course CZ couldn't—the executable worked. As I tried to state, it only fails when compiled with GCC 11 (which we as a company don't use yet).
I spent the week trying to convince him to just accept the patch because it removed undefined behavior. CZ was adamant in trying to reproduce the issue for himself. Because with our Corporate Overlords, all bugs must be reproducible, must have a test case, and must pass said test case, before they will be patched.
Even in the case of blatant undefined behavior!
ARE YOU XXXXXXX KIDDING ME? IS OUR CODE SO XXXXXXX FRAGILE THAT MOVING THREE VARIABLES DECLARATIONS WILL XXXXXXX BREAK THE PROGRAM? IS THAT WHAT YOU ARE SO XXXXXXX AFRAID OF? XXXXXXXXXXXXXXXXXXXX!
[Calm down! Breathe! In … out … in … out. —Editor]
Sigh.
Eventually, I was given the go-ahead to submit the patch. Here were my revision notes:
Bug fix XXXXXX—remove undefined behavior
This was found using GCC 11, and it's very hard to reproduce, since the code invoked undefined behavior by storing pointers to data in a scope that technically no longer exists when it's referenced.
The reason this is hard to find depends upon how the compiler lays out the stack frame for the given function. It can, for instance, collect all the variables created at all the scopes in the function and set aside space for all of them at once, in which case the code will work without issue. Or it could collect all the variables from all the scopes, and create just enough space for all the variables, allowing the variables from one scope to reuse memory for another scope, in which case, it may work, or not, depending upon intervening scopes. Or, the compiler can create the scoped variables when the code enters the scope, in which case the code may lead to nondeterministic behavior, depending upon the code following the scope.
An example:
struct bar
{
  char b[MAX];
};

struct foo
{
  char        f[MAX];
  struct bar *sub;
};

static int foo(A a, B b, C c)
{
  struct foo f;
  bool       flag;

  flag = to_display(f.f,sizeof(f.f),a);
  if (flag)
  {
    struct bar b;
    b.b = to_other_display(b.b,sizeof(b.b),b,c);
    f.sub = &b;
  }
  else
    f.sub = NULL;

  if (check_this())
  {
    int  x = tointeger(b);
    char buf[MAX];

    to_c_display(buf,sizeof(buf),c);
    syslog(LOG_DEBUG,"check=%d c=%s",x,buf);
  }

  do_something(&f);
  return 0;
}

In this sample function, compiler A may collect all the local variables into a single block at the start of the function:

struct foo f;
bool       flag;
struct bar b;
int        x;
char       buf[MAX];

and even though this provokes undefined behavior, it will still work, and valgrind won't find an issue. Compiler B might create space for all the variables, but could reuse the space for the two sub blocks:

struct foo f;
bool       flag;
union
{
  struct bar b;
  struct
  {
    int  x;
    char buf[MAX];
  };
};

(not valid C syntax, but it should give the intent I'm going for). In this case, the space pointed to by b could be overwritten if check_this() returns true. Compiler C may only create the space as needed, so the resulting assembly code may look something like:

foo             push    rbp
                mov     rbp,rsp
                sub     rsp,sizeof(struct foo) + sizeof(flag) + padding
                …
                cmp     al,[rbp+flag]
                jz      foo_skip
                sub     rsp,sizeof(struct bar) + padding
                …
                lea     rbx,[rbp - f];
                lea     rdx,[rbp - b];
                mov     [rbx + sub],rdx
                …
                add     rsp,sizeof(struct bar) + padding
foo_skip:       …
                call    check_this
                tst     al
                jz      foo_skip2
                sub     rsp,sizeof(int) + sizeof(buf) + padding
                …
                add     rsp,sizeof(int) + sizeof(buf) + padding
foo_skip2:      call    do_something

And in this case, even if check_this() returns false, data in struct bar could be overwritten just with the call to check_this() itself.
This is the very definition of “undefined behavior” and in this case, it might have been hard to find, even with tools like valgrind, since it really depends upon the code generated by the compiler. I think we're lucky in that GCC 11 laid out the code such that it crashed and thus, I was able to isolate the issue long before it becomes important.
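The shape of the fix can be sketched in a few lines. This is not the actual patch; fixed_foo() and sub_length() are hypothetical stand-ins, MAX is given an arbitrary value, and the only point is that the struct bar now lives at function scope, so the pointer stored in f.sub stays valid for the later call:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX 32  /* arbitrary size for the sketch */

struct bar { char b[MAX]; };
struct foo { char f[MAX]; struct bar *sub; };

/* Stand-in for do_something(): reads through f->sub after the if block. */
static size_t sub_length(const struct foo *f)
{
  size_t n = 0;

  if (f->sub != NULL)
    while (f->sub->b[n] != '\0')
      n++;
  return n;
}

static size_t fixed_foo(bool flag, const char *text)
{
  struct foo f;
  struct bar b;   /* declared at function scope, so it outlives the if block */

  assert(text != NULL);
  snprintf(f.f, sizeof(f.f), "%s", "outer");
  if (flag)
  {
    snprintf(b.b, sizeof(b.b), "%s", text);
    f.sub = &b;   /* still valid when sub_length() runs below */
  }
  else
    f.sub = NULL;

  return sub_length(&f);
}
```

Because b is no longer scoped to the if block, every one of the three stack layouts described above gives the same, defined result.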
Tuesday, January 11, 2022
Let's look at some bots that aren't the MJ12Bot
I think it's time I stop blogging about work after my previous post. Work is getting a tad too depressing to think about and my cynical side is saying that it won't matter where I go, it'd be more or less the same with a higher probability of forced Microsoft Windows use. So instead of that depressing topic, let's take a look at something much lighter and less depressing—the current state of Internet robots crawling my various sites!
Two weeks later and there are still bots attempting to follow endless redirections. I thought maybe I could attempt to figure out a contact, but alas, they're coming from all over the place (and yes, I'm finally naming IP addresses):
IP address | # requests |
---|---|
18.134.208.136 | 933 |
18.132.248.127 | 850 |
3.8.92.131 | 817 |
18.169.194.52 | 745 |
3.8.210.87 | 728 |
13.40.97.54 | 715 |
18.170.56.106 | 713 |
3.8.134.65 | 682 |
35.176.22.93 | 681 |
18.130.231.183 | 681 |
13.40.67.85 | 667 |
13.40.137.233 | 666 |
18.132.46.166 | 659 |
3.8.24.209 | 641 |
18.170.107.207 | 637 |
13.40.155.157 | 634 |
35.178.170.215 | 577 |
18.130.216.34 | 573 |
13.40.145.207 | 572 |
35.179.76.79 | 564 |
They're all pretty much from Amazon Web Services so who knows who is running these bots. Just blocking them is too easy a solution—at this point, I'd like to do something to get their attention (as if thousands of links they are crawling are suddenly listed as “gone” isn't enough of a clue). I don't necessarily mind bots crawling my sites, unless they're doing stupid things. I shall have to think on this a bit more.
I also had high hopes that I could stop empty requests to my Gemini server (which aren't allowed at all by the specification) by returning a non-standard response code with the text “Not a gopher server” but alas, that is still happening. Does nobody bother checking results of their bots running? I guess not.
And speaking of gopher, it's better there than Gemini. Yes, there are a few agents that are attempting to use TLS, but fortunately, they cache previous failures so it's not every request. There are a few bots out there trying to exploit RDP (not much I can do about those) and a few that are confused into thinking my gopher site is actually my Gemini site sans TLS (What?). But I can live with 155 failed gopher requests out of 10,423 over the past month.
And while I'm checking bots, I can't forget the web crawlers. And not much has changed on that front since July 2019 except that MJ12Bot has kept their promise never to crawl my site again. The Knowledge AI (which I cannot find any information on) is still the number one agent, with 68,000 requests in Debtember 2021, followed by 21,000 requests from Amazonbot. And it seems that the bots in general are making fewer requests to nonexistent pages (I mean, back in June 2019, The Knowledge AI made 170 bad requests; last month, 1).
So, with the exception of bots stuck in redirection Hell in Gemini, things on the crawler front are looking pretty good.
Sunday, January 16, 2022
A most persistent spam, part VI
It seems that “Aleksandr” may have changed his name to “Mayboroda,” but it looks like it's the same type of weird spam I've since blocked successfully. Only here, reader Roberto found a way to block the spam for users of Postfix (and I did get Roberto's permission to post this email):
From: Robysampler <XXXXXXXXXXXXXXXXXXXXX>
To: sean@conman.org
Subject: About "Mayboroda_aleks" on your personal blog
Date: Sun, 16 Jan 2022 23:04:07 +0100
Dear Mr. Sean
My name is Roberto from Italy.
i've read your personal blog about the mayboroda aleks spammer, who's bothering me, filling my own company email since one and half years, at least.
as you figured out "Mayboroda", keeps changing IPs and domain/subdomains to evade every try to block him.
luckly, my company mail is served by a linux machine i own, so i have direct access to it, and as final solution i've choose to do some fine tuning in postfix config.
i've add inside postfix "main.cf" file:
smtpd_recipient_restrictions = check_sender_access regexp:/etc/postfix/rejected.senders
then i've add in "rejected.senders":
/s[0-9]{1,2}.[a-z]*.ru/ REJECT
/info@.[a-z]*.ru/ REJECT
in this case you'll provide to your postfix daemon, some rejecting rules based on regular expressions.
based on hundreds of mails "Mayboroda" has sent me, i figured out the main pattern for his emails usually are
info@randomdomain.ru
or
something@s(1 or 2 numbers).randomdomain.ru
after setting up your postfix you can check out the result using the command
postmap -q "your test email here" regexp:/etc/postfix/rejected.senders
for example
postmap -q "info@s4.mayboroda.ru" regexp:/etc/postfix/rejected.senders
the shell returns
REJECT
this will works until "Mayboroda" will continue to use the same pattern in the mail sender
I hope you'll appreciate my advices.
have a nice day and happy new year
Roberto
Best Regards
I do appreciate your advice, Roberto. Thank you. I'm sure other people will find this useful as well.
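For anyone who wants to poke at those two patterns outside of Postfix, here is a minimal sketch using POSIX regcomp() and regexec(). The helper name sender_rejected() is made up, and REG_ICASE is an assumption meant to mirror what I understand to be the case-insensitive default of Postfix regexp tables:

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* The two patterns from the email, without the surrounding slashes. */
static const char *const reject_patterns[] =
{
  "s[0-9]{1,2}.[a-z]*.ru",
  "info@.[a-z]*.ru",
};

/* Hypothetical helper: true if the sender matches any pattern. */
static bool sender_rejected(const char *sender)
{
  size_t count = sizeof(reject_patterns) / sizeof(reject_patterns[0]);
  bool   hit   = false;

  assert(sender != NULL);
  for (size_t i = 0; i < count && !hit; i++)
  {
    regex_t re;

    /* REG_ICASE is an assumption about case-insensitive matching. */
    if (regcomp(&re, reject_patterns[i], REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
      continue;
    hit = regexec(&re, sender, 0, NULL, 0) == 0;
    regfree(&re);
  }
  return hit;
}
```

One thing the sketch makes visible: the dots in the patterns are unescaped, so they match any character. A sender like info@guru would also be rejected, and writing \. instead would tighten the match to literal dots.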
Monday, January 17, 2022
A most persistent spam, part VII
I received a follow-up message from Roberto about the “Aleksandr” Russian spam emails:
From: Robysampler <XXXXXXXXXXXXXXXXXXXXX>
To: Sean Conner <sean@conman.org>
Subject: Re: About "Mayboroda_aleks" on your personal blog
Date: Mon, 17 Jan 2022 17:33:35 +0100
Hi Sean.
Thanks very much for your fast reply.
i have some good news about "Mayboroda"
here some lines of my postfix log showing "Mayboroda" has tryed again, sending me some spam today:
Jan 17 11:48:47 mydomain postfix/smtpd[23894]: warning: hostname tefalongo.ru does not resolve to address 185.186.3.10
Jan 17 11:48:47 mydomain postfix/smtpd[23894]: NOQUEUE: reject: RCPT from unknown[185.186.3.10]: 450 4.7.25 Client host rejected: cannot find your hostname, [185.186.3.10]; from=<info@s7.kroshem.ru> to=<booking@mydomain.net> proto=ESMTP helo=<s7.kroshem.ru>
Jan 17 12:18:49 mydomain postfix/smtpd[24258]: warning: hostname tefalongo.ru does not resolve to address 185.186.3.10
Jan 17 12:18:49 mydomain postfix/smtpd[24258]: NOQUEUE: reject: RCPT from unknown[185.186.3.10]: 450 4.7.25 Client host rejected: cannot find your hostname, [185.186.3.10]; from=<info@s7.kroshem.ru> to=<info@mydomain.net> proto=ESMTP helo=<s7.kroshem.ru>
Jan 17 12:18:49 mydomain postfix/smtpd[24258]: NOQUEUE: reject: RCPT from unknown[185.186.3.10]: 450 4.7.25 Client host rejected: cannot find your hostname, [185.186.3.10]; from=<info@s7.kroshem.ru> to=<booking@mydomain.net> proto=ESMTP helo=<s7.kroshem.ru>
Jan 17 12:48:49 mydomain postfix/smtpd[24629]: connect from s7.kroshem.ru[185.186.3.10]
Jan 17 12:48:49 mydomain postfix/smtpd[24629]: NOQUEUE: reject: RCPT from s7.kroshem.ru[185.186.3.10]: 554 5.7.1 <info@s7.kroshem.ru>: Sender address rejected: Access denied; from=<info@s7.kroshem.ru> to=<info@mydomain.net> proto=ESMTP helo=<s7.kroshem.ru>
in particular the last line shows that the regular expression has found a match on "info@s7.kroshem.ru" and replyed "Sender address rejected: Access denied" and REJECTED the incoming Email.
there are some other tweaks you can implement into your "main.cf" postfix configuration file that will help you to avoid junk emails
the following is a partial extract from my postfix "main.cf" configuration:
smtpd_recipient_restrictions =
    permit_mynetworks,
    permit_sasl_authenticated,
    check_sender_access regexp:/etc/postfix/rejected.senders, #check recipients by regular expression
    check_policy_service unix:private/policyd-spf,
    reject_rhsbl_helo dbl.spamhaus.org, #check if domain or ip is flagged as spam in spamhouse database
    reject_rhsbl_reverse_client dbl.spamhaus.org, #check if domain or ip is flagged as spam in spamhouse database
    reject_rhsbl_sender dbl.spamhaus.org, #check if domain or ip is flagged as spam in spamhouse database
    reject_rbl_client zen.spamhaus.org #check if domain or ip is flagged as spam in spamhouse database
smtpd_sender_restrictions =
    permit_mynetworks,
    permit_sasl_authenticated,
    reject_unknown_reverse_client_hostname, #Reject the request when the client IP address has no address->name mapping.
    reject_unknown_client_hostname, #Reject the request when 1) the client IP address->name mapping fails, or
        #2) the name->address mapping fails, or
        #3) the name->address mapping does not match the client IP address.
    reject_unknown_sender_domain #Reject the request when Postfix is not the final destination for the sender address
Many of these tweaks i've implemented were taken from the document at the following webpage:
http://www.armellin.com/friends/postfix/postconf.5.html
Feel free to publish our conversation in your blog as you wish.
It's nice to help other people to get rid of the plague of "Mayboroda" :D
Thanks Sean
Best Regards
Roberto
Thank you again, Roberto.
Tuesday, January 18, 2022
Extreme tourism, Cashiers, North Carolina edition
The Marginalia Search Engine is such a cool search engine. It's like visiting the web in the 90s with the quirky personal sites with ugly designs and no advertising. Try out some random sites.
Very cool.
While there, I decided to see what results it had for Brevard, NC and boy, some of the results are incredible. I did not know that Cashiers, North Carolina is a hotbed of UFO activity. Had I known, I might have tried looking for an alien license plate at (what is now) the Cashiers Valley Smokehouse. Although, had we delayed our 2015 visit to Brevard to mid-November, we might have seen an actual UFO!
Darn our luck!
I'm not saying it's aliens, but … it's aliens
Right after finding out about Cashiers being a hotbed of UFO activity, my keyboard flaked out on me. The “o,” and “r” keys would register sporadically, which is odd given that I only use IBM Model M keyboards. And why then? Is it something they don't want me to know? Or maybe … aliens? Very odd.
I've probably been using that keyboard for nearly twenty years (it probably just needs a good cleaning, but it requires a special screwdriver to open), but fortunately, I have an entire stash of IBM Model M keyboards at hand to swap out, so I think I'm set for the rest of my life. Because have you seen the prices for IBM Model M keyboards? Averaging around $150, with spikes up to $650! Insane! I never paid more than $10 for any of mine.
I pulled another keyboard from my stash, gave it a light dusting, and I'm good to go.
I'm still amazed this isn't Stevie Wonder
Today I learned that the song “Virtual Insanity” is not by Stevie Wonder, but Jamiroquai. The video is also quite insane, what with Jamiroquai sliding around the room, along with the furniture—trippy stuff, and something I had to find out how it was done. And like all magic tricks, it's both insanely simple and yet, amazing how well it's pulled off.
Wednesday, January 19, 2022
“The master then called out to three senior monks, to attend the example of the bridge builder, and to hear of the discipline of a true engineer.”
Surfing the web, I came across “The Codeless Code: Case 154 A Bridge To Nowhere”. It hit pretty hard, especially given the situation at The Corporation: despite repeated exhortations about how well things worked in the past and how not so well things are working now, they aren't going to change course—the process of the process is the process and all that.
It was the final paragraph of “A Bridge to Nowhere” where I was enlightened, and laughed out loud. We'll see how long that lasts.
Thursday, January 20, 2022
Notes on an overheard conversation at a medical lab
“Which arm?”
“It's a pain either way.”
“Okay, we'll do this one.”
“Are you attempting to amputate my arm?”
“Sigh. No. Please make a fist.”
“Aaaaaaaaaaah!”
“That was just the alcohol wipe, sir.”
“Are you sure?”
“This is the needle.”
“Gaaaahhhhhhhhhhhhhhhh!”
“Hmm … ”
“Aaaaaaaaarrrrrrrrraaaaahhhhhhhhhh!”
“I think I'll have to try the other arm.”
“I'm dying here!”
“What did I do to deserve this?”
“Aaaaaaaaaaaaaaaaaaaaaah!”
Wednesday, February 09, 2022
Chicken taco masala
Bunny and I found ourselves out and about and looking for dinner. Our first choice for dinner was “permanently closed” (pity, we liked that place). Our second choice was in the process of closing (for the night—not permanently). Frantic, I did a search for restaurants in our immediate area and that's how we found ourselves at Taco Masala.
Indian-Mexican fusion cuisine?
Sure, why not?
Okay, it's more Indian than Mexican, but still, it's not every day that I have Chicken Taco Masala (it's not called that on the menu, but seriously—it should be). And as far as Chicken Tikka Masalas go, it's good. Very good.
But a warning—their “medium” tends towards the hotter end of “medium,” and that's really the only complaint we had about the place.
Friday, February 11, 2022
Stockholm Agile
I spent the month of January yelling at anybody who would listen that The Process™ is not working (and that list includes my new manager and a Corporate Overlord Vice President, who is now listening in on our daily scrum meetings due to some disastrous deployments last year when the Corporate Overlords pretty much took over our department when our previous previous manager retired). It's been made clear (not in a bad way—I wasn't disciplined nor had to talk to HR about my concerns) that The Process is The Process and upper management is perfectly fine with our current course. And I'm finally coming to accept the things I cannot change, that I told upper management of my concerns (several times) so I did what I could. I think I'm finally accepting the fact that I am now doing “enterprise development,” which includes “Agile development” and seven weekly departmental meetings (down from eleven—seriously!).
If only teaching paid more than it does …
I was in a nearly three hour meeting today (second of three), doing what is called a “transfer of knowledge.” I'm the only developer left on my team who actually knows how “Project: Wolowizard,” “Project: Sippy-Cup,” “Project: Lumbergh,” “Project: Cleese” and “Project: Seymore” all fit together in production (even if I don't fully understand all the business logic implemented by “Project: Lumbergh”). So I spent my time mostly talking and answering questions from the other developers, including the team leader. Bunny was concerned that it might lead to me being let go, but just for my own sanity (and because plenty of other people at The Corporation have told our Corporate Overlords that under no condition should I be let go) I decided not to sink into cynical despair and treat it for what it is—getting some other developers up to speed on the various components.
At the end of the meeting, one of the new developers asked if I used to teach, or if I teach on the side, since my explanation of how everything fit together was coherent and understandable, unlike instructors at his school. I told him that no, I've never been an instructor, nor do I teach on the side. He went on to say I have a gift in explaining things.
I'll take that as a win.
Get thee behind me, Satan
In my third meeting of the day with my latest new manager, I got some incredibly good news—I might be able to return Satan, the useless Windows Laptop whose only purpose is to be continuously updated when Network Security sends me an email saying it's 20 minutes out of date and needs to be updated and get to it, chop chop! The Corporate Overlords are planning on sending a Mac laptop that can get on both the Corporate VPN and the Corporate Overlords' VPN (but not at the same time—that's not allowed) which obviates the need for Satan the Windows laptop. Woot!
If a bug is found and no one cares, is it still a bug?
I've been enjoying reading The Codeless Code over the past month. It's been helping me to accept the process, but I'm not entirely sure what to make of “The Codeless Code: Case 219 Nothing Really Matters,” especially in the light of a recent event at work dealing with undefined behavior. To place the story in context, I'm the senior monk Wangohan, my team leader is the junior monk, yet has the attitude of master Kaimu.
The story is to inform us that since the language and the tooling around the language didn't warn of the issues, then the junior monk was right in ignoring them, and Wangohan was wrong for chastising the junior monk. But in my case, it was undefined behavior—tools might not catch it! And I'm already wary of my team leader's dependence upon testing, using it (in my opinion) as a crutch.
I have to meditate on this.
Wednesday, February 16, 2022
A call for Gemini without TLS
There's currently quite a bit of talk in the Gemini community about dropping TLS support, or at least making a non-TLS version of Gemini available. I find this amusing since the entire reason for TLS in Gemini in the first place is that the creator of Gemini, solderpunk, wanted to add TLS to gopher. So when he designed Gemini, it started with TLS at the base. But over the years, the collection of people who want to remove TLS from Gemini come in two groups. The first group are the ones that wish to replace TLS with some other encryption scheme, because TLS sucks or is too complicated or subject to insecurities with the certificate authorities. As I stated a few weeks ago on Hacker News:
I think it even applies to “never implement crypto on your own”—are you sure you've taken into account side-channel attacks? Timing attacks? Random number generation (if it's required)? Cleaning memory after use? That memset() isn't optimized out? There's a lot to get right …
https://news.ycombinator.com/item?id=30092091
(The whole thread is interesting to read)
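To make the memset() point concrete: a compiler is allowed to drop a memset() on a buffer that is never read again, which is exactly the situation when scrubbing a key before returning. A minimal sketch of one common workaround, assuming a made-up helper name (wipe() is purely illustrative), is to write through a volatile pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zero a buffer through a volatile pointer so the
 * stores are not removed as dead writes. */
static void wipe(void *ptr, size_t len)
{
  volatile unsigned char *p = ptr;

  assert(ptr != NULL || len == 0);
  while (len-- > 0)
    *p++ = 0;
}
```

Where the platform offers one, a purpose-built function such as explicit_bzero() is probably the better choice; the loop above is only meant to show the kind of detail that is easy to get wrong.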
I do recall on the Gemini mailing list (when it was available) that one person said TLS should be replaced, did an actual implementation of an alternative encryption scheme and decided that wasn't such a good idea after all. The conversation pretty much died after that (imagine that!).
The second group of anti-TLS people also argue that TLS sucks or is too complicated or subject to insecurities with the certificate authorities, and just want TLS removed entirely—go plain text. Well, that currently exists—gopher. It's even easier than Gemini sans TLS—there are no URLs to parse or relative links to resolve. Also, “TLS is a third-party library” isn't an argument I would make, because while TCP comes with operating systems today, that wasn't always the case. Back in the 90s, the entire TCP/IP stack was at one point a third-party library for the most popular operating systems that weren't derived from Unix. And today it's the case that the new Google hotness, QUIC, is a protocol only available as a third-party library. No, a better argument is that current TLS libraries suck to use, and it's hard to know which ones to use. That argument, I can sympathize with.
Furthermore, let's say Gemini never specified TLS to begin with. I guarantee you that someone shortly after it appeared would be screaming for TLS to be added, because “encrypt all the things! Why didn't you bake in TLS from the start? Why do you hate us?”
You just can't win here.
Thursday, February 17, 2022
Congratulations Apple! You made a laptop as annoying to use as Microsoft Windows
The new Mac laptop arrived today. I will say that Apple packaging is top notch and beautiful in a Zen-simplistic way. Just beautiful. The laptop itself? Eh … it's physically smaller than my previous laptop, and it has that insipid touch bar across the top of the keyboard. And the less said about that keyboard, the better. Upon turning the laptop on, I was beset with pleading messages to update software from Apple, the Apple Store and from … Microsoft?
Sigh.
Thing is only thirty minutes old, so of course it's ancient and not worth the brushed aluminium it's made from. Burn it to the ground, salt the earth, and start over again.
Damn this upgrade treadmill.
If that wasn't bad enough,
Apple is now copying Microsoft's playbook and made a ton of gratuitous changes to the various system configuaration windows,
making it all the more difficult to adapt the settings to how I like them.
For a period of time,
it looked as if my entire work flow just wouldn't be possible
(mainly, I log into my work Mac laptop via ssh
and do all my development via the command line with a Real Keyboard™
and not the utter XXXX that is passed off as keyboards on laptops of any make these days),
but I did find the tweakable option to enable ssh
access.
Of course,
twenty seconds of not using the physical laptop will cause it to sleep,
thus making it unresponsive via ssh
and none of the settings I've tweaked has stopped it from doing that.
XXXX!
So while I have a ton of corporate crap like Microsoft Word, Excel, PowerPoint and big brother-esque spyware to prevent me from “infecting” the computer, it did not come with any development software on it. Of course it didn't! Who uses computers for development these days? Granted, if I purchased said laptop, I would expect to have to install the development software, but for a corporate pre-installed laptop for my job? That involves writing software? I have no words.
So yeah, I no longer have to use Satan, the useless Windows Laptop. Now I have to use Belial, the annoying Mac Laptop. I suppose I could try asking for a Linux laptop … yeah, who am I kidding? That will never happen.
Monday, February 21, 2022
This will take some time
Belial, the annoying Mac Laptop proved to be too annoying to use last week, so I left it off to do some actual work on Friday. I set aside today just to do updates, and the first thing was to install the developer tools, per the popup message when I tried running the C compiler. The initial time estimate to download and install the developer tools? 104 hours—over four days. In reality, the estimate kept bouncing up and down quite a bit—Apple certainly took a lot of notes from Microsoft—and it ended up taking a few hours. But hey, the process is the process, and we must process the process to ensure the process has been processed, right? Right.
Next up was Microsoft Teams. We use it for meetings and it refused to run on the new Mac laptop until it's been updated. Of course. So let that run its course over a few more hours and hurrah! It runs! So now I can attend meetings on Belial, the annoying Mac Laptop. Hurrah?
I also had to update some security software running on the box. I swear, it seems like my job is to just update the corporate laptops.
Tuesday, February 22, 2022
Because each update takes longer than 20 minutes, it's obsolete upon finishing, so more updates—welcome to the update treadmill
Day two of updates on a day filled with twos.
No, seriously. I'm still updating software on Belial, the annoying Mac Laptop. This time the rest of the Microsoft suite of programs. That took an additional half day to finish (partly because there was a lot to download, and partly because the Microsoft update program would spazz out and stop downloading at random), but in the meantime, I can use the laptop to attend meetings (sigh). On the plus side, one day closer to Satan, the useless Windows Laptop being sent back from whence it came.
I also got the laptop on both the Corporate Overlords' VPN (the entire reason for the useless Windows laptop) and the Corporate VPN. Of course, you can't be on both at the same time. And of course, the Corporate VPN isn't in the list of default VPNs on the software being used, so I have to type in the server every time I want to use the Corporate VPN, which is daily. It's not like The Corporation is a second class citizen in our Corporate Overlords' eyes or anything. No, not at all.
Sigh.
Wednesday, February 23, 2022
How I spent my day updating yak shaving
Updates.
That's all I've been doing this week on Belial, the annoying Mac Laptop.
Updates.
So I'm all ready to checkout our source code repositories:
[sconner]belial:~/repo>svn checkout https://www.example.com/path/to/repo
-bash: svn: command not found
[sconner]belial:~/repo>
Seriously?
I install the developer tools, and Subversion is not installed?
[sconner]belial:~/repo>git
usage: git [--version] [--help] [-C <path>] [-c <name>=<value>]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]

These are common Git commands used in various situations:

start a working area (see also: git help tutorial)
   clone             Clone a repository into a new directory
   init              Create an empty Git repository or reinitialize an existing one

work on the current change (see also: git help everyday)
   add               Add file contents to the index
   mv                Move or rename a file, a directory, or a symlink
   restore           Restore working tree files
   rm                Remove files from the working tree and from the index
   sparse-checkout   Initialize and modify the sparse-checkout

examine the history and state (see also: git help revisions)
   bisect            Use binary search to find the commit that introduced a bug
   diff              Show changes between commits, commit and working tree, etc
   grep              Print lines matching a pattern
   log               Show commit logs
   show              Show various types of objects
   status            Show the working tree status

grow, mark and tweak your common history
   branch            List, create, or delete branches
   commit            Record changes to the repository
   merge             Join two or more development histories together
   rebase            Reapply commits on top of another base tip
   reset             Reset current HEAD to the specified state
   switch            Switch branches
   tag               Create, list, delete or verify a tag object signed with GPG

collaborate (see also: git help workflows)
   fetch             Download objects and refs from another repository
   pull              Fetch from and integrate with another repository or a local branch
   push              Update remote refs along with associated objects

'git help -a' and 'git help -g' list available subcommands and some
concept guides. See 'git help <command>' or 'git help <concept>'
to read about a specific subcommand or concept.
See 'git help git' for an overview of the system.
[sconner]belial:~/repo>
So you have git
but not Subversion.
Oh! You removed Subversion from the developer tools!
Lovely.
Oh, I can install it with MacPorts? Cool.
belial:~ root# port install subversion
-sh: port: command not found
belial:~ root#
Oh. Okay. I see. How do I get it installed? Oh, I need to install Xcode and the Xcode command line tools. I just have the command line tools.
Oh, Xcode is only available via the Apple Store. I don't have an account to use the Apple Store. I mean, I do, but there's no way in XXXX I'm going to use my private account for work. Let me see if I can compile Subversion from source.
I'll spare you the details—I can't. Subversion requires The Apache Portable Runtime Project and that project can't quite figure out the system and I'm not versed enough (nor paid enough) to debug autoconf tool issues.
When I ask the Corporate Overlords about updates via the Apple Store, I'm told that that particular issue hasn't actually been hashed out yet. Nice to know that Mac users aren't second class citizens in our Corporate Overlords' eyes. I end up creating a new account for use with the annoying Mac laptop and hope any changes to my credit card can get expensed.
Oh, Xcode is over 12G in size?
Sigh.
The initial estimated 104 hours to update is actually turning out to be rather accurate.
Of course I'm opinionated—if I wasn't, I would be in a cult
One of the other developers on my team ran part of our code through SonarQube and well … I have issues with its issues with our C code.
Out of the 600+ issues it flagged,
about 500 or so seem to be related to the use of restrict
in the code.
For example (using my own code):
int btm_cmp(
        struct btm const *restrict d1, /* it doesn't like this */
        struct btm const *restrict d2  /* or this */
)
{
  int rc;

  assert(d1 != NULL);
  assert(d2 != NULL);

  if ((rc = d1->year  - d2->year))  return rc;
  if ((rc = d1->month - d2->month)) return rc;
  if ((rc = d1->day   - d2->day))   return rc;
  if ((rc = d1->part  - d2->part))  return rc;

  return 0;
}
I don't care what MISRA says about it—it signals intent.
That these two pointer parameters to the same type are distinct objects and should not be the same object!
All you have to see is the function prototypes for the standard functions memcpy()
and memmove()
to see this.
memcpy()
is:
extern void *memcpy(void *restrict s1,void const *restrict s2,size_t n);
and thus,
the two memory regions aren't supposed to overlap; for memmove()
:
extern void *memmove(void *s1, void const *s2,size_t n);
the function states the memory regions can overlap. Intent. Geeze.
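To make the memcpy()/memmove() distinction concrete, here is a minimal sketch. The helper names copy_distinct() and drop_prefix() are made up for illustration and come from no particular code base; the point is only that the prototype tells the caller whether the two regions may overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the two buffers are promised to be distinct
   objects, so restrict documents that and memcpy() is safe. */
void copy_distinct(char *restrict dst, char const *restrict src, size_t n)
{
  assert(dst != NULL);
  assert(src != NULL);
  memcpy(dst, src, n);
}

/* Hypothetical helper: drop the first n characters of a string in place.
   Source and destination are the same buffer, so the regions overlap
   and only memmove() is safe here (hence no restrict). */
void drop_prefix(char *buf, size_t len, size_t n)
{
  assert(buf != NULL);
  assert(n <= len);
  memmove(buf, buf + n, len - n + 1); /* + 1 carries the NUL along */
}
```

Calling drop_prefix() on a buffer holding "abcdef" with n set to 2 leaves "cdef" in the buffer; doing the same with memcpy() would be undefined behavior because the regions overlap.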
Of the rest,
I don't agree with its arbitrary limit of 20 items in a union.
The union in question describes packet types of a custom protocol,
and I will not split it up just to satisfy some arbitrary limit in the scanning software.
Who came up with goto
labels being all upper case?
No, I don't agree with that as it clashes with over forty years of C convention where purely uppercase is reserved for constants and macros.
I don't agree with the excessive casts it suggests,
and I don't agree with removing the one cast I do have.
I disagree about the “useless” parentheses around isdigit
because I'm signalling my intent to take the address of the function,
not the macro.
The C99 standard says this (from section 7.1.4):
Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, because the name is then not followed by the left parenthesis that indicates expansion of a macro function name. For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro.
(emphasis added). The code it's complaining about is:
if (!extract_token(tmp,sizeof(tmp),&p,(isdigit))) { /* ... */ }
The various is*()
functions are often defined as macros
(they are on the compilers we use).
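A small sketch of the two cases the standard describes, assuming nothing beyond <ctype.h>. The names count_matching(), count_digits() and is_digit_function() are hypothetical and are not the extract_token() code above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the characters in s accepted by pred. */
size_t count_matching(char const *s, int (*pred)(int))
{
  size_t count = 0;

  assert(s != NULL);
  assert(pred != NULL);

  for ( ; *s != '\0' ; s++)
    if (pred((unsigned char)*s))
      count++;
  return count;
}

/* Passing the address of the library function; the parentheses spell
   out that the function, not a macro, is what is meant. */
size_t count_digits(char const *s)
{
  return count_matching(s, (isdigit));
}

/* (isdigit)(c) calls the real function even when <ctype.h> also
   provides a function-like macro of the same name.  c must be
   representable as an unsigned char, or be EOF. */
int is_digit_function(int c)
{
  return (isdigit)(c) != 0;
}
```

Here count_digits("a1b22") counts the three digit characters, and the same count_matching() works with (isspace), (isalpha) and friends.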
I also dislike the “remove useless parentheses” crowd,
if only because the C precedence table is screwed up compared to most other languages.
I'm not going to refactor the code just because some scanner thinks the function is doing too much—it's not. Yes, the function itself might be long, but it's converting a rather complex structure from C to Lua (or Lua back to C). I'm not going to break it up just because. Besides, naming things is one of the two hard problems in Computer Science (the others being cache invalidation and off-by-one errors) and I would have to come up with some name for all the new one-use only functions.
But I'm not going to reject everything it said.
The suggestions such as reducing scope of variables or making some const
are fine.
And there were two bugs found,
but overall,
there was quite a bit of noise to go through.
Hopefully,
I can argue my case for the ones I disagree with.
We shall see.
Get thee behind me, Satan, part II
Woo hoo! I just received the shipping label to return Satan, the useless Windows Laptop!
Friday, February 25, 2022
An update on all the updates
I finally got Subversion installed on Belial, the annoying Mac Laptop, not via MacPorts but with Homebrew (suggested by several people, both at work and elsewhere). I did end up installing OpenSSL from source when I couldn't seem to install the headers for development, but then, I'm an old grey beard so I'm used to installing packages from source (I have both OpenSSL and LibreSSL installed on my home development system for example, but the less said about that, the better). But I get to keep the old Mac laptop around, just in case.
The new Mac laptop just doesn't quite feel like a Mac. I'm not sure what it is, but it just feels a bit … off, as a Mac. I'm not sure if it's the corporate big brother software installed on it, the gratuitous changes Apple has done to the UI, or the other subtle changes that have been done over the years, but it just doesn't have that … fun feeling it did back in 2005 when I started using Macs. I'm finding it hard to articulate why I feel this way. At least the updates don't require a reboot.
Satan, the useless Windows Laptop has been packaged up, the mailing label slapped on the box, and tomorrow it will be dropped off so that FedEx can send Satan back to whence it came.
Woot!
Monday, February 28, 2022
The process of starting at a company I've worked at for a decade
[I almost used the word “on-boarding” in the title, but then I realized that's marketspeak and I don't want to use marketspeak as I find the synergism inconducive to the synchronistic harmonizing of perpendicular incentives. Or something like that. I digress … ]
Now that the Corporate Overlords are in full Borg mode (“Resistance is futile. You will be assimilated.”) I just have to laugh at the process. Last Friday, I was sent instructions about signing up to the Corporate Overlords' Gitlab instance in email. The email consisted of “Here's the steps” followed by an image of the list of items to do, where the first item consisted of a few steps, the last of which was:
… and send an email to some-address-
followed by the text “thk.” So I wrote back saying the instructions appeared to be cut off partway through. Today I received a reply. The reply consisted of a Microsoft Word document, which included an image of the list of items to do, where the first item consisted of a few steps, the last of which was:
… and send an email to some-address-
followed by the text “thk.”
That is comedy gold right there.
Thursday, March 03, 2022
And in other news, old man shakes fist at clouds
Last week LibreSSL released a new version of libtls
.
I'm not using LibreSSL's version of libtls
but libretls,
a port of LibreSSL's libtls
to OpenSSL
(got that? Good).
So I pull down the code so I can add the new features to my Lua TLS module when I notice the TLS_API
version hasn't been updated.
Again!
I swear,
what is up with the LibreSSL guys and not updating the TLS_API
value?
If they aren't going to update the value,
why even have it in the first place?
Tuesday, March 08, 2022
Is there any functional difference between Apple and Microsoft anymore?
At the Corporation,
we use syslog()
to log stuff.
I know it's not popular among the hipster crowd,
but honestly,
I don't find the modern replacements to be all that great,
nor as ubiquitous,
as syslog()
.
So I'm trying to configure Belial, the annoying Mac Laptop,
to handle logging from the various components and nothing I do seems to make any difference.
I have configured the running syslogd
on Belial to match that of the older Mac laptop
(and my Mac mini)
but nope.
No output from any of the components.
That's not to say nothing is being logged,
but it's only the programs from Apple that have any logging.
I'm not entirely sure what is going on.
In other Belial-related news,
I ran Xcode while trying to troubleshoot the syslog()
issue.
It didn't help
(because Apple no longer allows root to change certain files without extreme measures,
but that's a rant for another time)
but rather more troubling,
Xcode refused to quit running!
There was no way to quit the application,
and I couldn't shutdown the laptop because Xcode was running.
I was only able to get it to stop running by forcibly removing power from the laptop.
Way to go, Apple! <slow clap>
Update later today …
I think I can answer that question: No difference at all. The hipsters have won and the old ways should be burned to the ground, and the ground salted.
Wednesday, March 09, 2022
Notes on syslog support from Mac OS 11.6.4
Yes,
I know,
Mac OS-X 11.6.4 is more than 20 minutes old.
Shut up!
Anyway,
a bit more about syslog()
support on Macs.
Our components
(like Project: Lumbergh)
compile on Macs.
We do initial testing on Macs.
Also,
our components use syslog()
.
And it's not like Mac OS-X has dropped syslog()
entirely—our code still compiles.
But syslog()
isn't quite working as I expect it to work.
When I run tests,
I monitor the logs in real time—I've configured both my Macs
(the older one, and Belial, the annoying Mac Laptop) to forward syslog messages to a central server,
which I can then monitor in real time using my syslogd
replacement
(I should probably go into detail about how that works,
but that's beyond the scope of this entry).
Yes, I am seeing messages show up:
Mar 09 18:27:20 user notice 192.168.1.105 com.apple.xpc.launchd entering bootstrap mode
Mar 09 18:27:20 user notice 192.168.1.105 com.apple.xpc.launchd exiting bootstrap mode
Mar 09 18:27:20 user warn 192.168.1.105 com.apple.xpc.launchd Service exited with abnormal code: 254
Mar 09 18:27:26 daemon notice 192.168.1.105 aciseagentd Function: loadXMLCfgFile Thread Id: 0x2E77D40 File: ConfigData.cpp Line: 46 Level: warn :: ISEPostureCFG.xml not found, using defaults
Mar 09 18:27:26 daemon notice 192.168.1.105 aciseagentd Function: GetConfigData Thread Id: 0x2E77D40 File: ConfigData.cpp Line: 220 Level: warn :: The cfg parameter for numeric value VlanDetectInterval was invalid. Using default. (XML was )
Mar 09 18:27:40 user notice 192.168.1.105 com.apple.xpc.launchd entering bootstrap mode
Mar 09 18:27:40 user notice 192.168.1.105 com.apple.xpc.launchd exiting bootstrap mode
Mar 09 18:27:40 user warn 192.168.1.105 com.apple.xpc.launchd Service exited with abnormal code: 254
Mar 09 18:27:46 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:27:50 user notice 192.168.1.105 com.apple.xpc.launchd Service exited due to SIGKILL | sent by mds[316]
Mar 09 18:28:00 user notice 192.168.1.105 com.apple.xpc.launchd entering bootstrap mode
Mar 09 18:28:00 user notice 192.168.1.105 com.apple.xpc.launchd exiting bootstrap mode
Mar 09 18:28:00 user warn 192.168.1.105 com.apple.xpc.launchd Service exited with abnormal code: 254
Mar 09 18:28:20 user notice 192.168.1.105 com.apple.xpc.launchd entering bootstr
So the syslogd
forwarding is working
(although I'm not sure which service exited due to SIGKILL
since that information isn't logged,
but whatever,
I'm getting logs forwarded by syslogd
on Belial).
But when I run our stuff?
Nothing comes through.
This code compiles and runs:
#include <stdio.h>
#include <syslog.h>

int main(void)
{
  for (int pri = 0 ; pri < 8 ; pri++)
    syslog(pri,"This is a test %d",pri);
  return 0;
}
But I'm not seeing the logs being forwarded.
And even when I edited /etc/syslog.conf
to read:
# Note that flat file logs are now configured in /etc/asl.conf
install.* @127.0.0.1:32376
*.* @192.168.1.10
*.* /tmp/log-all-the-things.txt
The /tmp/log-all-the-things.txt
file wasn't even created!
There are messages being forwarded to 192.168.1.10,
but aside from that,
it's as if everything else in this file is being ignored.
After some searching,
I did find out about the log
program.
I ran log stream --process syslogt
in one window,
then my test program syslogt
in another,
and behold:
[sconner]belial:~>log stream --process syslogt
Filtering the log data using "process BEGINSWITH[cd] "syslogt""
Timestamp                       Thread     Type        Activity             PID    TTL
2022-03-09 18:28:16.110052-0500 0x3513e    Default     0x0                  19313  0    syslogt: This is a test 0
2022-03-09 18:28:16.110914-0500 0x3513e    Default     0x0                  19313  0    syslogt: This is a test 1
2022-03-09 18:28:16.110943-0500 0x3513e    Default     0x0                  19313  0    syslogt: This is a test 2
2022-03-09 18:28:16.110965-0500 0x3513e    Default     0x0                  19313  0    syslogt: This is a test 3
2022-03-09 18:28:16.110986-0500 0x3513e    Default     0x0                  19313  0    syslogt: This is a test 4
2022-03-09 18:28:16.111005-0500 0x3513e    Default     0x0                  19313  0    syslogt: This is a test 5
Logs!
Only … not all of them.
syslog()
supports eight levels of logging,
yet this only shows six.
The final two,
levels LOG_INFO
and LOG_DEBUG
aren't logged!
Even editing the /etc/asl.conf
file to read:
# save everything from emergency to DEBUG
? [<= Level debug] store
Doesn't help.
Levels LOG_INFO
and LOG_DEBUG
are simply dropped.
And guess what level most of our logs are at?
XXXX you, Apple!
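For reference, a variation on the test program that tags each message with the name of its level, so it's obvious at the receiving end which ones went missing. This is only a sketch: priority_name() and log_every_level() are names invented here, and it uses nothing beyond the LOG_* constants from <syslog.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical helper: map a syslog() level to a printable name. */
char const *priority_name(int pri)
{
  switch (pri)
  {
    case LOG_EMERG:   return "emerg";
    case LOG_ALERT:   return "alert";
    case LOG_CRIT:    return "crit";
    case LOG_ERR:     return "err";
    case LOG_WARNING: return "warning";
    case LOG_NOTICE:  return "notice";
    case LOG_INFO:    return "info";
    case LOG_DEBUG:   return "debug";
    default:          return "unknown";
  }
}

/* Log one message per level, each carrying the name of its level. */
void log_every_level(void)
{
  static int const levels[] =
  {
    LOG_EMERG   , LOG_ALERT  , LOG_CRIT , LOG_ERR   ,
    LOG_WARNING , LOG_NOTICE , LOG_INFO , LOG_DEBUG
  };

  for (size_t i = 0 ; i < sizeof(levels) / sizeof(levels[0]) ; i++)
    syslog(levels[i],"This is a test at %s",priority_name(levels[i]));
}
```

With the usual numbering, the six messages that showed up in the log stream output are emerg through notice, and the two that never arrive are info and debug.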
Tuesday, April 12, 2022
Github shenanigans
It all started with a simple pull request to fix a bug. I have never attempted to just “merge” a pull request on Github before, but I figured, with such a simple change, why not try? Why not indeed.
Well, it broke my local repository. The commit message wasn't what I would have liked, and I felt a revision of the version number was required, which also involved updating the makefile and the Luarocks specification file. I made the mistake (I think—I don't know) of amending the merge message with a reformatted title and extra files and that was that. I was unable to push the changes back to Github.
I ended up having to reset both my local repositories and the Github repository.
Hard.
As in with the git reset --hard
nuclear option.
And hand added the patch into the code,
redid all the changes to the makefile and Luarocks specification file multiple times.
Ugly stuff.
But I got it as I like it.
And then I went to load the new version of the code into Luarocks and of course it failed.
Of course.
Github decided several months ago to deprecate support for git:
URLs and guess what I'm using?
Sigh.
It took longer than I liked to find out I need to switch to using git+https:
URLs,
and several version bumps of several of my Lua modules to get it all straightened out.
I just cannot update the Luarocks specification files properly.
It always takes way too many tries for me to get it right.
Aaaaaah!
I'm also unsure why the Github merge failed for me. Am I not using the “proper” work flow? Is it because Github considers itself the “primary repository” when in fact, for my stuff, it isn't? I don't know. Perhaps I'm slowly becoming computer illiterate.
Update on Wednesday, April 13th, 2022
It seems the latest version of Luarocks will auto-correct
git:
URLs.
[See what you get when you don't update every 20 minutes? —Editor] [Shut up, you! —Sean] I'm not sure what to think of this.
Wednesday, April 13, 2022
It's been long gone, like fifteen years long gone. Why are you still asking?
About a month ago, I was checking my webserver logs when I noticed multiple requests to pages that have long been marked as gone. The webserver has been returning HTTP status code 410 Gone, in some cases, for over fifteen years! At first I was annoyed—why are these webbots still requesting pages I've marked as gone? But then I started thinking about it—if I were writing a webbot to scan web pages, what would I do if I got a “gone” status? Well, I'd delete any references to said page, for sure. But then what if I came across the link on another page? I don't have the link (because I deleted it earlier) so let's add it to scan. Lather, rinse, repeat.
So there's a page or pages out there that are still linking to the pages that are long gone. And upon further investigation, I found the pages—my own site!
Sigh.
I've fixed some of the links—mostly the ones that have been causing some real issues with Gemini requests, but I still have scores of links to fix in the blog.
I also noticed a large number of permanent redirects, and again, the cause is pages on my own site linking to the non-canonical source. This isn't that much of an issue for HTTP (because the HTTP connection is still open for further requests) but it is one for Gemini (because each request is a separate connection to the server). I started fixing them, but when I did a full scan of the site (and it's mostly links on my blog) there are a significant number of links to fix—around 500 or so. And mostly in the first five years of entries.
Just an observation, nothing more
As I've been cleaning up blog entries (the first few years also have some pretty bad formatting issues) I've noticed something—I used to write a lot more back then. I think part of that is that the whole “blogging” thing was new, and after twenty-plus years, I've covered quite a bit of material. There have been multiple instances where I come across something, think I should blog about that, and when I check to see if I have indeed blogged about that, I already have. Also, it's a bit more of a pain these days as I manually add links to my blogs at MeLinkedInstaMyFaceInGramSpaceBookWe. This used to be automated in the past, but InstaMyFaceMeLinkedWeInGramSpaceBook doesn't play well with others and with constant API updates and walled garden policies, it's, sadly, easier to manually update links than it is to automate it (also this chart). I mean, I don't have to update links at MyFaceMeLinkedInstaInGramSpaceBookWe, but pretty much everybody just reads MeLinkedInstaMyFaceWeInGramSpaceBook (Web? What's that?) which is why I bother at all.
Can someone explain to me why this happens?
I don't understand.
It's not just the MJ12Bot that can't parse links. It seems many web robots can't parse links correctly. Last month there were requests like:
/%5C%22gemini://gemini.ctrl-c.club/~stack/gemlog/2022-02-16.tls.gmi%5C%22
/%5C%22/2022/02/23.1%5C%22
/%5C%22https://news.ycombinator.com/item?id=30091336%5C%22
/%5C%22http://thecodelesscode.com/case/219%5C%22
I mean,
the request /2022/02/23.1
does exist on my server,
but not (decoded) /\"/2022/02/23.1\"
.
What?
That's worse than what MJ12Bot was sending back in the day.
And it's not like it's a single web robot making these requests—no! It's three different web robots!
I just … what?
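For anyone wanting to check the decoding, here is a minimal percent-decoder sketch. percent_decode() and hex_value() are names made up for this example, not code from my server; malformed escapes are simply copied through, which is one assumption among several possible choices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: percent-decode src into dst, writing at most
   dstsize bytes including the NUL.  Malformed escapes are copied
   through unchanged.  Returns dst. */
char *percent_decode(char *dst, size_t dstsize, char const *src)
{
  size_t i = 0;

  assert(dst != NULL);
  assert(src != NULL);
  assert(dstsize > 0);

  while (*src != '\0' && i + 1 < dstsize)
  {
    /* only look at src[2] once src[1] is known to be a hex digit */
    int hi = (src[0] == '%') ? hex_value(src[1]) : -1;
    int lo = (hi >= 0)       ? hex_value(src[2]) : -1;

    if (lo >= 0)
    {
      dst[i++] = (char)(hi * 16 + lo);
      src += 3;
    }
    else
      dst[i++] = *src++;
  }

  dst[i] = '\0';
  return dst;
}
```

Feeding it /%5C%22/2022/02/23.1%5C%22 gives back the path with a literal backslash and double quote on each side, which is exactly the HTML attribute quoting leaking into the request.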
Friday, April 15, 2022
There are only two certainties in life; this post is about one of them, and it's not death
Last night I was busy with cleaning up my past blog entries when I happened to notice the date—April 15th. And then I suddenly remembered—I need to do my taxes! Well, as I found out later, I have until the 18th to file, but I always feel better if I'm able to get them filed by … well, technically, today.
Fortunately, my taxes aren't that hard and it only took about half an hour for me to fill out the 1040 form (I'm not a fan of electronic filing, but that's only because I know how the electronic sausage is made), take the form to the post office and have it hand cancelled by the post master (okay, that last bit was done by Bunny since she was out running errands at the time).
Saturday, April 16, 2022
My common Gemini crawler pitfalls
Just like the common web, crawlers on Gemini can run into similar pitfalls. Though the impact is much lower. The gemtext format is much smaller than HTML. And since Gemini does not support reusing the TCP connection. It takes much longer to mass-crawl a single capsule. Likely I can catch some issues when I see the crawler is still running late night. Anyways, this is a list of issues I have seen.
Common Gemini crawler pitfalls
Martin Chang has some views of crawlers from the crawler's perspective, but I still have some views of crawlers from the receiving end that Martin doesn't cover. I finally got so fed up with Gemini crawlers not bothering to limit their following of redirects that I removed not only that particular client test from my site, but the entire client test from my site. Martin does mention a “capsule linter” to check for “infinite extending links,” but that's not an issue a site author should fix just to appease the crawler authors! It's an actual thing that can happen on the Internet. A crawler must deal with such situations.
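Limiting redirects on the crawler side isn't hard. Here's a rough sketch of the idea, with every name in it (fetch_fn, MAX_REDIRECTS, the two stand-in servers) invented for illustration; the limit of 5 is an assumed value, and a real crawler would plug its own network code in behind the callback.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REDIRECTS 5 /* assumed limit; pick what suits the crawler */

/* Hypothetical fetch callback: requests url, returns the Gemini status
   code, and for a 3x status stores the redirect target in *next. */
typedef int (*fetch_fn)(char const *url, char const **next);

/* Follow redirects starting at url, giving up after MAX_REDIRECTS hops.
   Returns the final status, or -1 if the redirect limit was hit. */
int fetch_following_redirects(char const *url, fetch_fn fetch)
{
  assert(url != NULL);
  assert(fetch != NULL);

  for (int hops = 0 ; hops <= MAX_REDIRECTS ; hops++)
  {
    char const *next   = NULL;
    int         status = fetch(url, &next);

    if (status < 30 || status > 39 || next == NULL)
      return status;
    url = next;
  }
  return -1;
}

/* Stand-in server that redirects forever. */
int always_redirect(char const *url, char const **next)
{
  *next = url;
  return 31;
}

/* Stand-in server that answers with success right away. */
int always_success(char const *url, char const **next)
{
  (void)url;
  (void)next;
  return 20;
}
```

Pointed at always_redirect(), the function makes six requests and then gives up with -1 instead of looping until someone notices.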
Another issue I'm seeing with crawlers is an inability to deal with
relative links. I'm seeing requests like
gemini://gemini.conman.org/boston/2008/04/30/2008/04/30.1
or
gemini://gemini.conman.org//boston/2015/07/02.3
. The former I can't wrap my brain around how it got
that link (and every request comes from the same IP address—23.88.52.182), while the second one seems like a
simple bug to fix (generated by only three different clients—202.61.246.155,
116.202.128.144, 198.50.210.248).
The next issue is the ever-present empty requests. People, Gemini is not gopher—empty requests are not allowed. I'm even returning an invalid error status code for this, in the vain hope the people running the crawlers (or clients) would notice the invalid status code. I wonder what might happen if I return a gopher error in this case? I mean, I modified my gopher server to return an HTTP error response when it received an HTTP request, so I think I can justify returning a gopher error from my Gemini server when it receives a gopher request. These types of requests have come from nine different crawlers, and this list includes the ones with the parsing issues.
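Telling these cases apart on the server side can be sketched like so. This is not my server's actual code; classify_request() and the enum are hypothetical, and the checks are deliberately simplistic (a case-sensitive scheme match, and only GET is treated as looking like HTTP).

```c
#include <assert.h>
#include <string.h>

enum request_kind
{
  REQUEST_GEMINI, /* a gemini:// URL followed by CRLF        */
  REQUEST_EMPTY,  /* a bare CRLF: a gopher-style root request */
  REQUEST_HTTP,   /* looks like an HTTP request line          */
  REQUEST_OTHER   /* anything else                            */
};

/* Hypothetical helper: classify the first line received from a client. */
enum request_kind classify_request(char const *line)
{
  assert(line != NULL);

  if (strcmp(line, "\r\n") == 0)
    return REQUEST_EMPTY;
  if (strncmp(line, "gemini://", 9) == 0)
    return REQUEST_GEMINI;
  if (strncmp(line, "GET ", 4) == 0)
    return REQUEST_HTTP;
  return REQUEST_OTHER;
}
```

A server could then answer REQUEST_EMPTY with whatever error it sees fit, gopher-flavored or otherwise.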
Continuing on, there have been requests for domains I'm not running a
Gemini server on. I'm not running a Gemini server on conman.org
,
www.conman.org
, nor boston.conman.org
. The domain
is gemini.conman.org
.
Another inexplicable thing I'm seeing is a bunch of requests of the
form gemini://gemini.conman.org/bible/genesis.41:1-57
, which are
all coming from the same place—202.61.246.155 (this one seems to be a
particularly bad crawler). What's weird about it is that the request should
be gemini://gemini.conman.org/bible/Genesis.41:1-57
(note the
upper case “G” in “Genesis”). The links on the site are
properly cased, so this shouldn't be an issue—is the crawler attempting to
canonicalize links to lower case? That's not right. And by doing this, this
particular crawler is just generating spurious requests (the server will
redirect to the proper location).
So yes, those are my common Gemini crawler pitfalls.
Update on Friday, April 22nd, 2022
I have managed to wrap my brain around how it got that link.
Update on Sunday, May 1st, 2022
And yes, the “double slash” bug was a simple, but …
Wednesday, April 20, 2022
The day I met the creator of Garfield
When I saw the date today, I remembered that on this day back in 1981, I met Jim Davis, creator of Garfield. And of course I've already written about this. But thinking about it, I'm not sure what to make of this—that it's been 22 years since I last wrote about it, which is longer than the time between the actual event and writing about it the first time (19 years). It's also sobering to think it's been 41 years since I met him, and I still remember it like it happened yesterday.
Sigh.
Thursday, April 21, 2022
Notes on obtaining a process ID from Java on Mac OS Big Sur
There's a tool I use at work to manually test the code we work on. It's a graphical tool written in Java, and it allows us to configure the data for a test. It then runs another process that starts everything up, and then runs another tool to inject a SIP message into the code being tested. It then opens the output file from the SIP injection tool to display the results. This tool doesn't work quite right on Belial, the annoying Mac Laptop.
Problem one—the output file has, as part of its name, the process ID of the SIP injection tool. And it seems there is no easy way to get said process ID from within Java. The other issue, also related to process IDs, is it attempts to stop the process that starts everything up. That fails, because again, there is no easy way to get a process ID from within Java.
There is code that attempts to get a process ID:
process = pb.start();
if (process.getClass().getName().equals("java.lang.UNIXProcess"))
{
  try
  {
    Field f = process.getClass().getDeclaredField("pid");
    f.setAccessible(true);
    sippPid = f.getInt(process);
  }
  catch (Throwable e)
  {
  }
}
This horrible bit of code does work under Linux, and it works on older versions of Mac OS-X. But Belial is a more modern version of Mac OS-X, on the new Apple M1 architecture. Here, sippPid is always 0.
Sigh.
Notes on fixing a Java issue on Mac OS Big Sur
When last we met, I was left with a broken test tool on the newer Mac laptops. The issue at hand is that it's problematic to obtain process IDs in Java, which the testing tool needs for two things. The first is an output file. It turns out one can specify the output file the SIP injection tool generates instead of the default one which uses a process ID. This also makes it easier to check the output since you don't have to grovel through the directory for an ever-changing file name. That issue fixed.
The second one: how to stop the program that runs all the programs that are being tested. The code used the process ID to terminate that program by shelling out to run kill -SIGINT pid. It turns out the Java Process object does have a destroy() method (it sends a SIGTERM to a process, which is fine). It was just a simple matter to update the code to use the destroy() method to terminate the program rather than trying to obtain the process ID in a dodgy way. That issue fixed.
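The same idea, keeping the process handle and stopping the program through it instead of digging out a process ID, looks like this as a Python analogue (not the Java the tool is written in). It is only a sketch, assuming a POSIX system; the sleeping child is a stand-in for the real program, and start_and_stop is a made-up name.

```python
import signal
import subprocess
import sys


def start_and_stop(seconds=60):
    """Start a stand-in child process, then stop it through its handle."""
    # Hypothetical child: a Python one-liner that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )
    child.terminate()              # sends SIGTERM on POSIX; no PID lookup needed
    return child.wait(timeout=10)  # negative signal number if a signal ended it


if __name__ == "__main__":
    print(start_and_stop())
```

The handle already knows which process it refers to, so nothing has to be read out of private fields.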
Now all I have to do is spend a few weeks trying to get the code committed to the repository (yeah, I'm still trying to get used to the process; sigh).
Friday, April 22, 2022
Notes on some extreme lawn ornaments, Brevard edition
Eight years ago (wow! Has it been that long? [Yes. —Editor] [Who asked you? —Sean]) while in Brevard, I took a picture of some extreme lawn ornaments—life sized plastic cows. I wrote the “eat moar chikin” image caption (if you hold your mouse over the image, it should pop up) because the cows reminded me of the cows used by Chick-fil-a.
I'm reading the Transylvania Times when I come across the article “Transylvanian of the Week: John Taylor.” He owns O.P. Taylor's, a well known toy store in the area, and he's the one with the life sized plastic cows in his front yard. Not only that, but he purchased them from the person who made them for Chick-fil-a. Little did I know that my caption was more correct than I thought.
Play stupid games, win stupid prizes
It's not only Gemini bots having issues with redirects. I'm poking around the logs from my webserver, when I scan all of them to see the breakdown of response codes my server is sending (for this month). And well … it's rather surprising:
Status | Meaning | Count |
---|---|---|
302 | Found (moved temporarily) | 253773 |
200 | OK | 178414 |
304 | Not Modified | 25552 |
404 | Not Found | 8214 |
301 | Moved Permanently | 6358 |
405 | Method Not Allowed | 1453 |
410 | Gone | 685 |
400 | Bad Request | 255 |
206 | Partial Content | 151 |
401 | Unauthorized | 48 |
500 | Internal Server Error | 24 |
403 | Forbidden | 4 |
I was not expecting that many temporary redirects. Was it some massive issue across all the sites? Or just a few? Well, it turned out all of the temporary redirects were from one site: http://www.flummux.org/ (and no, I'm not linking to it, for a reason that will become clear). I registered the domain way back in 2000 just as a place to play around with web stuff or to temporarily make files available without cluttering up my main websites. The site isn't meant to be at all serious.
Scanning the log file manually, I was seeing endless log entries like:
XXXXXXXXXXXXXXX - - [10/Apr/2022:20:55:05 -0400] "GET / HTTP/1.0" 302 284 "http://flummux.org/" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; MRA 4.6 (build 01425); .NET CLR 1.0.3705; .NET CLR 2.0.50727)" -/- (-%)
That log entry indicates a “browser” from IP address XXXXXXXXXXXXXXX, identifying itself as “Mozilla (yada yada)” on the 10th of April, attempted to get the main page, as referred by http://flummux.org/. As for how many times this happened, broken down by browser:
Count | User agent |
---|---|
127100 | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; MRA 4.6 (build 01425); .NET CLR 1.0.3705; .NET CLR 2.0.50727) |
126495 | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E) |
42 | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36 |
36 | CATExplorador/1.0beta (sistemes at domini dot cat; https://domini.cat/catexplorador/) |
15 | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0 |
Ah, two “browsers” that don't limit the number of redirects they follow. And amusingly enough, both agents came from the same IP address. Or maybe it's the same agent, just lying about what it is. Who knows? Well, aside from the author(s) of said “browser.”
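For anyone wanting to produce similar tallies, here is a rough Python sketch (not the tooling actually used for the tables above). It assumes log lines shaped like the sample entry, with the request, status, size, referrer and user agent in that order; the names LOG_LINE and tally are made up for illustration.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user agent fields of a
# log line shaped like the sample entry above.
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally(lines):
    """Return (requests per status code, 302 responses per user agent)."""
    by_status = Counter()
    redirects_by_agent = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that don't look like log entries
        status = int(match.group("status"))
        by_status[status] += 1
        if status == 302:
            redirects_by_agent[match.group("agent")] += 1
    return by_status, redirects_by_agent
```

Calling by_status.most_common() on the first result gives the counts in descending order, like the status table.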
But what was horribly confusing to me was why the server was issuing a temporary redirect. Yes, if you try to go to http://flummux.org/ the server will respond with a permanent redirect (status 301) to http://www.flummux.org/ (the reason for that is to canonicalize the URLs and avoid the “duplicate content penalty” from Google; I set this all up years ago). But the site shouldn't redirect again. I can bring the site up in my browser without issue (which is a visual … pun? Commentary? Joke? on the line “The sky above the port was the color of television, tuned to a dead channel.”).
And then I remembered: back in 2016, I set things up such that if the browser sent in a referring link, the page would temporarily redirect back to the referring link (which is why I'm not linking to it; you would just be redirected right back to this page). I set that up on a lark for some reason that now escapes me. So the above “browsers” kept bouncing back and forth between flummux.org and www.flummux.org. For a quarter of a million requests.
Sigh.
In other news, bugs are nothing more than an inattention to detail.
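The bounce is easy to reproduce on paper. Below is a small Python simulation of the two redirect rules described above; it is a sketch, not the server's configuration. It assumes a client that sends the previous URL as the referrer on every hop, and unlike the two “browsers” it gives up after a fixed number of redirects. The function names are invented for the example.

```python
def respond(url, referrer):
    """Return (status, location) for the two hosts as described above."""
    if url == "http://flummux.org/":
        # Canonicalize the bare domain to the www host.
        return 301, "http://www.flummux.org/"
    if url == "http://www.flummux.org/" and referrer:
        # The 2016 lark: bounce the visitor back to the referring link.
        return 302, referrer
    return 200, None


def follow(url, max_redirects=5):
    """Follow redirects like a client that sends the previous URL as the
    referrer, giving up after max_redirects hops."""
    referrer = None
    hops = []
    for _ in range(max_redirects):
        status, location = respond(url, referrer)
        hops.append((status, url))
        if status == 200:
            return hops
        referrer, url = url, location
    return hops  # limit reached without ever seeing a 200
```

Starting from the bare domain, the statuses alternate 301, 302, 301, 302 until the limit is hit; starting from the www host with no referrer, the first response is a 200.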
I have now wrapped my brain around how it got that link
Martin Chang replied to my post about Gemini crawlers, saying that it was his crawler that had sent links like gemini://gemini.conman.org/boston/2008/04/30/2008/04/30.1 and that he had decided to look into the issue. Well, he did, and he found it wasn't his issue, but mine.
Oh my.
Okay, so how did I end up generating links like gemini://gemini.conman.org/boston/2008/04/30/2008/04/30.1?
This is, first and foremost, a blog on the web. Each entry is stored as HTML, and when a request is made via gopher or Gemini, the entries making up the request are retrieved and converted to the appropriate format. As part of that conversion, links to the blog itself have to be translated appropriately, and that's where the error happened.
So, for example, the links for the above entry are collected:
1. http://www.cisco.com/
2. http://it.slashdot.org/article.pl?sid=08/04/29/2254242
3. http://www.arin.net/
4. 2008/04/30.1#fn-2008-04-30-1-1
5. http://www.barracudanetworks.com/
6. http://answers.yahoo.com/question/index?qid=20080219010714AAnF91Q
Those links with a URL scheme are passed through as is, but #4 is special: not only is it a relative link to my blog, but it also contains a URL fragment, and that's where things went pear-shaped. The code to do the URL translations parsed each link as a URL, but for relative links, I used the string, not the parsed URL structure. As such, the code didn't work so well with URL fragments, and thus, I ended up with links like gemini://gemini.conman.org/boston/2008/04/30/2008/04/30.1 (for the record, the same bug was in the gopher translation code as well).
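To illustrate the difference between working from the raw string and working from the parsed structure, here is a minimal Python sketch (the blog's own code is not Python, and this is not that code). The base URL is taken from the links quoted above; dropping the fragment for the Gemini version is an assumption made for the example, as is the function name translate.

```python
from urllib.parse import urlsplit

# Assumed base for links back into the blog; taken from the URLs quoted above.
BASE = "gemini://gemini.conman.org/boston/"


def translate(link, base=BASE):
    """Translate one collected link for the Gemini version of an entry."""
    parts = urlsplit(link)
    if parts.scheme:
        return link  # links with a scheme are passed through as is
    # Relative link: use the parsed path, not the raw string, so the
    # fragment cannot leak into the result.
    return base + parts.path
```

Because the path and the fragment come out of the parser as separate fields, the "#fn-…" part never gets a chance to confuse the string handling.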
The fix, as for most bugs, was easy once the core issue was identified. The other issues I talked about are, as far as I can tell, not stuff I can fix.
Saturday, April 23, 2022
Does that mean I now have to unit test my text-only websites?
I fixed the infinite redirections from Hell bug. And again, like most bugs, it was an easy fix: just don't redirect if you come from http://flummux.org/. It feels weird to think of having to test a text-only website, but there is a form of programming involved, so it shouldn't be as much of a surprise as it is.
Sigh.
“We're a local newspaper run by a non-local company, we don't care about European readers”
I was reading Conman's latest article, and he linked to a page called «Transilvania Times». I wanted to see it, but for the first time since the vote of the GPDR my visit was denied because I'm European.
gemini://station.martinrue.com/adou/f3868913db6e409eae9fa67845f70324
The “GPDR” is a typo; the author actually meant the GDPR. And it pains me to see something like this happen. Here's someone from Europe who was interested in reading a story about a person in a small US town and yet, they couldn't because the owners of the news website (which isn't owned locally, but instead by a larger company in another state) probably don't care about European readers. The company does have a policy for California readers, so I don't see why it can't be extended for the GDPR. This is just so short-sighted.
Saturday, April 30, 2022
Musings on processing malformed Gemini (and web) requests
I'm still bothered with Gemini requests like gemini://gemini.conman.org//boston/2015/10/17.2. I thought it might be a simple bug but now I'm not so sure. There's a client out there that has made 1,070 such requests, and if that was all, or even most, of the requests, then yes, that's probably a simple bug. But it's not. It turns out only 4% of the requests from said client are malformed in that way. Which to me indicates that something out there might be generating such links (and for this case, I checked and I don't think I'm the cause this time).
I decided to see what happens on the web. I poked a few web sites with similar “double slash” requests and I got mixed results. Most of the sites just accepted them as is and served up a page. The only site that seemed to have issues with it was Hacker News, and I'm not sure what status it returned since it's difficult to obtain the status codes from browsers.
So, I have a few options.
- I can keep the current code and always reject such requests. In my mind, such requests have no meaning and are malformed, so why shouldn't I just reject them?
- I can send a permanent redirection to the “proper” location. This has the upside of maintaining a canonical link to each page, but with the downside of forcing clients through an additional request, and me having to live with the redundant requests in the log files. But it's obvious what resource is being requested, and sending a permanent redirect informs the client of the proper location.
- I can just silently clean up the request and carry on. The upside: clean logs with only one request. The downside: two (or more) valid locations for content. On the one hand, this just feels wrong to me, as technically speaking, /foo and //foo should be different resources (as per Uniform Resource Identifier: Generic Syntax, /foo and /foo/ are technically different resources, so why not this case?). On the other hand, this issue is generally ignored by most web servers out there anyway, so there's that precedent. On the gripping hand, doing this just seems like a cop out and blindly following what the web does.
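The three options can be laid side by side in a few lines of Python. This is only a sketch of the choices, not the server's actual code; the status numbers are the Gemini ones (20 success, 31 permanent redirect, 59 bad request), and using 59 for the “reject” case is an assumption, as are the policy names.

```python
import re


def handle(path, policy):
    """Return (status, path) for a request path under one of three policies."""
    cleaned = re.sub(r"/{2,}", "/", path)  # collapse runs of slashes
    if cleaned == path:
        return 20, path        # nothing unusual about the request
    if policy == "reject":
        return 59, None        # option 1: treat it as malformed
    if policy == "redirect":
        return 31, cleaned     # option 2: point at the canonical path
    if policy == "clean":
        return 20, cleaned     # option 3: silently serve the cleaned path
    raise ValueError(f"unknown policy: {policy}")
```

Only the “redirect” policy costs the client a second request, and only the “clean” policy leaves two working addresses for the same content.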
Well, how do current Gemini servers deal with it? Pretty much like existing web servers: most just treat multiple slashes as a single slash. I think my server is the outlier here. Now the question is: how pedantic do I want to be? Is “good enough” better than “perfect?”
Perhaps a better question is—why am I worrying about this anyway?
Sunday, May 01, 2022
A zombie site from May Days past
Given that today is May Day I was curious as to what I wrote on past May Days. And lo, sixteen years ago I wrote about OsiXs.org and their attempt to “change the world!” Amazingly, the website is still around, although with even less than there was sixteen years ago. I guess I was right when I wrote back then, “I personally don't see this going anywhere fast.”
It was a simple bug, but …
I was right about the double slash bug—it was a simple bug after all. The authors of two Gemini crawlers wrote in about the double slash bug, and from them, I was able to get the root cause of the problem—my blog on Gemini. Good thing I hedged my statement about not being the cause yesterday. Sigh.
Back in Debtember, I added support for displaying multiple posts. It's not an easy feature to describe, but basically, it allows one to (by hacking the URL, but who hacks URLs these days?) specify posts via a range of dates. And it's on these pages that the double slashed URLs appear. Why that happens is easy—I was generating the links directly from strings:
    local function geminilink(entry)
      return string.format("gemini://%s%s/%s%04d/%02d/%02d.%d",
              config.url.host,
              port, -- generated elsewhere
              config.url.path,
              entry.when.year,
              entry.when.month,
              entry.when.day,
              entry.when.part
      )
    end
instead of from a URL type. I think when I wrote the above code, I wasn't thinking in terms of a URL type, but of constructing a URL from data I already had. The bug itself is due to config.url.path ending in a slash, so the third slash in the string literal wasn't needed. The correct way isn't that hard:
    local function geminilink(entry)
      return uurl.toa(uurl.merge(config.url, {
        path = string.format("%04d/%02d/%02d.%d",
                entry.when.year,
                entry.when.month,
                entry.when.day,
                entry.when.part)
      }))
    end
and it wouldn't have exhibited the issue.
With this fix in place, I think I will continue to reject requests with the double slash, as it is catching bugs, which is a Good Thing™.
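The same slip is easy to reproduce in Python. The sketch below mirrors the shape of the two Lua functions without being a translation of them: the host and path values are assumptions (a path that both starts and ends with a slash), the port is left out for brevity, and the function names are invented.

```python
from urllib.parse import urlunsplit

# Assumed configuration values; the path both starts and ends with a slash.
HOST = "gemini.conman.org"
PATH = "/boston/"


def link_from_string(year, month, day, part):
    """Same shape as the first version: the literal slash before the path
    doubles up with the slash the path already starts with."""
    return "gemini://%s/%s%04d/%02d/%02d.%d" % (HOST, PATH, year, month, day, part)


def link_from_parts(year, month, day, part):
    """Build the path first, then let urlunsplit assemble the URL."""
    path = "%s%04d/%02d/%02d.%d" % (PATH, year, month, day, part)
    return urlunsplit(("gemini", HOST, path, "", ""))
```

With the date 2015/10/17 and part 2, the first function yields gemini://gemini.conman.org//boston/2015/10/17.2 and the second yields the same link with a single slash.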
Monday, May 02, 2022
Notes on an overheard conversation about tea
“You know, you forgot to remind me to make your tea.”
“Oh. I need to remind you make tea.”
“Sigh.”
“So thank you for reminding me to remind you to make tea.”
“…”
“Um, doesn't hitting your head against the wall hurt?”
Tuesday, May 03, 2022
I'm hoping this is a joke, because if it's not, I'm not sure what that says about our society
I finished my lunch of a sub sandwich when I notice a message printed on the wrapper in not-so-small print:
I have no words.
The legality of double slashes in URIs
Martin Chang replied to my musings on processing malformed Gemini requests, saying that double slashes in URIs are illegal, and pointed out the ABNF grammar from the URI specification to back up his claim:
    path          = path-absolute   ; begins with "/" but not "//"
    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    segment-nz    = 1*pchar
    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
But he didn't quote the segment rule:
    segment = *pchar
which translated says, “0 or more pchar rules.” So the ABNF he quoted does indeed rule out //boston/2018/07/04.2. It doesn't rule out /boston//2018/07/04.2, since by the time we hit the double slash, we're in the *( "/" segment ) part of the path-absolute rule, and segment can have 0 characters.
But what he quoted only applies to relative links; what I receive is an absolute link. If you follow the ABNF from that perspective:
    URI-reference = URI / relative-ref
    URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    hier-part     = "//" authority path-abempty
                  / path-absolute
                  / path-rootless
                  / path-empty
    path-abempty  = *( "/" segment )
    ; other rules omitted
not only does this allow gemini://gemini.conman.org//boston/2018/07/04.2 but gemini://gemini.conman.org///////////boston/2018/07/04.2. I can understand why this was done: to simplify the grammar, as the various path- rules generally end with *( "/" segment ), which allows one to end a URI with a trailing slash or not. I don't think the intent was to allow long strings of slashes, but that's the end result of a lax grammar.
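The two readings of the grammar can be checked mechanically. Here is a small Python sketch that turns the path-absolute and path-abempty rules into regular expressions; the character classes are transcribed from RFC-3986's definitions of unreserved, sub-delims and pct-encoded, and the helper name matches is made up.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?")

# path-abempty = *( "/" segment )
PATH_ABEMPTY = re.compile(rf"(?:/{PCHAR}*)*")


def matches(rule, path):
    """True when the whole path is derivable from the given rule."""
    return rule.fullmatch(path) is not None
```

A leading double slash fails path-absolute but passes path-abempty, which is the rule that applies once an authority is present.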
While Martin is also correct that multiple slashes are treated as a single slash on POSIX (basically, any Unix system), that's not the case across all operating systems. One exception I can think of is AmigaOS, where each slash represents a parent directory. This command, cd /// on AmigaOS is the same as cd ../../.. on a POSIX system. Crazy, I know. And maybe not even relevant these days, but I thought I should mention it.
Wednesday, May 04, 2022
Star Wars Day?
It's not Star Wars Day—it's Dave Brubeck Day! (and give yourself 10 cool points if you get the reference) Of course, it's only Dave Brubeck day in the US. Elsewhere in the world, Dave Brubeck Day is April 5th for some odd reason (give yourself a geek point for getting this reference).
[And of course Sean didn't tell you he pulled this meme from FaceMeLinkedInstaMySpaceBookWeInGram. He's not that cool to think of this. —Editor]
Tuesday, May 10, 2022
Springfield isn't the most popular city name in the US
OK, why do the Simpsons live in a town called Springfield? Isn't that a little generic?
Springfield was named after Springfield, Oregon. The only reason is that when I was a kid, the TV show “Father Knows Best” took place in the town of Springfield, and I was thrilled because I imagined that it was the town next to Portland, my hometown. When I grew up, I realized it was just a fictitious name. I also figured out that Springfield was one of the most common names for a city in the U.S. In anticipation of the success of the show, I thought, “This will be cool; everyone will think it's their Springfield.” And they do.
Matt Groening Reveals the Location of the Real Springfield | Arts & Culture| Smithsonian Magazine
So I got to wondering, is Springfield the most popular city name in the US? I know, weird question, but I'm curious. So some quick searching led me to the United States Geological Survey Geographical Names Database. With some massaging of the data, I was able to determine that there are 34 States with a “Springfield,” but it's not alone. There are eight other cities that are also in 34 States: Arlington, Chester, Clinton, Farmington, Florence, Greenville, Milton, and Newport. Okay, maybe not the same 34 states across all those cities, but you get the idea.
But those cities aren't the most popular names. No, all of them are tied for ninth place! The city name that appears in most states is “Riverside” at 46 States (plus Puerto Rico). The States that don't have a “Riverside” are Alaska, Hawaii, Oklahoma, and Louisiana (really? Louisiana? One of the world's largest rivers runs straight through that state, and no one bothered to name a town in Louisiana, “Riverside?”).
And just to satisfy the curious:
Place | Name | # States |
---|---|---|
1 | Riverside | 47 |
2 | Centerville | 43 |
3 | Fairview | 41 |
4 | Franklin | 40 |
5 | Midway | 39 |
6 | Georgetown | 37 |
 | Glendale | 37 |
 | Greenwood | 37 |
7 | Lincoln | 36 |
 | Marion | 36 |
 | Oakland | 36 |
 | Pleasant Valley | 36 |
 | Salem | 36 |
 | Union | 36 |
8 | Fairfield | 35 |
 | Lakeview | 35 |
 | Liberty | 35 |
9 | Arlington | 34 |
 | Chester | 34 |
 | Clinton | 34 |
 | Farmington | 34 |
 | Florence | 34 |
 | Greenville | 34 |
 | Milton | 34 |
 | Newport | 34 |
 | Springfield | 34 |
10 | Bethel | 33 |
 | Clifton | 33 |
 | Eden | 33 |
 | Glenwood | 33 |
 | Hamilton | 33 |
 | Kingston | 33 |
 | Lakeside | 33 |
 | Mount Pleasant | 33 |
 | Summit | 33 |
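The “massaging of the data” boils down to counting distinct states per name and letting ties share a place. A minimal Python sketch of that counting follows; it is not the procedure actually used on the USGS data, and it assumes the records have already been reduced to (name, state) pairs. The function name rank_names is hypothetical.

```python
from collections import defaultdict


def rank_names(records):
    """records: iterable of (place name, state) pairs.
    Returns [(rank, name, number of states)], ties sharing a rank."""
    states = defaultdict(set)
    for name, state in records:
        states[name].add(state)   # a set ignores repeats within one state
    counts = sorted(((len(s), n) for n, s in states.items()),
                    key=lambda item: (-item[0], item[1]))
    ranked, rank, previous = [], 0, None
    for count, name in counts:
        if count != previous:
            rank, previous = rank + 1, count
        ranked.append((rank, name, count))
    return ranked
```

Using a set per name matters because a single state can contain several places with the same name, and each state should only count once.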
Thursday, May 12, 2022
“This is how we do things around here.”
And, in fact, anyone with any proximity to software development has likely heard rumblings about Agile. For all the promise of the manifesto, one starts to get the sense when talking to people who work in technology that laboring under Agile may not be the liberatory experience it’s billed as. Indeed, software development is in crisis again—but, this time, it’s an Agile crisis. On the web, everyone from regular developers to some of the original manifesto authors is raising concerns about Agile practices. They talk about the “Agile-industrial complex,” the network of consultants, speakers, and coaches who charge large fees to fine-tune Agile processes. And almost everyone complains that Agile has taken a wrong turn: somewhere in the last two decades, Agile has veered from the original manifesto’s vision, becoming something more restrictive, taxing, and stressful than it was meant to be.
Part of the issue is Agile’s flexibility. Jan Wischweh, a freelance developer, calls this the “no true Scotsman” problem. Any Agile practice someone doesn’t like is not Agile at all, it inevitably turns out. The construction of the manifesto makes this almost inescapable: because the manifesto doesn’t prescribe any specific activities, one must gauge the spirit of the methods in place, which all depends on the person experiencing them. Because it insists on its status as a “mindset,” not a methodology, Agile seems destined to take on some of the characteristics of any organization that adopts it. And it is remarkably immune to criticism, since it can’t be reduced to a specific set of methods. “If you do one thing wrong and it’s not working for you, people will assume it’s because you’re doing it wrong,” one product manager told me. “Not because there’s anything wrong with the framework.”
Via Hacker News, Agile and the Long Crisis of Software
That last line, “it's not working for you, people will assume it's because you're doing it wrong,” rings really true to me. At The Corporation … no, I no longer work for The Corporation, I now work for The Enterprise now that the Corporate Overlords have finally taken over. So, at The Enterprise, I've been informing them pretty much all this year that this “Agile” development system they're forcing on us isn't working. Before they finally took over, the team I was on was always on time, on budget, smooth deployments (only two bad deployments in ten years) and no show-stopping bugs found in production. As I told upper management, given our prior track record, why change how we do development? Why fix what isn't broken? And while upper management never said this directly, through their actions they answered: this is our process, and we're sticking to it, slipped schedules and disastrous deployments be damned!
As to why I haven't left yet? Because it seems this “Agile” movement has invaded everywhere and things would be “more of the same” elsewhere. At least here, I'm not forced to use Windows.
Programming, up hill, both ways
People would come to us with a problem, and we would figure out a solution. We couldn't just search the web because the web was still being written. And you couldn't just punt a hard question to the engineer in the desk next to you. Why? Because you were sitting alone in a utility closet packed with floppy disks and old tape drives.
Ah, this takes me back. I got my first computer back in 1984, and if I wanted to know anything about it I was on my own. Google didn't exist (the public Internet didn't exist at the time). I didn't have anyone I could ask about computer related things. I did have books and magazines. So between experimentation and learning to read between the lines, I picked up programming.
So when it came time to write a metasearch engine, there were no tutorials. There were no open source metasearch engines to download and use. There was only the problem of writing a metasearch engine, in a language I didn't even know (and which itself was less than a year old at the time).
Fun times.
So I always found it odd when people would go online asking for tutorials, especially for writing metasearch engines (and yes, that did happen back then). So when something like testing a negative comes up, and I can't convince the Powers That Be that it's never a good idea to prove a negative, I can't just look up some tutorial on proving negatives—I just have to figure it out on my own.
Friday, May 20, 2022
If you have to embrace the stupid, you might as well do it well
Our customer, The Oligarchic Cell Phone Company, wants us to do a demo of a new feature for a certain class of clients. “Project: Lumbergh” will receive a URL along with the name and reputation of a phone number it gets from elsewhere. “Project: Lumbergh” will then pass this along to “Project: Sippy-Cup.” We already have to deal with URLs from elsewhere. The only change we have to make is allowing URLs to be passed along to the certain class of clients, which formerly did not get URLs. So far, so good.
But then I saw code being added to “Project: Lumbergh” to check the URLs to see if the path portion ended in .bmp. I enquired about this, because to me, that makes no sense. We're just a conduit for data; the source of the URL should already know what it can and can't send to the client. I was told that the certain class of clients only support BMP files while other clients that can receive URLs can't support BMP files, so we have to ensure that BMPs only go to the subset of clients that can support them. I countered with the fact that we include information about the client to the data source when we query them, and they should have the logic to handle this on their end. Why are we suddenly responsible for this? I was told that the LOF for the data source would be too large to handle by the demo deadline, that we had to handle it, that the code that just looks anywhere in the URL for a literal “.bmp” is Good Enough™, and to stop with the questions.
Now the URL we're given is “percent-encoded”; we get something like: https%3A%2F%2Fexample.com%2Fpicture.bmp. Never mind the fact that that is an invalid URL to begin with (you aren't supposed to encode characters that are defined as delimiters in URLs if they are, in fact, delimiting fields), that's what we get and pass along. Only now (a few years after we started passing URLs along like this) the clients can't properly decode them (surprise!), so of course we have to do that. I asked why we even had to do that and was told that the LOF for the data source would be too large to handle by the demo deadline, we had to handle it, and to stop with the questions.
I then complained that the code doing that was doing too much, as it would decode the so-called “unsafe characters” from RFC-3986 (which aren't defined in the RFC, but can be derived by a careful reading between the lines), like the dreaded space character. There was then much back and forth between me and my manager (it's not who I thought it was but that's another rant for another time) about what should and shouldn't be decoded. I kept saying that if we have to embrace the stupid, we might as well do it right, but my manager was arguing against doing that and said we should just decode %3A and %2F since that's all that's being asked of us today. I countered with “What about tomorrow, when we're asked to decode %3F (‘?’) and %40 (‘@’)?” (which are delimiter characters per RFC-3986). I was told to stop with the questions.
And then all hell breaks loose when we get https%3A%2F%2Fexample.com%2FThings%2520Go%2520Boom%2521.
Sigh.
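Why that last URL is trouble shows up quickly with Python's urllib.parse (a sketch only; the code at work is not Python, and the helper names here are invented). It compares decoding just %3A and %2F with a full decoding pass, shows what a second pass does, and contrasts a substring search for “.bmp” with a check on the parsed path.

```python
import re
from urllib.parse import unquote, urlsplit

RECEIVED = "https%3A%2F%2Fexample.com%2FThings%2520Go%2520Boom%2521"


def decode_delimiters_only(text):
    """Decode just %3A and %2F, the minimum asked for."""
    return re.sub(r"%(3A|2F)", lambda m: chr(int(m.group(1), 16)), text,
                  flags=re.IGNORECASE)


def is_bmp(url):
    """Check the path component only, rather than searching the whole URL."""
    return urlsplit(url).path.lower().endswith(".bmp")


once = unquote(RECEIVED)    # one full decoding pass
twice = unquote(once)       # a second pass changes the URL again
```

After one full pass the path still contains %20 and %21 as proper escapes; a second pass turns them into a raw space and an exclamation mark, which is a different URL. Whoever decodes has to know exactly how many times the string has already been decoded.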
Wednesday, May 25, 2022
URI encoding
I've fallen into a rabbit hole of URI encoding and decoding, and why not publish my results here so I at least have a place I know where I can look it up again. And who knows? Maybe someone else will find this useful.
Anyway, there are two standards that define URIs:
- Uniform Resource Identifier (URI): Generic Syntax (RFC-3986)
- URL Standard (WHATWG)
The first is from the IETF and what most non-browsers that deal with URIs use. The second is from the WHATWG (and while WHATWG stands for “Web Hypertext Application Technology Working Group,” I always read that as “What Working Group?” which gives away my opinions on this group, truth be told) and is the standard being pushed by the three major browsers left (Chrome, Firefox and Safari).
RFC-3986 is quite clear on when to encode and decode characters:
Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts. This is when an implementation determines which of the reserved characters are to be used as subcomponent delimiters and which can be safely used as data. Once produced, a URI is always in its percent-encoded form.
When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters. The only exception is for percent-encoded octets corresponding to characters in the unreserved set, which can be decoded at any time. For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.
Because the percent ("%") character serves as the indicator for percent-encoded octets, it must be percent-encoded as "%25" for that octet to be used as data within a URI. Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string.
RFC-3986, section 2.4: When to Encode or Decode
But you do have to read the ABNF carefully to find the 10 characters not mentioned that must be encoded. The WHATWG standard isn't easy to follow as it describes in all-too-verbose English the algorithm of how to encode and decode a URI, but it does cover what to encode and what not to encode. As I went through both standards and several other sources (links below), I've created the following table of what characters to encode (current as of this date), with a preference for RFC-3986 (but with notes where WHATWG diverges from RFC-3986):
char | type | scheme | auth | path | query | fragment | note |
---|---|---|---|---|---|---|---|
SPACE | | - | Y | Y | Y | Y | |
! | sub-delim | - | m | m | m | m | |
" | | - | Y | Y | Y | Y | |
# | gen-delim | - | m | m | m | m | 4 |
$ | sub-delim | - | m | m | m | m | |
% | escape | - | Y | Y | Y | Y | |
& | sub-delim | - | m | m | m | m | |
' | sub-delim | - | m | m | m | m | |
( | sub-delim | - | m | m | m | m | |
) | sub-delim | - | m | m | m | m | |
* | sub-delim | - | m | m | m | m | |
+ | sub-delim | N | m | m | m | m | |
, | sub-delim | - | m | m | m | m | |
- | unreserved | N | N | N | N | N | |
. | unreserved | N | N | N | N | N | |
/ | gen-delim | - | m | m | N | N | |
0 | unreserved | N | N | N | N | N | |
1 | unreserved | N | N | N | N | N | |
2 | unreserved | N | N | N | N | N | |
3 | unreserved | N | N | N | N | N | |
4 | unreserved | N | N | N | N | N | |
5 | unreserved | N | N | N | N | N | |
6 | unreserved | N | N | N | N | N | |
7 | unreserved | N | N | N | N | N | |
8 | unreserved | N | N | N | N | N | |
9 | unreserved | N | N | N | N | N | |
: | gen-delim | - | m | N | N | N | 2 |
; | sub-delim | - | m | m | m | m | |
< | | - | Y | Y | Y | Y | |
= | sub-delim | - | m | m | m | m | |
> | | - | Y | Y | Y | Y | |
? | gen-delim | - | m | m | N | N | |
@ | gen-delim | - | m | N | N | N | |
A | unreserved | N | N | N | N | N | |
B | unreserved | N | N | N | N | N | |
C | unreserved | N | N | N | N | N | |
D | unreserved | N | N | N | N | N | |
E | unreserved | N | N | N | N | N | |
F | unreserved | N | N | N | N | N | |
G | unreserved | N | N | N | N | N | |
H | unreserved | N | N | N | N | N | |
I | unreserved | N | N | N | N | N | |
J | unreserved | N | N | N | N | N | |
K | unreserved | N | N | N | N | N | |
L | unreserved | N | N | N | N | N | |
M | unreserved | N | N | N | N | N | |
N | unreserved | N | N | N | N | N | |
O | unreserved | N | N | N | N | N | |
P | unreserved | N | N | N | N | N | |
Q | unreserved | N | N | N | N | N | |
R | unreserved | N | N | N | N | N | |
S | unreserved | N | N | N | N | N | |
T | unreserved | N | N | N | N | N | |
U | unreserved | N | N | N | N | N | |
V | unreserved | N | N | N | N | N | |
W | unreserved | N | N | N | N | N | |
X | unreserved | N | N | N | N | N | |
Y | unreserved | N | N | N | N | N | |
Z | unreserved | N | N | N | N | N | |
[ | gen-delim | - | m | m | m | m | 2,3,4 |
\ | | - | Y | Y | Y | Y | 1 |
] | gen-delim | - | m | m | m | m | 2,3,4 |
^ | | - | Y | Y | Y | Y | 2,3,4 |
_ | unreserved | - | N | N | N | N | |
` | | - | Y | Y | Y | Y | 3 |
a | unreserved | N | N | N | N | N | |
b | unreserved | N | N | N | N | N | |
c | unreserved | N | N | N | N | N | |
d | unreserved | N | N | N | N | N | |
e | unreserved | N | N | N | N | N | |
f | unreserved | N | N | N | N | N | |
g | unreserved | N | N | N | N | N | |
h | unreserved | N | N | N | N | N | |
i | unreserved | N | N | N | N | N | |
j | unreserved | N | N | N | N | N | |
k | unreserved | N | N | N | N | N | |
l | unreserved | N | N | N | N | N | |
m | unreserved | N | N | N | N | N | |
n | unreserved | N | N | N | N | N | |
o | unreserved | N | N | N | N | N | |
p | unreserved | N | N | N | N | N | |
q | unreserved | N | N | N | N | N | |
r | unreserved | N | N | N | N | N | |
s | unreserved | N | N | N | N | N | |
t | unreserved | N | N | N | N | N | |
u | unreserved | N | N | N | N | N | |
v | unreserved | N | N | N | N | N | |
w | unreserved | N | N | N | N | N | |
x | unreserved | N | N | N | N | N | |
y | unreserved | N | N | N | N | N | |
z | unreserved | N | N | N | N | N | |
{ | | - | Y | Y | Y | Y | 3,4 |
| | | - | Y | Y | Y | Y | 2 |
} | | - | Y | Y | Y | Y | 3,4 |
~ | unreserved | - | N | N | N | N | |
1. WHATWG: “\” is treated as a “/” in path segment
2. WHATWG: character not encoded in path
3. WHATWG: character not encoded in query
4. WHATWG: character not encoded in fragment
Y | always encode |
N | never encode |
m | only encode when not used for their defined purpose (URI scheme dependent) |
- | not allowed, even escaped |
unreserved | characters that never need to be encoded |
gen-delim | characters defined as general use delimiters |
sub-delim | characters defined as a potential delimiter for subcomponents in a URI |
escape | character defined to escape other characters |
(blank) | characters not otherwise defined, and thus must be escaped |
Furthermore, any character not defined in the above table (character codes 0 to 31 and 127 or higher) must also be escaped.
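As a rough illustration of the table, here is a minimal sketch in C of the most conservative policy: leave the “unreserved” characters alone and percent-encode everything else, including control characters and anything 127 or higher. The function names is_unreserved() and percent_encode() are made up for this example, and the policy is an assumption on my part: it over-encodes the characters marked “m,” which a real encoder would treat differently depending on the URI component and scheme.

```c
#include <assert.h>
#include <stddef.h>

/* Characters classed as "unreserved" in the table above. */
static int is_unreserved(unsigned char c)
{
  return ((c >= 'A') && (c <= 'Z'))
      || ((c >= 'a') && (c <= 'z'))
      || ((c >= '0') && (c <= '9'))
      || (c == '-') || (c == '.') || (c == '_') || (c == '~');
}

/*
 * Hypothetical helper: copy src to dest, percent-encoding every character
 * that is not unreserved.  Returns the number of characters written (not
 * counting the terminating NUL), or -1 if dest is too small.
 */
int percent_encode(char *dest,size_t size,char const *src)
{
  static char const hex[] = "0123456789ABCDEF";
  size_t            len   = 0;

  assert(dest != NULL);
  assert(src  != NULL);

  if (size == 0)
    return -1;

  for ( ; *src != '\0' ; src++)
  {
    unsigned char c    = (unsigned char)*src;
    size_t        need = is_unreserved(c) ? 1 : 3;

    if (len + need >= size)     /* always leave room for the NUL */
      return -1;

    if (need == 1)
      dest[len++] = (char)c;
    else
    {
      dest[len++] = '%';
      dest[len++] = hex[c >> 4];
      dest[len++] = hex[c & 0x0F];
    }
  }

  dest[len] = '\0';
  return (int)len;
}
```

Encoding the string “a b/c~d” this way turns the space into %20 and the slash into %2F while leaving the tilde alone. That is the right thing to do for a single path segment, and the wrong thing to do if the slash was meant as a delimiter.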
References
- Uniform Resource Identifier Schemes
- URL Interop
- URL Specification
- A practical guide to URI encoding and URI decoding
- (Please) Stop Using Unsafe Characters in URLs
- Exploiting URL Parsers: The Good, Bad, And Inconsistent
Notes on an overheard conversation about The Great American Tag Sale with Martha Stewart
“I think Martha's spent too much time hanging with Snoop Dogg.”
“What makes you say that?”
“Look at her! Her dress, her hair, the 50-yard stare into nothing.”
“Maybe you're not used to seeing her at home.”
“Maybe … ”
“Besides, maybe she learned that while in prison.”
“Oh yeah! She did do time in the pokey, didn't she?”
Thursday, June 02, 2022
Optics
I just saw the following in my work email:
- From
- EMPLOYEE COMMS <EMPLOYEE-COMMS@XXXXXXXX>
- To
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- Subject
- XXX Security Tip: The PDF Attachment Scam
- Date
- Thu, 2 Jun 2022 17:56:44 +0000
Please see attached XXX Security Tip information.
[Image file of the company logo and the following text: Thank you, Any questions, or concerns, Contact us: security@XXXXXXXX or (hotline) +1-XXXXXXXXXXXX]
“Security isn’t something you buy, it’s something you do, and it takes talented people to do it right.”
The attached document?
A PDF file.
Tuesday, June 07, 2022
Would you rather go up against Stormtroopers, or Ewoks?
This is a cool little video about the age old question: “Are Stormtroopers bad shots?” It also manages to answer the question “Are Ewoks that dangerous?” You might be surprised by the answer.
FizzBuzz the metal rock power ballad version
Another video for you today: FizzBuzz in the Rockstar programming language, with the FizzBuzz riff, a series of notes based upon FizzBuzz.
You're welcome.
Wednesday, June 08, 2022
Knocking on silicon
It's Sunday, and I go to check my email, only I can't. For some reason, the mouse isn't moving to the Linux screen. Odd. Perhaps Synergy crashed or something, so it's a good thing I also have a KVM switch installed as well. Only the keyboard isn't working.
Sigh.
Swapping the keyboard connections on the KVM didn't work. Plugging in the keyboard cable for the Mac into the Linux system didn't work.
Sigh.
Okay, let's try rebooting. Only the machine didn't come back up.
Sunday had been bad enough. I decided not to think about the computer at all.
Until today.
It was a rough few days and not without some panic. While I do have a backup of the files on the machine, it's a few months out of date, and the backup drive is a native Linux file system, so even if I had a current backup, I wouldn't be able to read it on the Mac (Important note: transfer some critical files to the Mac). And I run my own DNS server on the Linux box, so I had to work around that for a few days.
It turns out the computer is mostly fine—it's just the video card got fried somehow. And now I'm having to:
- set up a regular backup scheme;
- find a replacement video card;
- and get used to one screen in the meantime …
Wednesday, June 15, 2022
Notes on an overheard conversation about some mail
“Oh look! Someone sent you a quarter!”
“No, dear, that's a nickel.”
“Well, I guess that shows you how long it's been since I've seen a coin.”
Thursday, June 16, 2022
Notes on another overheard conversation at a doctor's office
“That wasn't so bad. I didn't hear you scream this time.”
“You wouldn't believe the advancements they've made in ball gags.”
Saturday, June 18, 2022
I think this is a scam email, but I'm not sure how it works
I received the following email last night:
- From: Sergio Rios <XXXXXXXXXXXXXXXXXXXXX>
- To: Undisclosed recipients:;
- Subject: transfer done
- Date: Fri, 17 Jun 2022 22:43:09 +0000
Hi Carlo,
Trust you are having a good day. As earlier discussed in our last week meeting, your bitcoin wallet has been funded with 48 .99 BTC making a total of 1,433,296.04 USD. Please login with below details to confirm your BTC balance.
Website : XXXXXXXXXXXXXXXX
Customer ID : XXXXXXXX
password: XXXXXXXXX
I’ll be joining the team coming week for a symposium in Switzerland. Give me a call if anything else is needed.
Regards,
Sergio
I don't understand this email. It was sent to an email address I have that is the target of a lot of spam (it's the address I use for my domain registration and as I never opted to “hide” that email address, it's gotten around to a lot of spam lists), although the “undisclosed recipients” kind of gives it away as spam anyway. The website exists (I checked DNS) but I have not visited the site, so I don't know if the customer ID or password actually work. A quick web search on the domain name has revealed a lot of suspicion about the website, and a search on “Sergio Rios” doesn't reveal anything either.
So I have to wonder—what's the angle here? What's the scam? How is this supposed to work?
Monday, June 27, 2022
Cleaning cobwebs off the web browser
Firefox was giving me fits. It would work fine except when I quit the application, then it would just sit there consuming over 100% of the CPU (not hard given I have multiple CPUs on the machine). I even rebooted the machine on the off chance that Firefox thought it was running on a Windows box and not a Mac. But no, Firefox kept freezing when I quit the application.
I was afraid that somehow some critical file got corrupted and that I would have to nuke all my Firefox settings and start over again. But before I do that, let me just … maybe? clear the cache?
Yup. Clearing the cache fixed the issue. And upon thinking of it, I don't think I've bothered to clean the cache since … 2015? Really? It's been that long? Wow.
Tuesday, June 28, 2022
Notes on an overheard conversation on who will be driving
“Do you want to drive?”
“No thank you, I'd rather not drive to my doom.”
“It's not to your doom!”
“Okay, okay! I'd rather not drive to my mild annoyance.”
Wednesday, June 29, 2022
“These are not the child processes you are looking for.”
At The Enterprise, QA asked if they could have a tool that starts all our stuff up so they can do some performance tests (there are reasons they're asking for this, and why I agree with them that go beyond the scope of this entry). I replied I would see what I could do; it can't be any harder than what I've done so far. And I came across an interesting bug.
The program will take our existing test cases, generate all the data and output a list of all the phone numbers so QA can use whatever they use to generate appropriate traffic. Then it will start up all the appropriate programs and just sit there, monitoring the processes such that if any stop, it stops the rest of them. And then QA can run whatever they run to inject requests into the maelstrom at whatever rate they see fit.
The bug in question: due to how the code was being written, I was slowly moving the code that catches two signals, SIGINT (the interrupt signal) and SIGCHLD (a child process has terminated), closer and closer to the start of the program (for various reasons not germane to this entry). At one point, the program was always stopping because it thought one of the programs being tested had crashed when it hadn't. I was able to isolate it. This code:

    local tests = load_tests(arg)
    signal.catch('child')
    signal.catch('int')

worked, while

    signal.catch('child')
    signal.catch('int')
    local tests = load_tests(arg)

failed. I then had a look at load_tests() to see what in the world might be going on, when I saw this:

    os.execute("/bin/rm -rf dump/")
    -- other code
    local foo = io.popen("mkfoo lnp.dat","w")
    local bar = io.popen("mkbar sup.dat","w")
    -- other code

I was executing other programs to generate the data, and those processes exiting were sending SIGCHLD that the program (and I) were not expecting.

Huh … leaking abstractions for the bugs!
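The ordering problem isn't specific to Lua. Here's a minimal sketch of the same effect in C, assuming a POSIX system. Every name in it (on_sigchld(), set_sigchld(), run_helper(), sigchld_seen()) is made up for illustration, and run_helper() is only a stand-in for load_tests(): it forks a child that exits immediately and then reaps it.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t g_sigchld_count;

/* Only counts the signal; a real handler would note which child exited. */
static void on_sigchld(int sig)
{
  (void)sig;
  g_sigchld_count++;
}

static void set_sigchld(void (*handler)(int))
{
  struct sigaction sa;
  int              rc;

  sa.sa_handler = handler;
  sa.sa_flags   = SA_RESTART;   /* restart waitpid() if interrupted */
  sigemptyset(&sa.sa_mask);
  rc = sigaction(SIGCHLD,&sa,NULL);
  assert(rc == 0);
  (void)rc;
}

/* Hypothetical stand-in for load_tests(): run a helper process, reap it. */
static void run_helper(void)
{
  pid_t pid = fork();

  assert(pid >= 0);
  if (pid == 0)
    _exit(0);
  while((waitpid(pid,NULL,0) < 0) && (errno == EINTR))
    ;
}

/*
 * Returns how many SIGCHLDs the handler saw.  With catch_first set, the
 * handler is installed before the helper runs (the failing order above);
 * otherwise it is installed afterwards (the working order).
 */
int sigchld_seen(int catch_first)
{
  int seen;

  g_sigchld_count = 0;
  set_sigchld(SIG_DFL);
  if (catch_first)  set_sigchld(on_sigchld);
  run_helper();
  if (!catch_first) set_sigchld(on_sigchld);
  seen = g_sigchld_count;
  set_sigchld(SIG_DFL);
  return seen;
}
```

Installing the handler first means the helper's exit is reported as if it were one of the monitored processes. Installing it afterwards means the helper's SIGCHLD arrives while the default action (ignore) is still in place and is quietly dropped.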
Friday, July 01, 2022
“I'm sorry I mistakenly sent that to you. By the way, how are things with you?”
The fact that these scammers never include the pitch in their opening texts makes them seem confusing and mysterious. But the scam itself is an old and obvious one. If you respond (with “wrong number,” say) the scammer will attempt to draw you into conversation …
This is the first step in what is, at its core, an old-fashioned “romance scam,” in which the scammer exploits a lonely and/or horny person by faking a long-distance, usually romantic relationship. After the scammer has gained the trust of their victim, they convince them to transfer money, often for an investment; in some cases, the victim can be enticed into several successive transfers before they realize they’re being played.
Via Hacker News, What's the deal with all those weird wrong-number texts?
I think this explains the weird email I received two weeks ago. It's either that, or maybe (with low probability given the email sent) a form of “crypto drainer” as was suggested.
Wednesday, July 13, 2022
I think Mac OS-X is wrong in this case, and Linux is right
There was apparently a frantic bug-hunt involving “Project: Lumbergh” yesterday that I was not involved in. From the description of the bug, it certainly sounded like it was a manifestation of “undefined behavior” as “Project: Lumbergh” was acting differently between Linux and Mac OS-X (our testing and development platforms). The issue stemmed from this bit of code (not the exact code, but similar in nature):

    x = strtol(input,&endptr,10);
    if ((errno == 0) && (endptr != input))
      /* ... */

And the fix:

    errno = 0; /* BUG FIX */
    x = strtol(input,&endptr,10);
    if ((errno == 0) && (endptr != input))
      /* ... */
In fact, there are man pages that show setting errno to 0 in sample code. Here's what the C99 Standard says about library functions setting errno:

The value of errno is zero at program startup, but is never set to zero by any library function. The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.

C99 Standard, section 7.5.3

So no function will ever set errno to 0. To me, this sounds like some earlier failed function causing a false problem in this bit of code, thus the fix to set errno to 0. Now, the C99 Standard says this about strtol():

The strtol, strtoll, strtoul, and strtoull functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

C99 Standard, section 7.20.1.4.8
And indeed, on both Linux and Mac OS-X, for values outside the given range, errno is set to ERANGE. The issue happens when no conversion happens. Both systems return 0, but Linux doesn't set errno whereas Mac OS-X does. In fact, Mac OS-X sets errno to EINVAL, which isn't even defined in the C99 Standard (but it is defined for POSIX).

I think Mac OS-X has the wrong behavior here. errno is documented in the description of strtol() (but only for one error case), so Mac OS-X shouldn't be (in my opinion) setting errno when there's a conversion issue.

It may be a moot point though, as the fix appears to be working as intended on both systems.
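For what it's worth, here is one way to sidestep the difference entirely. This is a sketch of my own and not the code from “Project: Lumbergh”; the name parse_long() is made up. It clears errno first, decides “no conversion” purely from endptr, and consults errno only for ERANGE, the one case the C Standard documents:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns true and stores the value in *result if
 * input starts with a number that fits in a long.  Like the original
 * code, trailing characters after the number are not treated as an error.
 */
bool parse_long(char const *input,long *result)
{
  char *endptr;
  long  value;

  assert(input  != NULL);
  assert(result != NULL);

  errno = 0;                    /* no library function will do this for us */
  value = strtol(input,&endptr,10);

  if (endptr == input)          /* no conversion; ignore errno entirely */
    return false;
  if (errno == ERANGE)          /* the only errno value C99 documents here */
    return false;

  *result = value;
  return true;
}
```

Because the “no conversion” case never looks at errno, it doesn't matter whether the platform leaves it alone (Linux) or sets it to EINVAL (Mac OS-X).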
Monday, July 18, 2022
A screed against modern consumer electronics
So I'm watching this video on a mouse/scanner combo thing when at the very end, Cathode Ray Dude goes on a rant about the companies that make modern consumer electronics that I found amusing. I also think it's a sad state of affairs that I agree with his sentiment that most consumer electronic companies exist just to get bought out and not to sell a viable product.
Wednesday, August 31, 2022
We could have skipped writing a program
A bit of background—“Project: Sippy-Cup” uses data from a single column from a database to do its job. It doesn't query the database directly since we have a tight deadline, so there's a custom binary file that contains around 100,000,000 records, each record having a unique key and a 32-bit value. It doesn't matter what the key or the value is, just that this file exists. So, with that out of the way …
I was at lunch today with some fellow cow-orkers. Talk turned towards a QA engineer who was tasked by my friend TS (a senior QA engineer) to write a program to scan the data file used by “Project: Sippy-Cup” and count each unique value. I had written such a program in Lua (which worked by directly reading the binary file itself—easy enough since “Project: Sippy-Cup” is in Lua and has to read the binary file). TS wrote one in Python to do the work from a text dump of the binary file. The text output is just:
    unique-key-1 = value
    unique-key-2 = value
It's not hard to parse, it's just that the text dump is 100,000,000 lines long.
The QA engineer in question couldn't get his program to work.
It was only after lunch that I realized that none of us had to write a program. No, all it would have taken was running:

    GenericUnixPrompt> dump-proprietary-data -s Project-Sippy-Cup.data \
            | awk '{print $3}' \
            | sort \
            | uniq -c \
            | sort -rn \
            > /tmp/report.out
Sigh.
Wednesday, September 21, 2022
Just how much telemetry does The Enterprise need from my work laptop?
I couldn't get rid of Satan, the useless Windows Laptop fast enough. At the end, just turning on Satan swamped the network connection here at Chez Boca to be near useless. Good riddance.
Today, I turn on Satan's replacement, Belial, the annoying Mac Laptop. I'm not sure what The Enterprise is doing to it, because as soon as I turned on Belial, the network connection here at Chez Boca dropped to near zero.
At first, I thought it might have something to do with the weather, but on a hunch, I turn Belial off and the network becomes stable and usable. I turn Belial back on, and the network goes crazy again.
Sigh.
Thursday, September 22, 2022
So when did POP and IMAP become a “legacy protocol?”
I just received the following email at work:
- From: Enterprise Services <XXXXXXXXXXXX>
- To: Conner, Sean <XXXXXXXXXXXXXXXX>
- Date: Thu, 22 Sep 2022 19:27:12 +0000
- Subject: Legacy Email Protocols to be Retired
To All [The Enterprise] Staff,
Please be aware that Microsoft is disabling the use of several legacy protocols related to old methods of retrieving and sending email in the coming weeks.
What does this mean for you?
If you are using Outlook as [The Enterprise] provisions it, you do not need to do anything. If you use an alternate email program that relies on POP3, IMAP or SMTP, like native mail programs in iOS and Android, expect your [The Enterprise] email connection for that app to stop functioning when Microsoft chooses to disable these protocols.
What should you do?
[The Enterprise] only supports the use of the approved email client Outlook, Outlook 365 available in our [The Enterprise] tenant, or Outlook for iOS and Android. If you are not using Outlook, please switch today. If you require assistance, please contact XXXXXXXXXXXX.
Instructions for installing Outlook for mobile devices can be found here [Link to internal documentation removed. —Editor].
Thank You,
Enterprise Services
On one level, this doesn't bother me. I'm using the web version of Lookout (I assume that's the Lookout 365 for The Enterprise tenant they mention, at least, I hope so). I also don't check work email on my phone—never have, and I don't have plans on starting that any time soon either.
But on another level, this is concerning. Even though Microsoft announced this three years ago, it comes across as locking email down into a more centralized, proprietary system. I do have to wonder how long until Google decides that only certain clients can connect with Gmail? You know, for “enhanced security” or a “better experience.” I don't use Gmail, but I do have concerns about my ability to run my own email server and general interoperability with the large email providers like Google and Microsoft.
Update on Friday, September 23rd, 2022
This was posted to Lobsters, so go there for some more commentary.
Update on Friday, November 25th, 2022
I just found out this made Hacker News. And of course half of the comments are about the lack of HTTPS on my site. Heh.
Saturday, September 24, 2022
You move sixteen tons, and what do you get? Another inch over and deeper in exhaustion
So for reasons, Bunny and I are moving items from one storage unit company to another, and as part of that move, we're consolidating into a larger storage unit. We have a 10′×15′ storage unit (3m×5m for those who use sane units) with a garage type door.
The shelving units we use are these plastic module units that are easy to knock down and put back up. We have one wall lined with shelves already. Yesterday, we moved three more shelves into the unit along the opposite wall. Bunny was concerned about having enough space to close the garage door, but I assured her we had plenty of room by laying down one yet-to-be-installed shelf on the floor and showing that it fit into the space and didn't extend beyond the opening.
So today, we were moving another shelving unit into the new unit. I put up the shelving unit and went to test the placement by shutting the garage door.
That's when I found this small flange on the bottom of the door. It's only about an inch wide (2.5cm) but it was wide enough to hit the newly installed shelving unit when closing the door. Then I was looking at three full shelving units that needed to shift over an inch or two (5cm). What's the saying? Measure twice, cut once?
Yeah.
So with no other option, I started to unload the shelves …
Sigh.
Monday, September 26, 2022
Outvoted
So without going into too much detail, there was a disagreement about the implementation of a feature at The Enterprise. The “feature” is just marking a particular type of account and having the ability to test it. At first, it was a disagreement between two people, one who wanted the feature supported, and the other who didn't, and the one who didn't want the feature implemented won by being more stubborn. But when the rest of The Enterprise (or at least multiple other departments and way more than just the initial two people involved) want the feature to be implemented, it seems unwise to me for the person who insists on not implementing it to double down on not implementing it.
Wednesday, October 05, 2022
Thus spake the master programmer: “time for you to leave.”
Read enough of my posts over the past year or so, and it's clear that I am not happy working at The Enterprise. The process über alles, the overly managed and useless laptops, the bad communication (which I don't think I've mentioned, but man, I didn't expect the telephone game to be an actual strategy of a company), the so called “agile development” that is anything but agile, the twice daily scrum meetings (because my manager wanted his own scrum meeting with just the team with no other departments involved—that's the other daily scrum meeting), and the testing.
Oh god the testing.
Everything is about the testing.
Testing über alles.
And as for my actual job—development? I have modified a grand total of 71 lines of production code over a period of six months, about a third of which was rejected in code reviews as being “too much of a code change.”
So on August 26th during my one-on-one with my manager, where the topic of conversation drifted towards testing (yet again), I had had enough and decided to leave The Enterprise as I felt like I wasn't a cultural fit. I made my intentions clear on Monday, August 29th, and immediately took all my remaining time off (three weeks worth), followed by the standard “two weeks notice period,” where I was in multiple “transfer-of-knowledge” meetings. It's indicative of the thought process of The Enterprise that most of the “transfer-of-knowledge” meetings were about … testing. Or rather, the testing tools I had written and how they work.
It was time for me to leave. There were a few red flags indicating that perhaps I should have left earlier (such as the rest of my team leaving the company at the same time) but after twelve years, it was probably time.
Yesterday was my last day at The Enterprise. Today is the first day of a long needed rest. Now I just have to figure out what to do with the error code from the trap frame …
Thursday, October 06, 2022
Get thee behind me, Belial
I walk into the Computer Room at Chez Boca to find a box sitting on my chair. It's a sealed and empty box that was shipped to me via FedEx. It can only mean one thing—it's the box to ship Belial, the annoying Mac Laptop back to The Enterprise. I was talking to The Enterprise about this on Tuesday (my last day there) and what do you know, here's the box.
So now Belial is packed and tomorrow I shall drop it off for its voyage home.
Saturday, October 08, 2022
What is a “unit test?”
Despite the emphasis on testing at The Enterprise, no one there was able to answer the simple question I would often ask, “what is a unit test?”
On thinking about it since I left, I don't think there's an answer to that question. I'm thinking it really depends upon the language being used, and it's a similar concept to Design Patterns, a collection of patterns seen in Smalltalk development and later forced onto other languages, applicability be damned.
Since most of the coding I do is in C, a “unit” would most likely be a function, or maybe a collection of functions known colloquially as “a library.” The various components I worked on, like “Project: Lumbergh” or “Project: Sippy-Cup” aren't libraries, and most functions in those projects are single use that exist just for organizational sake, so of course the “unit” ended up being the entire program.
But I'm also looking at some of my own projects,
like mod_blog
.
There's a fair number of stand-alone functions in here I could possibly “unit test” if I were inclined.
The first one is this function (found here):
    int max_monthday(int year,int month)
    {
      static int const days[] = { 31,0,31,30,31,30,31,31,30,31,30,31 } ;

      assert(year  > 1969);
      assert(month >    0);
      assert(month <   13);

      if (month == 2)
      {
        /*----------------------------------------------------------------------
        ; in case you didn't know, leap years are those years that are divisible
        ; by 4, except if it's divisible by 100, then it's not, unless it's
        ; divisible by 400, then it is.  1800 and 1900 were NOT leap years, but
        ; 2000 is.
        ;----------------------------------------------------------------------*/

        if ((year % 400) == 0) return 29;
        if ((year % 100) == 0) return 28;
        if ((year %   4) == 0) return 29;
        return 28;
      }
      else
        return days[month - 1];
    }
I'm sorry, there's no way I'm going to even waste time writing unit tests for a function this simple. I didn't bother when I first wrote it in Debtember of 1999, and there's no point in writing one now. Even if the leap year rules change in 1,980 years, I probably still won't write unit tests for this function (probably because I'll be dead by then, but that's beside the point).
But that's not to say there aren't other functions that could be “unit tested.” The next one I have in mind is simple, but I would love to see a unit test purist tell me how they would write a unit test for it.
Update on Friday, Debtember 23rd, 2022
Discussions about this entry
- Re: What is a unit test
- Re: What is a "unit test"?
- Re: What is a "unit test?"
- Re: What is a “unit test?”
Monday, October 10, 2022
Non-terrestrial calendars
The Martian Business Calendar describes a hypothetical calendar for Mars. It's only marginally more complex than the Gregorian calendar we currently use and the tradeoffs made are interesting to read about (it includes a “leap week” instead of our “leap day” and the reasons for it are both rational and irrational at the same time, depending upon your point of view). But what I would like to see is a Venusian calendar—what tradeoffs have to be made as the Venusian day is longer than the Venusian year.
An answer to my question about unit tests
I was browsing Gemini when I came across a response to my unit test question:
Sean Conner poses this question.
The answer is actually more sensible in C than it was in Smalltalk: a unit is a compilation unit. In C, it is a file.
Any changes to source will require changes to a file. Once a source file is altered, it may screw something up in the resultant binary. Therefore, there should be a unit test to check that the altered unit behaves as expected.
…
The easiest way to think of it in C is: assume make's view of the system.
That is not a bad answer for C. In fact, it's probably not a bad answer for several different languages. The only clarification I can see being made is to only test non-static functions (functions that have visibility outside the file they're defined in) and not have specific tests for static functions (functions that only have visibility to code in the C file) to allow greater flexibility in implementation and prevent tests from breaking too often.
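To make that distinction concrete, here's a small sketch of what such a “unit” might look like. It is not code from any of my projects: the names is_leap_year(), days_in_month() and test_days_in_month() are invented for the example, and in practice the test function would live in its own file and link against this one.

```c
#include <assert.h>
#include <stdbool.h>

/* static: only visible inside this file, so no test refers to it by name */
static bool is_leap_year(int year)
{
  if ((year % 400) == 0) return true;
  if ((year % 100) == 0) return false;
  return (year % 4) == 0;
}

/* non-static: the interface the rest of the program (and the test) sees */
int days_in_month(int year,int month)
{
  static int const days[] = { 31,28,31,30,31,30,31,31,30,31,30,31 };

  assert(month >=  1);
  assert(month <= 12);

  if ((month == 2) && is_leap_year(year))
    return 29;
  return days[month - 1];
}

/* The test exercises the unit only through its non-static function. */
void test_days_in_month(void)
{
  assert(days_in_month(2000, 2) == 29);  /* divisible by 400          */
  assert(days_in_month(1900, 2) == 28);  /* divisible by 100, not 400 */
  assert(days_in_month(2024, 2) == 29);  /* divisible by 4            */
  assert(days_in_month(2022, 2) == 28);
  assert(days_in_month(2022,10) == 31);
}
```

If is_leap_year() were later folded into days_in_month(), or replaced with a table, the test would not have to change, which is the flexibility I'm after.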
Wednesday, October 12, 2022
The clock that came in from the cold
The other day, my grandfather clock wasn't chiming correctly, so I took off the top portion (I think it's called a “hood”) and placed it on a nearby chair. That was not a good idea as a few moments later, as I had my nose buried in the internals of the clock, I heard a horrible crash as the hood fell off the chair and was damaged.
Sigh.
Bunny and I dropped off the hood at Josef & Joseph for repair. Back home, Bunny found a box, cut the side off of it, and placed it on the clock to keep the dust out of the clockworks.
Maybe it's just me, but it looks like it's visiting from Canada.
Tuesday, October 18, 2022
These have to be legit offers for writers
I received the following two emails just minutes apart, although one was sent much earlier than the other. The first one I received (names have been changed; capitalization has not):
- From: Ken Lee <XXXXXXXXXXXofficial32@gmail.com>
- To: Sean <sean@conman.org>
- Subject: Guest post proposal
- Date: Tue, 18 Oct 2022 01:20:19 -0400
Hi,
My name is Ken lee. I was wondering: do you accept guest posts on http://www.conman.org/? I’ve been brainstorming some topics that I think your readers would get a ton of value from my post. I am already writing regularly for techstuff.example.net.
If you are interested please revert back to this mail.
All the best
And the second one (again, names have been changed, capitalization has not):
- From: Nosmo king <XXXXXXXXXXXXXofficial@gmail.com>
- To: Sean <sean@conman.org>
- Subject:
- Date: Mon, 17 Oct 2022 22:31:09 -0700
Hi,
My name is Nosmo King. I was wondering: do you accept guest posts on ? I’ve been brainstorming some topics that I think your readers would get a ton of value from my post. I am already writing regularly for techstuff.example.net.
If you are interested please revert back to this mail.
All the best
Nice to know their email addresses are their “official” email addresses.
And yes, the second one to arrive was sent before the first one; it just took its time getting to me. It also seems the first sender had … um … issues with sending the email as the subject line is missing, and there's no mention of my website. I did check techstuff.example.net and found Ken Lee (not his real name) has written articles there. And the photo attached to Ken Lee appears to be him. I did not find any writer named Nosmo King (again not his real name) on the site. Perhaps he's a new writer there? I just found it amusing that two “different” writers, writing for the same site, decided to send me the same email shilling their work.
I'm also wondering if they expect me to pay for these articles, or are they doing it just for the exposure?
I'm thinking they're expecting to get paid.
Wednesday, October 19, 2022
The perils of selling out
I received some email from Remy about my post yesterday where they sent along some related links. They received their own badly written sponsored post email, and also linked to Kev Quirk's badly written sponsored post email. I was then reminded of the time I sold out to get that sweet-sweet sponsored money (it wasn't much—about $100 for seven ads) and the aftermath five years later.
Monday, October 24, 2022
Committing to the bit
Earlier this year I had to commit to a time for our yearly trip to Brevard and due to some deadlines at The Enterprise, the last week of October appeared to be the best time to go. Yet I was apprehensive about it because I had already used an unplanned week for my own mental health at the insistence of my second line manager (who I thought was my new manager but turned out not to be the case, which I still have to write about) because of the increasing amounts of stupidity and this would leave me less time to take off in Debtember (first world problems, I know).
Of course things were resolved at The Enterprise when I left for good and yet, here was the looming trip to Brevard. In the immediate aftermath of leaving work, I was unsure if I still wanted to go, and Bunny (who loves to travel) kept asking if we were going. I said yes, but even as late as last week, Bunny came and said that if I didn't want to go, we didn't have to.
But as I told Bunny, I had committed to the bit after watching this John Green video. Despite the grueling twelve hour drive, the freezing temperatures (at least to us Floridians) and a change in staying location (more on that after we get there), I think the change of scenery (we're hitting peak leaf season! Woot!) and the break in routine is something I desperately need at this point.
Today, we're headed towards The Emerald City Brevard!
We have arrived!
We've arrived in Brevard! The trip itself went smoothly. Well, except for one small disagreement about where the I-95N on-ramp was located when we were somewhere in Georgia getting gas. And the 18-wheelers hogging up the highway in South Carolina lowering our average speed for the trip. Oh, and there was the bumper-to-bumper traffic just south of Hendersonville on I-26W. Other than that, what have the Romans ever done for us? the trip was just smooth sailing.
When Bunny and I first started our yearly trips here, we stayed in The Inn at Brevard, on the east end of town. A few years later we started staying at The Red House Inn, located on the west end of town. Unfortunately, the owners sold the place just after our visit last year, and now the Red House Inn is a private residence. The previous owners still have property they were willing to rent out, but they're houses, and Bunny and I don't need an entire house for a vacation.
So we decided to try a new bed and breakfast, this time back on the east side of town, The Bromfield Inn, about a block away from The Inn at Brevard.
The suite we have feels like it's the size of Chez Boca.
And it includes His and Hers changing areas:
And the bathroom … well, the bathroom is getting a post of its own.
Update on Friday, November 4th, 2022
Extreme bathrooms, Brevard NC edition
So, the bathroom in our suite at The Bromfield Inn. The only time I've seen as large a bathroom was at the house of John the paper millionaire of a dotcom. The entrance:
Okay, it's the entrance to the changing rooms and then the bathroom. What you can't see is just how large it is.
Here's the vanity from the entry:
And the reverse shot, with me standing in the bathtub in an attempt to get back far enough to get a picture:
And because I've yet to fit it all in, the bathtub:
But in addition to the his-and-her vanity, the bathtub built for two, and the makeup desk, we have the shower and water closet!
And again, I have to stand in the bathtub to get this shot of the shower. I think it's mandatory for bed-and-breakfasts here in Brevard to have multiple shower heads. I'm not complaining; neither is Bunny.
And finally, we have the water closet with a piece of equipment I've only heard of and, until now, had never actually seen:
A bidet!
I did not realize just how high that shoots water. Just letting you know.
And that ends our tour of the cavernous bathroom in our suite.
Seriously, the bathroom is about the size of our family room back at Chez Boca. It may even have its own area code. I may have to ask about that …
Update on Friday, November 4th, 2022
Tuesday, October 25, 2022
Extreme shopping carts, Brevard, NC edition
It was a quiet day today. We got up late, stopped by WallyWorld for some incidentals, had some food, then went back to The Bromfield Inn to rest.
But late in the evening we stopped by Ingles for some snacks. Ah, Ingles. You are always good for some extreme stuff (for the record, the General Interest reading section hasn't changed a bit) and this time is no different:
I was half-expecting some Aussie to come strolling up with a 10′ (3m) cart and saying, “That's not a cart, this is a cart.”
Wednesday, October 26, 2022
Extreme mazes, Brevard NC edition
It's late October. I'm in Transylvania County. And I come across this sign:
This place? This time? I think I'll skip the labyrinth.
Thursday, October 27, 2022
Enjoying the Big Blue Room
The sun is out, there's not a cloud in the sky, and the temperature is cool but not unbearably so (for a Floridian) and I'm sitting out in, I guess for lack of a better term, the garden of The Bromfield Inn.
The only sound is the babbling of a fountain behind me. And before I could get the camera out, one of Brevard's famous white squirrels scurried past me. The entire scene is making me want to call out, “Jeeves! More tea please!”
Update at 5:02pm
A nearby church is giving an impromptu concert with the church bells. I wasn't aware that American churches even had bells anymore. How neat!
Update on Friday, November 4th, 2022
Extreme lizards, Brevard NC edition
“Ah yeah. Ooh ahh. That’s how it always starts. Then later there’s the running and the screaming.”
Friday, October 28, 2022
Extreme puzzle room, Brevard NC edition
So there's this, embedded into the wall:
It can't be a remote, since it's not … um … remotable. It's part of the wall. Even weirder, it's in the bathroom!
It might make sense if it had options like “water temperature” and “shower” or “bidet” but no, it looks like it would control a media center that doesn't seem to exist anywhere in the suite. I asked the owner about it, and even she was clueless. Oh wait! This is an older house … could there possibly be a hidden room?
Update on Friday, November 4th, 2022
Saturday, October 29, 2022
Extreme pumpkin, Asheville NC edition
Bunny and I visited the Western North Carolina Farmers Market in Asheville. Pumpkins were everywhere, but then there was this one:
No need to carve that pumpkin; any spirits seeing this will flee away, screaming.
Sunday, October 30, 2022
Extreme bridge, Brevard NC edition
Today was a pretty miserable day here in Brevard, with it being both cold and rainy. Bunny and I didn't really go out much today. But a few days ago, when I sat in the garden, I also took a stroll around the grounds here at The Bromfield Inn. And it was on the grounds that I found this bridge:
I do have to wonder what, exactly, is the point of a bridge to nowhere. Maybe this is some Senator's idea of a make-work pork spending project.
Update on Friday, November 4th, 2022
Monday, October 31, 2022
Extreme clouds, Brevard NC edition
Still rainy here in Brevard and the clouds are very low in the sky here:
There is no real skyline anymore here, just trees disappearing into the mist …
Extreme, no, seriously, I mean it, extreme head in the clouds, Cleveland, SC edition
The clouds lifted, the sun came out, and around 3:30pm, I said to Bunny, “Let's go to Pretty Place!”
Pretty Place, aka The Fred W. Symmes Chapel, is a church up on the side of a mountain, just across the North Carolina/South Carolina state line. It's 11 miles (18km) “as the crow flies” from The Bromfield Inn, or 17½ miles (28km) as the car drives. And I implore you, gentle reader, to check the link out, to get into the mindset Bunny and I were in as we headed out to Pretty Place.
Little did we expect fog. One minute, it was sunny. Then we rounded a bend and:
The last time I encountered fog this thick it was the late 90s, I was driving along Florida state 60 doing 70mph (110kmh) at 1:30am trying desperately to avoid a semi-truck reenacting “The Duel”. And the time before that I was in elementary school, being driven to Brevard Elementary School by my mom. But this? Today? This took us completely by surprise.
It gave the entire place this ethereal feel to it:
And then we entered the chapel and were greeted with:
It was surreal. You look out, and there's nothing but this white void.
Driving back down, the fog just … vanished … just as quickly as we entered it going up.
Tuesday, November 01, 2022
Extreme fortunes, Brevard, NC edition
Last week, Bunny and I ate at the Twin Dragons Grand Buffet (and if there was a website, I'd have linked to it). After dinner I checked my fortune cookie and lo:
This is the first time I've gotten a non-English Chinese fortune cookie fortune. Perhaps it says “Please savor your choosen bell appetizer” or something like that—I took German as a foreign language, not French.
Wednesday, November 02, 2022
There and back again
Not much to report other than we made it home. I-95S in South Carolina was a horror show, what with a segment where it took us nearly an hour to travel 3 miles, and the I-95 exit to our house was closed off, but other than that, it was a long and gruelling trip.
Friday, November 04, 2022
Extreme disappointment, Brevard NC edition
Bunny and I chose The Bromfield Inn because The Red House Inn was no longer, having been sold last year. The Bromfield Inn is beautiful, but it turns out it's not suitable for us.
The owners are new to the bed-and-breakfast business, having bought The Bromfield Inn just a few months ago (and after we had made reservations with the previous owners). The owners are a mother and daughter team, and thus, I will refer to them as Mother and Daughter. Mother lives on site and currently manages the inn, while Daughter is still in Florida closing up her real estate business there.
Anyway, as Bunny wrote in a follow-up email to The Bromfield Inn:
- From
- Bunny <XXXXXXXXXXXXXXXXXXXXXX>
- To
- The Bromfield Inn <XXXXXXXXXXXXXXXXXXXXXXXX>
- Subject
- Re: Thank you for your stay
- Date
- Thur, 3 Nov 2022 16:01:00 -0400
Good morning!
We did indeed enjoy most of our stay at your beautiful inn. However, we will have to seriously consider our options for future visits. The deal breaker is the lack of heat in the room. Neither of us is wired for being cold, and we were very cold most of the time. Except when we were in a hot, steamy shower! I would imagine that even in July, the temperature on our floor would be set for A/C, 70 degrees [21°C —Editor] or so, which would mean we would have the same problem then as we did in October. Not sure this is a good fit for us.
Other things include the lack of access to a microwave to reheat what we would bring home from our dinners. We also use a great deal of ice every day for our water and soft drinks. You have no ice machine for your guests. That matters to us.
Our suite was lovely, especially the dressing room area and the roomy shower. However we would trade all of that for temperature control in our room. We have never stayed anywhere in our 15 years together where we could not control our room temp. We were quite surprised. We would not have decided to stay at your inn, had we known there was no heat in the room. It never entered our minds.
We appreciate your proximity to downtown, and Sean enjoyed sitting outside by the fountain, as long as it was sunny. As soon as the sun went behind the trees, he came inside. It was lovely for him to be able to have his bi-weekly game at your dining table in front of the fireplace. Heavenly. But you can see the problem. He was comfortable by the fire. We aren't wired for these temps, and we acknowledge that people come to Brevard precisely for these cool days and nights. We are the exception. We realize that. But, again, we have never experienced a room with no heat.
Mother's hospitality was beyond belief. She is a jewel. We tried to work with her as much as possible. You probably don't have many guests who spend most of their visit in their rooms, and we tailored our activities to correspond with her schedule whenever we could. We especially appreciated that she printed my NYT crossword puzzles.
We will keep you in mind! Thanks, again!
Bunny and Sean
Bunny felt bad about not wanting to stay there and felt it necessary to inform The Bromfield Inn of our decision. Explanatory feedback like this is more important than a good review. For someone to take the time to write about the difficulties shows they really care about the service (or stay, or product or whatever). How you reply to such criticism is important. For example, this is not the reply you want to send back:
- From
- The Bromfield Inn <XXXXXXXXXXXXXXXXXXXXXXXX>
- To
- Bunny <XXXXXXXXXXXXXXXXXXXXXX>
- Subject
- RE: Thank you for your stay
- Date
- Fri, Nov 4 2022 09:47:45 -0400
Bunny,
So sorry to hear this. Maybe Key West, FL temperature would be better for your trips and an air bnb seems more to your style.
We follow hotel industry standards to keep a building temperature with multi-guests. We set our units at 70 degrees [21°C —Editor] in the winter and 76 degrees [24°C —Editor] in the summer months. The U.S. Department of Energy suggests thermostat settings be 68 degrees F [20°C —Editor] or lower in the winter and 78 degrees F [26°C —Editor] or higher in the summer. I know everything was done to keep you and Sean comfortable. We have rooms that have fireplaces so that might be an option as well.
We won’t ever have microwaves in any rooms or common areas. The smell of food thru out the Inn that is not controlled via our kitchen isn’t something other guests want to experience. We also don’t want to attract ants, etc. from food being in the rooms.
Unfortunately, on our end, you had a red pj set that bleed on to our sheets. Our sheets are bamboo and a set costs $325, which I will need to charge back to you. Let me know if you have a different credit card you want charged? I am happy to mail you the now pink king size sheet set but clearly they can’t be used anymore.
Sincerely,
Daughter
Mother
Innkeepers
The Bromfield Inn
www.thebromfieldinn.com
828-577-0916
Bunny took offense at the Key West line, and was livid at the additional $325 charge for the ruined bedsheets (an accident, and certainly not intentional).
I realize that it may be difficult and expensive to retrofit a house to have separate climate control for each room, but even at The Red House Inn (another formerly private residence turned bed-and-breakfast) they were able to provide us with some spot heaters to take the edge off the temperature. And they had ice.
Also not mentioned was the one-ply toilet paper. At the prices we were paying to stay there I would have expected better, but I'm sure that it too, was “hotel industry standards” to use such rolls. If you are going for an up-scale ambiance, it's a weird thing to pinch pennies on.
Suffice to say, we won't be staying there again, and we won't be recommending it to anyone either. Sad, because aside from a few issues that could be easily addressed, it is a nice place.
Tuesday, November 08, 2022
And hopefully, this means I stop getting SMS spam from politicians
It's political season. Not to be confused with deer season (or rabbit season, or duck season or even gator season, much to the dismay of many). And it's the second Tuesday of November on an even year, so it's also Federal political season.
Lovely.
The sun was out, the weather was cool (for Florida, which means the asphalt is only slightly soft from the heat) and as usual, I walked to the polling station. It wasn't crowded at all and it only took a few moments to fill out the ballot.
And while this should mean I no longer get SMS from politicians slinging mud at each other, there are still a few days of dire SMS warnings about the end of the world.
Sigh.
Wednesday, November 23, 2022
A rabbit hole of webmentions
How hard could it be?
It's relatively straightforward,
or so I thought until I started going down this particular rabbit hole yesterday.
The first stumbling block is sending a webmention.
The protocol itself isn't that tough—just send a POST
to an endpoint with the URL of my post,
and the URL of the post I mentioned and that's pretty much it.
But the issue I have is that I tend to link freely.
This one paragraph post has six links in it!
When I exclude links back to my own blog,
there are still three external links.
Do I check each one?
There are plenty of posts
(like this one)
where sending a webmention isn't something I want to do.
So I'm having a bit of analysis paralysis on how exactly I want to handle this.
For now,
it's a manual process.
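As a rough sketch of the sending side (not the code this blog actually uses), the request is just a form-encoded POST carrying two fields, `source` and `target`. The helper name and the example URLs are made up for illustration, and discovering the endpoint for a given target is left out entirely:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention_request(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) the POST request for one webmention.

    endpoint: the receiver's webmention endpoint (assumed already discovered)
    source:   URL of my post that contains the link
    target:   URL of the post being mentioned
    """
    body = urlencode({"source": source, "target": target}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical URLs, for illustration only.
request = build_webmention_request(
    "https://example.com/webmention",
    "https://blog.example/post",
    "https://example.com/mentioned",
)
```

Passing `request` to `urllib.request.urlopen()` would actually send it. Keeping the build step separate from the send step leaves room to decide, link by link, whether a mention should go out at all.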
The second issue is one of handling incoming webmentions. I get the incoming request, I make a bunch of checks, then I have to actually fetch a web page and therein lies the hook—HTTP is rather involved these days, what with three separate and largely incompatible versions and TLS to contend with and … well, I have some stuff I need to update before I can do all that. So for now, my webmention endpoint will just accept requests and email me the information. Maybe seeing just how much this is used in the wild will inform me of how much work would be involved. I don't know.
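The "bunch of checks" on an incoming request can be sketched without touching the network at all. This is a minimal illustration, not the endpoint's real code: the function name, the error strings, and the `MY_HOSTS` set are assumptions, and the later step of fetching the source page to confirm it really links to the target is deliberately missing:

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical set of host names this endpoint accepts mentions for.
MY_HOSTS = frozenset({"blog.example"})


def check_webmention(source: str, target: str, my_hosts=MY_HOSTS) -> Optional[str]:
    """Return None if the request looks acceptable, else a reason for rejecting it."""
    for name, url in (("source", source), ("target", target)):
        try:
            parts = urlsplit(url)
        except ValueError:
            return f"{name} is not an http(s) URL"
        if parts.scheme not in ("http", "https") or not parts.hostname:
            return f"{name} is not an http(s) URL"
    if source == target:
        return "source and target are the same URL"
    if urlsplit(target).hostname not in my_hosts:
        return "target is not on this site"
    return None
```

Only a request that passes all of these would be worth the cost of fetching the source page (or, for now, of sending an email).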
Thursday, November 24, 2022
Check out this check mate on Gobble Gobble Day
My friend Smirk is a Chess fiend. I'm not sure how good he is, but he does like Chess and I've played several odd variations on the game with him back in our college days. As such, I think he would really love this chess variant played on a sphere. It's kind of mind blowing how the game changes and the check mate at the end is incredible to see. It's quite the mental challenge on this day of tryptophan overdosing.
Have a great gobble gobble day everybody!
Friday, November 25, 2022
You can program functionally in any computer language
A few days ago I wrote a comment on The Orange Site that seemed to strike a chord there. The comment was about applying a few principles of functional programming in any language (well, maybe not BASIC from the 70s or 80s, but these versions of BASIC aren't used much these days). There's no need for functional application, functional composition, first class functions, monads, (“Monads! How do they work?”) or even currying. No, I feel like you can get about 85% of the way to functional programming by following three simple principles.
Don't use global variables
This has the biggest effect on programming and it's been a well known principle for a long time, even before functional programming was a thing. Globals make it difficult to reason about code, since some function elsewhere can make an arbitrary change to a variable that can affect code called later on (or “spooky action at a distance” as Uncle Albert used to say).
While no global variables is the desirable goal,
I realize that it's not always possible to achieve and that threading global state through a program might be difficult,
but it does make for an interesting exercise to attempt it.
I recently went through the motions of removing all global variables from mod_blog.
I've always wanted to reduce the number of global variables and in late August I felt it was finally time to do so
(the fact that I was frustrated by the micromanagement style of the Enterprise Agile system that was being forced on us at work had nothing at all to do with the sweeping and rapid changes.
Nothing at all <cough> <cough>).
Yes,
I had to snake a bit of state information throughout the code,
but it wasn't nearly as bad as I thought it would be.
And as a side effect, it does make it easier to test individual functions as there is no more global state to worry about (although I do need to finish talking about unit testing at some point).
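To make "snaking state through the code" concrete, here is a small sketch in Python. It is not taken from mod_blog (which is not written in Python); the `Config` fields and function names are invented for illustration. The configuration is created once and handed to every function that needs it, so nothing reads or writes a module-level variable:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """State that might otherwise live in global variables (hypothetical fields)."""
    title: str
    entries_per_page: int


def page_count(config: Config, total_entries: int) -> int:
    """How many pages are needed to show every entry (ceiling division)."""
    return -(-total_entries // config.entries_per_page)


def page_title(config: Config, page: int) -> str:
    """Title for one page of entries."""
    return f"{config.title} (page {page})"


config = Config(title="My Blog", entries_per_page=7)
pages = [page_title(config, n) for n in range(1, page_count(config, 15) + 1)]
```

Because each function depends only on its arguments, a test can build whatever `Config` it likes and call the function directly, with no setup or teardown of shared state.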
Treat function parameters as immutable
This is the extreme take. A less extreme take is “treat reference parameters as immutable.” If the language you use passes variables by value, then there's less need to keep the parameters immutable as they won't change data from the function caller's perspective. But reference variables? Or variables passed by pointer in C/C++? Those you want to treat as constant data as much as possible. This again, makes it easier to reason about a function as it will only work on the values given to it, and not change them (that “spooky action at a distance” thing again).
If you can't avoid mutating parameters, at least try to indicate any possible mutation in either the function name or its signature to let another programmer know what to expect.
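A small Python sketch of both halves of this principle, using invented function names. The first function treats its parameter as read-only and returns a new value; the second does mutate its argument, and says so in its name and by returning nothing:

```python
from __future__ import annotations


def with_tag(tags: tuple[str, ...], tag: str) -> tuple[str, ...]:
    """Return a new tuple that includes tag; the caller's tuple is untouched."""
    return tags if tag in tags else tags + (tag,)


def append_tag_in_place(tags: list[str], tag: str) -> None:
    """Mutates the caller's list; the name and the None return advertise it."""
    if tag not in tags:
        tags.append(tag)


original = ("apache", "lua")
updated = with_tag(original, "tls")   # original is still ("apache", "lua")

mutable = ["apache"]
append_tag_in_place(mutable, "lua")   # mutable is now ["apache", "lua"]
```

In C or C++ the same intent is usually expressed with `const` on pointer or reference parameters; Python has no equivalent, so immutable types and naming carry the signal instead.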
Separate I/O from processing
Of the three principles, this is probably the easiest to implement. Gather the data, process the data, send the results. And if you can't do so for reasons (like there's too much data to fit into memory) there are a few methods to keep them logically separated, for instance:
- input as much as you can handle, process that batch, send it out, repeat;
- have the processing code call a (possibly configurable) function for input and output (admittedly, this may be easy or hard depending upon the language);
- get more memory.
Even if you don't need the flexibility of accepting different input or output methods, keeping the processing separate makes it easier to test the processing. Lord knows I would have loved it if “Project: Lumbergh” had a single entry point for dealing with the “business logic”—it would have made testing so much easier than it was (but that's no longer my concern).
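The gather/process/send split, with input and output supplied as functions, might look like this in Python. It is only a sketch under simple assumptions: the word-counting task and all of the names are made up, and the data is small enough to handle in one pass:

```python
from typing import Callable, Iterable


def word_counts(lines: Iterable[str]) -> dict[str, int]:
    """Pure processing: no files, sockets, or printing in here."""
    counts: dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def format_report(counts: dict[str, int]) -> list[str]:
    """Pure processing: turn the counts into output rows, sorted by word."""
    return [f"{word}\t{n}" for word, n in sorted(counts.items())]


def main(read_lines: Callable[[], Iterable[str]],
         write_line: Callable[[str], object]) -> None:
    """The only place that touches I/O, via the two functions passed in."""
    for row in format_report(word_counts(read_lines())):
        write_line(row)
```

A real program could call `main(lambda: sys.stdin, print)`, while a test can pass a list of strings and `some_list.append` and never open a file.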
So even if you can't switch to a functional programming language, that doesn't mean you can't apply the principles above and get most of the benefits of functional programming in a non-functional language. And the more that you can apply the principles above, not only will it make it easier to reason about code in a non-functional language, I think it will make learning an actual functional programming language easier. You might also want to read “Writing Video Games in a Functional Style.” This is where I picked up on these principles in the first place (and it's sad that James Hague isn't writing anymore, but perhaps he said all he needed to).
Monday, November 28, 2022
This update took a bit longer than I expected
I once mentioned updating mod_litbook
to run under a later version of Apache.
I wanted to do that because I've been running two instances of Apache—a later version that reverse proxies back to Apache 1.3 which just runs mod_litbook
and nothing else,
just to save me the agony of porting the code at the time.
It only took me twelve years to locate the round tuit on my desk,
but hey,
better late than never.
I did do a mod_lua
version of mod_litbook
first,
based on the version running on my Gemini server.
With that
(twelve years after I first played with mod_lua)
and two hours of time,
I was able to match the output from the original version
(nice!).
But it should be easy to update the actual mod_litbook
source code to the latest version of Apache,
right?
Right?
Yeah,
kind of.
It took two days,
but I got it updated.
I wasted the better part of a few hours trying to follow the instructions in the modules/examples
directory which turns out to be laughably out of date.
Once I found the proper guide things went much more smoothly—mostly function name changes and removing calls to obsolete functions.
I also decided to go ahead and use the Apache Portable Runtime API instead of standard C functions as long as I was updating the code.
Now I just have to install the latest version of Apache on my server.
Just.
Yeah.
Tuesday, November 29, 2022
Adventures in updating
Before I go to the trouble of installing the latest version of Apache,
I want to ensure my updates to mod_litbook
will compile on the latest version of Apache.
I've been developing it using Apache 2.4.38,
a version from 2019
(and because I'm using mod_lua
it's vulnerable to CVE-2021-44790).
So I pull down the latest version
(as of this writing,
the latest stable version is 2.4.54)
and start compiling.
I then got a compilation error about a missing field in a structure definition. Great! I think. Just how much of my system will I have to upgrade? I start investigating and find something odd—said field not only exists, it exists in the source code for Apache! The very codebase I'm compiling. Yet, for some reason, the compiler thinks the field doesn't exist.
At this point, I was reminded of Sherlock Holmes: “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.” So the compiler must be picking up the incorrect header from somewhere. And “somewhere” ended up being the normal location of all system wide headers. The Apache 2.4.38 versions must have been installed such that one could compile Apache modules outside of the Apache source code directories.
It was a matter of identifying all such headers and removing them.
Once I did that,
Apache 2.4.54 compiled cleanly,
along with mod_litbook,
and it's now running fine on my development system.
So that's something else to keep in mind.
Sigh.
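For anyone hitting the same problem, the hunt for stale headers can be partly automated. The sketch below is an assumption about how one might do it, not what was actually done here: it matches on file name only, so it lists candidates to inspect rather than files that are safe to delete, and both directory paths are whatever applies on your system:

```python
from pathlib import Path


def shadowing_headers(source_include: Path, system_include: Path) -> list[Path]:
    """List headers under system_include whose file name also appears
    under source_include (name match only; contents are not compared)."""
    source_names = {p.name for p in source_include.rglob("*.h")}
    return sorted(p for p in system_include.rglob("*.h") if p.name in source_names)
```

A call such as `shadowing_headers(Path("httpd-2.4.54/include"), Path("/usr/local/include"))` (both paths hypothetical) would print the system-wide headers that could be picked up ahead of the ones in the source tree, depending on the compiler's include search order.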
Wednesday, November 30, 2022
Pink bedsheets, for very non-existent values of “pink”
The “pink” bamboo sheets finally arrived at Chez Boca. These are the ones I paid $325 for because we “stained” them accidentally with some bed clothes. Bunny and I are both puzzled over these, because they certainly don't appear pink in any meaningful, or unmeaningful, way. We looked at them in both dim light and exceedingly bright light (thanks to some old-school 8mm movie lights I still have) and they look white to us.
Yes, there were a few spots on the sheets, that, maybe, could be pink? Or grey? A more accurate color may be “definitely not white,” but what color, exactly, is hard to tell.
So, there it is. $325 non-pink pink bedsheets made out of bamboo.
Woot.
Thursday, Debtember 01, 2022
Minimum support for webmentions
I just now realized I've released a version of mod_blog
during the holiday season going back as far as 2016.
With that in mind,
and with the fact that I finally received my first webmention on my blog a couple of days ago,
I have just released the latest version for this Christmas season.
The big change this release is that I now show webmentions per post,
even though I've only so far received one.
Hey, it's a start with the webmentions.
You can also see from the sidebar list I have, that I changed versioning schemes a few years back. I used to use semantic versioning but upon reflection, I felt it wasn't really a fit for applications and instead switched to a monotonic version number. While the code has changed dramatically over the past 23 years (come this Debtember 4th) the data format has not changed one bit. It's still the “one HTML file per entry, using the file system as database” scheme, which has worked quite well for me over the years.
Friday, Debtember 02, 2022
You too, can make objectively the world's best pizza at home
I have a thing for Detroit style pizza from Buddy's. If it wasn't so expensive to ship from Detroit, I would definitely have it more often. So it was with great joy that a few weeks ago Adam Ragusea released a video about Detroit style pizza. I had even more joy when I saw him make it from scratch. It's simple, it's just dough (which you have to make because it's not your standard pizza dough), pepperoni, Wisconsin brick cheese (which looks like it's only available via the Intarwebs if you aren't in Wisconsin), tomato sauce, and several hours (to let the dough proof, and to heat the oven to its highest setting, which technically isn't hot enough, but it will do).
Easy. [Yeah, and if you want to spend the money on the ingredients and a full day to make it, be my guest. I won't be doing it. —Bunny]
The vultures that are private equity firms
Adam Conover's video on private equity firms was interesting, but I would have liked a better explanation of how they make money from bankrupting the firms they buy (aside from the fees they apparently charge for their “services”). I would think that would be rather counter-productive over the longer term.
And yes, I have some experience with private equity firms. When I worked at The Corporation, we were initially bought out by a larger company (but were left alone for years for … um … reasons), then that company was bought out by a private equity firm. It was then that we sold off access to some critical databases we used to a competitor and leased the data back from them, which I'm sure brought in a ton of money for the private equity firm itself directly. Indirectly, it most likely shifted expenses around for tax advantages for the next few years (like shifting capital expenses to operating expenses or something like that—I'm not an accountant though) until our contract with our competitor expired in a few years and it would become Somebody Else's Problem to deal with (I think the hope of the private equity firm was that they would no longer own us by then). We also suffered hiring freezes because we “never had enough money to hire anyone” (odd, that, because we made millions per month from our customer, the Oligarchic Cell Phone Company).
Eventually, we did become Someone Else's Problem when we were sold to a much larger firm (which I don't think had any influence on the large push for Enterprise Agile—that's entirely the fault of the original company that bought out the Corporation), so at the very least, we avoided the “bankruptcy outcome.” But I can't say it was a pleasant experience at the time though.
Sunday, Debtember 04, 2022
Late to the party
I've been blogging for 23 years as of today.
This is also the first day this blog is being served up via https:.
All I had to do was just install the latest version of Apache on my server.
It took several days,
but I got the latest version of Apache compiled and installed on my server.
Yes,
I did it the hard way.
What better way of knowing how things work than doing it the hard way?
I then spent Saturday updating the configuration.
There were a few changes,
like NameVirtualHost
being deprecated,
and having to add “Protocols h2 h2c http/1.1”
and “Require all granted”.
Once that was done and the new server was up and running,
then I dove into the whole “Encrypt All The Things!” rabbit hole
(I know, I know, 2015 called and said I was late to the party).
A recent post of mine made it to The Orange Site and fully half of the comments were about the disturbing lack of faith TLS I had.
Of course.
Fortunately,
Apache has a module to handle certificates from Let's Encrypt
(or other places that support the “certificate update dance” protocol).
Unfortunately,
there are subtleties not mentioned in the documentation.
Like the MDCACertificateFile
directive
(which I need for my setup—don't ask)
not being documented.
Or the fact that if you make any type of mistake
(like using the wrong domain name because you cut-n-paste the configuration from one host into another and forgot to make the domain name change,
or using “SSLEngine on” in the wrong place,
or forgetting to add acme-tls/1
to the Protocols
directive)
everything goes pear shaped and Let's Encrypt will rate limit and … ugh.
I'm just lucky I have a few domains to practice on before enabling it for my main sites.
But I was able to finish in time for the 23rd anniversary of my blog and get that stupid little lock on my site.
You're welcome.
Monday, Debtember 05, 2022
“Until I'm rescued from this Chinese fortune cookie factory, I might as well make the best of it!”
I appear to have this knack for getting odd Chinese fortune cookie fortunes. I've yet to get the “Help! I'm trapped in a Chinese fortune cookie factory!” but this is darned close:
At least this time it wasn't in French.
Wednesday, Debtember 07, 2022
Notes on an overheard conversation about locking the keys in the car
“Finally! I'm home!”
“Yes you are!”
“And you didn't answer your phone.”
“You didn't call!”
“Yes I did.”
“Oh! I see I did receive a call, but it was from a number not on my contact list. You know I don't answer those.”
“I was hoping you'd make an exception.”
“It's hard to make an exception when I don't know who is calling.”
“Sigh. I locked my keys in my car, and I had to walk home from Panera Bread.”
“Oh dear … ”
“Now I have to locate my spare key.”
“Oh, you mean this key.”
“Yes. Could you please drive me to my car now?”
Notes on configuring Apache mod_md
I've been tweaking my Apache configuration for the past two days,
trying to figure out what I need and don't need,
and these are just some notes I've collected on the process.
I'm using mod_md
for managing the secure certificates,
and there isn't much out on the Intarwebs about what a configuration for a website should look like.
I can find plenty of pages that basically regurgitate the Apache documentation for mod_md,
but nothing on how it all goes together.
So here's an annotated version of a configuration for one of my less important sites:
<MDomainSet www.flummux.org>
  MDCertificateAgreement accepted
  MDContactEmail sean@conman.org
  MDMember flummux.org
  MDRequireHttps temporary
</MDomainSet>
The required stuff.
I've found that using MDomainSet
is much cleaner than MDomain
as I have multiple sites that I want to keep separated,
certificate wise.
I'm old-school when it comes to naming,
so I like using the “www” prefix and prefer that to be part of the canonical name for my domains.
I also support the plain domain name,
but only to redirect to the “www” version of the site.
If you are more hipster than I,
then just reverse the domain names.
I won't judge.
Given the push that “Encrypt All The Things!” has had,
especially from Google,
I'm expecting any month now for Google Chrome
(that has,
what?
An 85% usage rate on the Internet?)
to enable the Big Scary Error Messages on non-encrypted web requests,
so I might as well go ahead and start pushing the secure versions of my sites
(sigh—I really hate this bit,
but I think I'm in the minority on this),
thus the MDRequireHttps
setting.
I tried using permanent
on one of my test domains and I screwed myself over when I flubbed the mod_md
configuration—I can't even reach the site from my primary browser as it is now stuck for the next six months trying to reach the secure version which isn't running.
Yes,
I could fix this by cleaning out my cache,
but that's pretty much an “all-or-nothing” option,
and for a domain I almost never use,
I can live with that for now.
I also flubbed the configuration for that domain so bad,
that I have to wait for a month before I try obtaining a certificate again.
Sigh.
<VirtualHost 71.19.142.20:80>
  ServerName flummux.org
  Redirect permanent / http://www.flummux.org/
  Protocols h2 h2c http/1.1 acme-tls/1
</VirtualHost>

<VirtualHost 71.19.142.20:80>
  ServerName www.flummux.org
  Protocols h2 h2c http/1.1 acme-tls/1
</VirtualHost>
Because I'm doing the MDRequireHttps
directive,
I've found that this is all I need for the non-secure settings,
which also means I don't need to duplicate the actual server settings twice,
once for the non-secure version,
and again for the secure version.
The first block is there to redirect http://domain
requests to http://www.domain
requests.
I'm not redirecting directly to https:
here,
as the Apache documentation warns that the certificate renewal might not work.
And because I want the certificate renewal to work,
I added acme-tls/1
to the list of protocols supported.
<VirtualHost 71.19.142.20:443>
  SSLEngine On
  ServerName flummux.org
  Redirect permanent / https://www.flummux.org/
  Protocols h2 h2c http/1.1 acme-tls/1
</VirtualHost>
This is just to redirect https://domain
requests to https://www.domain
requests.
I'm not sure if I really need the acme-tls/1
setting here,
but I'm not taking a chance with the certificate renewal.
It's not clear in the Apache documentation what would happen,
and given how long I have to wait if it messes up,
I'm not willing to test it.
<VirtualHost 71.19.142.20:443>
  SSLEngine on
  ServerName www.flummux.org
  ServerAdmin sean@conman.org
  DocumentRoot /home/spc/web/sites/www.flummux.org/htdocs
  AddHandler server-parsed .shtml
  AddOutputFilter INCLUDES .shtml
  AddOutputFilterByType DEFLATE text/html text/plain text/xml
  Protocols h2 h2c http/1.1 acme-tls/1
  CustomLog /home/spc/web/logs/www.flummux.org combined-deflate
  FileETag MTime Size
  AddDefaultCharset UTF-8
  DirectoryIndex index.cgi
  SetEnv LUA_PATH "/home/spc/web/sites/www.flummux.org/lua/?.lua"
  SetEnv LUA_CPATH "/home/spc/web/sites/www.flummux.org/lib/?.so"
  Header set Content-Security-Policy "style-src 'unsafe-inline'; script-src 'unsafe-inline' 'unsafe-eval' 'self'; default-src 'self';"
  ExpiresActive On
  ExpiresDefault "access plus 1 month"
  ExpiresByType text/html "modification plus 1 week"

  <Directory /home/spc/web/sites/www.flummux.org/htdocs>
    Options All
    AllowOverride None
    Require all granted
  </Directory>

  <Directory /home/spc/web/sites/www.flummux.org/htdocs/errors>
    Options -Indexes
  </Directory>

  ErrorDocument 404 /errors/404.shtml
</VirtualHost>
And we finally get to the configuration for the site itself. Not much to say about this, except that the “Content-Security-Policy” header is annoying to get right, and I'm not sure how much benefit it brings, but hey, this is a test site so I'll have to see how it goes.
So that's pretty much how I'm setting up each site I host. It's pretty straightforward, except for the sheer terror that I've made a typo and will have to wait a month before trying to obtain a secure certificate again. You have been warned.
Thursday, Debtember 08, 2022
Some comments on delimiter-first code
I was reading “Delimiter-first code” (via Lobsters) and I was struck by his first example of comma-first formatting:
-- leading commas
SELECT employee_name
     , company_name
     , salary
     , state_code
     , city
  FROM `employees`
That doesn't look half bad, I thought. It could make for smaller diffs in some cases. For instance, I have this:
fprintf(
        stdout,
        "Status: %d\r\n"
        "X-Error: %s\r\n"
        "Content-type: text/html\r\n"
        "\r\n",
        level,
        errmsg
);
Rework it to use comma-first formatting:
fprintf(
          stdout
        , "Status: %d\r\n"
          "X-Error: %s\r\n"
          "Content-type: text/html\r\n"
          "\r\n"
        , level
        , errmsg
);
I still have to work within the confines of C,
but here it's easier to see that the string literal is one long literal and not four additional parameters,
so that's good.
It's a bit strange looking,
but I could get used to it
(I got used to “char const
” over “const char
” because const
applies to whatever is on its left,
except if it starts the declaration; it makes parsing “char const *const p
” easier for me—this declares p
to be a constant pointer to constant data).
And if I need to add to it:
fprintf(
          stdout
        , "Status: %d\r\n"
          "X-Error: %s\r\n"
          "Content-type: text/html\r\n"
          "X-Foobar: %s\r\n"
          "\r\n"
        , level
        , errmsg
        , foobar
);
the diff is easier to follow—this:
5a6
>           "X-Foobar: %s\r\n"
8a10
>         , foobar
instead of:
5a6
>         "X-Foobar: %s\r\n"
8c9,10
<         errmsg
---
>         errmsg,
>         foobar
But then I came across this bit of code I wrote:
XEvent se =
{
  .xselection =
  {
    .type       = SelectionNotify,
    .serial     = NextRequest(event->xselectionrequest.display),
    .send_event = True,
    .display    = event->xselectionrequest.display,
    .requestor  = event->xselectionrequest.requestor,
    .selection  = event->xselectionrequest.selection,
    .target     = event->xselectionrequest.target,
    .property   = event->xselectionrequest.property,
    .time       = event->xselectionrequest.time,
  }
};
And … um …
XEvent se =
{
  .xselection =
  {
      .type       = SelectionNotify
    , .serial     = NextRequest(event->xselectionrequest.display)
    , .send_event = True
    , .display    = event->xselectionrequest.display
    , .requestor  = event->xselectionrequest.requestor
    , .selection  = event->xselectionrequest.selection
    , .target     = event->xselectionrequest.target
    , .property   = event->xselectionrequest.property
    , .time       = event->xselectionrequest.time
  }
};
Yeah …
C99 has designated initializers and also allows trailing commas when initializing structures, so the need for comma-first formatting doesn't really apply here; comma-first formatting only really applies to function calls. Perhaps languages should allow trailing commas in all contexts? It's something to think about.
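To make the trailing-comma point concrete, here is a minimal sketch. The `struct header` type, the `headers` table, and both helper functions are made up for illustration and do not come from any real code base. It only shows that C99 accepts a comma after the last designated initializer and after the last array element, so adding an entry touches a single line in a diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, only here to show the syntax. */
struct header
{
  char const *name;
  char const *value;
};

/* Hypothetical data. Note the comma after the last element: C99 allows it, */
/* so appending another entry later changes exactly one line.               */
static struct header const headers[] =
{
  { .name = "Status"       , .value = "200"       , },
  { .name = "Content-type" , .value = "text/html" , },
};

/* Number of entries in the table above. */
static size_t header_count(void)
{
  return sizeof(headers) / sizeof(headers[0]);
}

/* Value of the i-th entry; i must be a valid index. */
static char const *header_value(size_t i)
{
  assert(i < header_count());
  return headers[i].value;
}
```

The same trailing comma after the last argument of a function call is a syntax error in C, which is the asymmetry the paragraph above is pointing at.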
The rest of the article is really about marking items in a list with some delimiter, usually a comma that comes after an item (except for the last item). There's one example he brings up: “1 , 2 , 3” vs. “・1 ・2 ・3” and here, I would say maybe “1 2 3” is best? Using spaces instead of a comma could still work in a lot of contexts in C:
/* none of this is valid C code */

rc = cgi_error(blog req HTTP_BADREQ "bad request");

fprintf(
        stdout
        "Status: %s\r\nContent-type: text/html\r\n\r\n"
        status
);

generic_cb("main" stdout callback_init(&cbd blog req));
It only breaks down when we go back to my first example above:
/* still not valid C code */

fprintf(
        stdout
        "Status: %d\r\n"
        "X-Error: %s\r\n"
        "Content-type: text/html\r\n"
        "\r\n"
        level
        errmsg
);
Consecutive string literals are collected together into a single string literal, so such a construct as above could lead to some confusion. But this is just me riffing on using space as a delimiter.
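For reference, this is how the concatenation behaves in valid C today. The sketch below is an assumption-laden example, not code from any of my projects: `format_error` is a hypothetical helper, and it uses `snprintf()` instead of `fprintf()` only so the result can be inspected. The four adjacent literals become one format string, so the call has five arguments, not eight.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build the error response header into buf.           */
/* The four adjacent string literals are merged by the compiler into a      */
/* single format string, which is exactly why a space-delimited argument    */
/* list would be ambiguous here.                                            */
static int format_error(char *buf,size_t len,int level,char const *errmsg)
{
  assert(buf    != NULL);
  assert(errmsg != NULL);

  return snprintf(
             buf
           , len
           , "Status: %d\r\n"
             "X-Error: %s\r\n"
             "Content-type: text/html\r\n"
             "\r\n"
           , level
           , errmsg
  );
}
```

Calling `format_error(buf,sizeof(buf),404,"oops")` with a large enough buffer fills it with the three header lines followed by the blank line.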
The rest of the article does lay out a decent argument for leading delimiters for a lot of contexts, but removing closing brackets I think is too far. It works for Python because of syntactic white space, but it won't work for nearly any other language. It also fails for languages that support variadic functions, so it's probably best to keep both opening and closing brackets (or parentheses or whatever). It also seems the arguments are more for vertical than horizontal formatting.
The article ends with:
Don’t be too surprised if this proposal evokes “hey this looks wrong, just plain wrong” reaction. After all, ideas we enjoy these days: enumeration from zero, using registers in names, structural programming, mandatory formatting, and even python’s approach to defining code blocks with indentation — every single one of them were met with a storm of criticism.
I'll keep that in mind, but even so, not everyone buys into mandatory formatting or significant white space.
Friday, Debtember 09, 2022
Notes on an overheard conversation as the radio was playing “Blue Christmas”
“Oh! You know who that is, right?”
“A very bad Elvis impersonator.”
“No! It's Dean Martin.”
“Really? I didn't know the Mad Magazine artist had a singing career as a bad Elvis impersonator.”
“It was first recorded in 1948 by Doye O'Dell—”
“He too, was probably a bad Elvis impersonator.”
“And the following year by Ernest Tubb, Hugo Winterhalter, and Russ Morgan.”
“All bad Elvis impersonators! All of them!”
“Elvis didn't even record it until 1957!”
“They were just early with their Elvis impersonations.”
“Sigh.”
“When out from the bathroom there arose such a clatter, she sprang from the bed to see what was the matter.”
“Oh ffffffffffffuuuuuuuuddddddddddge!”
Only I didn't say “Fudge.” I said the word, the big one, the queen mother of dirty words, the “F-dash-dash-dash” word. Fortunately, the loud crashing sound masked what I said. It also brought Bunny to the bathroom door.
“Are you alright?” she asked from the other side.
“Yes,” I said, hobbling to the door, trying to keep my balance as I was sopping wet with a plastic garbage bag covering my right foot, “although I did do a number on the garbage pail.” I then opened the door to let Bunny see the resulting carnage.
“What happened?”
“I was trying to get out of the tub and slipped,” I said, pulling the garbage bag off my foot.
“Oh! You're bleeding!”
“Tis a flesh wound,” I said. “I've had worse.”
“Sean, you're lucky you didn't smash your head open. Those bathtubs have been known to kill people.”
I should have made a check list
Yup.
I messed up again,
just as I was afraid of.
Using mod_md
isn't that hard,
it's just that any mistake you make means you just lost a few days,
up to an entire month.
Sigh.
It's a bit late now, but I should have created this check list to help prevent mistakes:
- Figure out primary domain name (aka primary)
- Figure out alias domain name (aka alias)
- Configure MDomainSet
  - <MDomainSet primary>
    - Make sure primary is spelled correctly
  - MDCertificateAgreement accepted
  - MDContactEmail sean@conman.org
  - MDMember alias
    - Make sure alias is spelled correctly
  - MDRequireHttps temporary
  - </MDomainSet>
- Configure VirtualHost alias:80
  - <VirtualHost ip:80>
  - ServerName alias
    - Make sure alias is spelled correctly
  - Redirect permanent / http://primary
    - Make sure primary is spelled correctly
  - Protocols h2 h2c http/1.1 acme-tls/1
  - </VirtualHost>
- Configure VirtualHost primary:80
  - <VirtualHost ip:80>
  - ServerName primary
    - Make sure primary is spelled correctly
  - Protocols h2 h2c http/1.1 acme-tls/1
  - </VirtualHost>
- Configure VirtualHost alias:443
  - <VirtualHost ip:443>
  - SSLEngine on
  - ServerName alias
    - Make sure alias is spelled correctly
  - Redirect permanent / https://primary
    - Make sure primary is spelled correctly
  - Protocols h2 h2c http/1.1 acme-tls/1
  - </VirtualHost>
- Configure VirtualHost primary:443
  - <VirtualHost ip:443>
  - SSLEngine on
  - ServerName primary
    - Make sure primary is spelled correctly
  - Protocols h2 h2c http/1.1 acme-tls/1
  - </VirtualHost>
- Other configuration settings …
My last mistake?
I forgot to add acme-tls/1
to the Protocols
directive.
Aaaaaaah!
It's not that I haven't done check lists before, and they're great at making sure you don't miss a step—I just have to remind myself to do them. But better late than never, as I can use this the next time I have to add a new domain.
Monday, Debtember 12, 2022
How I feel about HTTPS
My recent postings on using HTTPS for my sites reminded one of my readers, White_Rabbit, to send in a link to Discourse on HTTPS. The language may be salty, but it does align with my feelings towards HTTPS—namely, I don't really need it. But as I stated, Google will any day now start with the Big Scary Error Messages on non-secure sites, followed by (possibly—I don't know this for a fact, but a gut feeling) no longer allowing non-secure requests at all. And with Google's Chrome having a ridiculous market share, that's something to be concerned about.
Tuesday, Debtember 13, 2022
I think this toilet is going to be the death of us
It started yesterday when, after flushing the toilet, I noticed water seeping all around the toilet bowl. “This is not good,” said Bunny, as she inspected the growing puddle of water. “Let's cut off the water to this thing, and deal with it tomorrow. Looks like we're going to have to replace the wax ring.”
Cut to—today. Water disconnected, I pulled the toilet off the floor, revealing the horrible remains of a wax ring. Bunny then scraped the remains up, and we replaced the wax ring with a non-wax ring that should last longer. We got the toilet back in place, secured it down, hooked the water up and hey! Looks like no more water.
Until there was.
It appears that it may not have been the wax ring at all, but the seals around the … um … bit (I have no idea what it's called) that regulates the water into the tank. Water is pouring out of the tank at the location the water pipe is connected to the toilet.
I swear, this toilet is cursed!
Notes on an overheard conversation in the bathroom
“I think we're finally done! I think this toilet should last years.”
“Well, the last time we worked on it was in 2018.”
“How do you know?”
Wednesday, Debtember 14, 2022
An annotated example of using LPeg to parse a string to generate LPeg to parse other strings
A message on the Lua email list was asking about the best way to parse MQTT topics, specifically, how to handle the multilevel wildcard character. I answered that LPeg would be good for this, and gave annotated source code to show how it works. I thought I might also post about it for better visibility.
So, here's the code:
local lpeg = require "lpeg"
local Cc   = lpeg.Cc
local Cf   = lpeg.Cf
local P    = lpeg.P
local R    = lpeg.R

local filter do
  local separator = P'/'
  local topic     = R("AZ","az","09")^1 * (#separator + P(-1))
  local single    = P'+' * (#separator + P(-1))
  local multi     = (P"/#" + P'#') * P(-1)
  local csep      = separator / function()  return separator end
  local ctopic    = topic     / function(c) return P(c) end
  local csingle   = single    / function()  return topic^-1 end
  local cmulti    = multi     / function()  return (separator * topic)^0 * P(-1) end

  filter = (P"#" * P(-1)) / function() return (separator^-1 * topic)^0 * P(-1) end
         + Cf(
               (-P"/#" * (ctopic + csingle + csep))^0
               * cmulti^-1
               * Cc(P(-1)),
               function(a,r) return a * r end
             )
         * P(-1)
end
And now the annotations—code fragment first, then annotation.
local lpeg = require "lpeg"
local Cc   = lpeg.Cc
local Cf   = lpeg.Cf
local P    = lpeg.P
local R    = lpeg.R
This loads the LPeg module into Lua.
I also grab the functions I'll be using from the module into locals.
I do this not for speed purposes
(although it will be slightly faster)
but to reduce code clutter—there will be less lpeg.
littered about the code,
and I find that easier to read personally.
It's not required that this be done.
local filter do
  -- ...
end
filter will contain the resulting LPeg expression. I create a new scope since the variables I'll be declaring won't be used outside of the definition for filter and it seems cleaner to me to reduce variable visibility as much as possible. It will also mean that over time (if this is intended for code that runs for a long time) the local variables created in this scope will be reclaimed as garbage. It's just a stylistic choice I do for Lua.
local separator = P'/'
This defines an LPeg expression that matches a literal slash character. The P() function can do a bit more than match literal strings, but we'll be mostly using it for literal string matches, as well as matching the end of the input string.
local topic = R("AZ","az","09")^1 * (#separator + P(-1))
This expression will match a “topic.” I'm using R() to match a range of characters (in this case, letters and digits). The multiplication sign (okay, an asterisk, but it's used to designate multiplication in Lua) here is used as an “AND” clause—a topic is a range of characters “AND” something else. That “something else” is either a separator (and the “#” mark is used to look ahead in the input without consuming it) or (the plus sign is read as “OR”) end of the input string.
local single = P'+' * (#separator + P(-1))
This expression will match a plus sign, which is used to indicate a single topic wildcard character. And again, we're expecting this to be followed by a separator character or the end of the string.
local multi = (P"/#" + P'#') * P(-1)
The “#” character is a multiple topic wildcard character and it must appear at the end of the string. I check for both “/#” and “#” because of a way I process the input later on. I might have found a better way to deal with this, but for a “proof-of-concept” this is good enough for now.
Now we get to the mind bending bit of this—I'm writing LPeg to parse a “topic filter” and generate an LPeg expression that will see if a “topic name” matches the “topic filter.”
local csep    = separator / function()  return separator end
local ctopic  = topic     / function(c) return P(c) end
local csingle = single    / function()  return topic^-1 end
local cmulti  = multi     / function()  return (separator * topic)^0 * P(-1) end
These four expressions all do similar things—they match an existing pattern and pass the matching text to a function which returns an LPeg expression. csep returns an expression that matches the separator; ctopic returns an expression that matches the literal topic just parsed; csingle returns an expression that matches an alphanumeric string that represents a topic; and finally cmulti returns an expression that matches the remaining input.
And the final bit of code:
filter = (P"#" * P(-1)) / function() return (separator^-1 * topic)^0 * P(-1) end
       + Cf(
             (-P"/#" * (ctopic + csingle + csep))^0
             * cmulti^-1
             * Cc(P(-1)),
             function(a,r) return a * r end
           )
       * P(-1)
The first line just matches a single multiple topic character and returns a pattern that will match the input.
If that doesn't match
(remember, “+” is read as an “OR”)
we do a folding capture
(Cf())—the code parses through the “topic filter” and builds an LPeg expression using a folding capture that will parse “topic names” per the filter.
Each piece that does match and return a capture will be “accumulated” into a single expression,
the “folding” being done by the anonymous function passed in.
The -P("/#")
bit there looks ahead in the input to make sure it isn't a multiple topic wildcard character at the end of the string,
and if that is the case,
then it will compile a match for a literal topic, a non-specified topic
(which fulfills the “single wildcard character match”)
or a separator
(but as long as the separator isn't itself followed by a multiple topic wildcard character,
which is why we peek forward into the input).
If we get to a point in the input where we either hit the end of input,
or a multiple topic wildcard character,
we handle that and cap the LPeg expression we're building with checking for end of the input
(all on lines 4–7).
The point of all this is to turn a string like “+/tennis/+/#” into the following LPeg expression:
topic * separator * P"tennis" * separator * topic * (separator * topic)^0 * P(-1)
which can then be used to match “topic names:”
local topics =
{
  "news/tennis",                    -- won't match
  "news/tennis/mcenroe",            -- will match
  "news/football/dolphins",         -- won't match
  "news/baseball/marlins",          -- won't match
  "sports/tennis/williams/ranking", -- will match
}

local the_topic = filter:match("+/tennis/+/#")

for _,topic in ipairs(topics) do
  if the_topic:match(topic) then
    report_on_it(topic)
  end
end
Yes,
there is a learning curve
(okay,
maybe a cliff)
to LPeg.
But once you get used to it,
it is quite powerful and allows you to transform data in ways that you can't with regular expressions.
In fact,
instead of returning the default value
(which is one past the position of the match in the string,
or nil
if it failed to parse)
I could have instead returned an array of topics
(or nil
if it failed to parse)—but I will leave such changes as an exercise to the reader.
There's also a bit about dollar signs further down in the MQTT document I linked to, but again, handling that is left as an exercise for the reader.
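For comparison, the same matching rules can also be written by hand. What follows is a rough sketch in C and not a translation of the LPeg code: `topic_matches` is a hypothetical function, it assumes the filter is already well formed (a “#” only appears as the last level), it does not restrict topic levels to letters and digits the way the LPeg version does, and like the code above it ignores the dollar sign rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: does the MQTT topic name match the topic filter?   */
/* '+' stands for exactly one level, and a trailing '#' stands for all     */
/* remaining levels, including none at all.                                */
static bool topic_matches(char const *filter,char const *name)
{
  assert(filter != NULL);
  assert(name   != NULL);

  while(*filter != '\0')
  {
    if (*filter == '#')
      return true;              /* multilevel wildcard accepts the rest */
    else if (*filter == '+')
    {
      /* single level wildcard: skip everything up to the next separator */
      while((*name != '\0') && (*name != '/'))
        name++;
      filter++;
    }
    else if ((*filter == '/') && (filter[1] == '#') && (*name == '\0'))
      return true;              /* "a/#" also matches plain "a" */
    else if (*filter == *name)
    {
      filter++;                 /* literal character, including '/' */
      name++;
    }
    else
      return false;
  }

  return *name == '\0';         /* filter used up, the name must be too */
}
```

With the filter “+/tennis/+/#” this accepts “news/tennis/mcenroe” and “sports/tennis/williams/ranking” and rejects the other three names from the example above. What it cannot do is hand back a reusable, compiled matcher, which is the part LPeg gives you for free.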
Discussions about this entry
Thursday, Debtember 15, 2022
Notes on an overheard conversation while bringing the garbage can up from the street
“Oh! We got another Christmas card!”
“Cool! Who is it from?”
“It's from XXX.”
“Wait! He mailed it? He actually used a stamp?”
“Yes.”
“He lives across the street!”
“That reminds me, I have to mail him his card.”
“And you're going to use a stamp to mail it to him?”
“Yes!”
“Why not just walk it across the street and put it in his mailbox?”
“Because it's tradition. And isn't it illegal for civilians to put items into a mailbox they don't own?”
“Oy vey.”
Re: Conformance Should Mean Something - fputc, and Freestanding
Well, that’s okay, because I’m not one to just sit on my hands no matter how much silence I’m met with or how much crippling depression is running through my system: I reached out to a few folks who I knew worked on MISRA, met with them, and thankfully they brought it up in their group meeting on my behalf. Even if the Committee doesn’t want to / feel like commenting (and to be perfectly clear, they do not have to comment; it’s not like I wrote a paper and nobody owes me nothin’, Jack, including a response to my e-mail anyhow), at least MISRA could bring some clarity, right? They work with a ton of implementations, especially embedded/freestanding implementations, and so they should be able to give me good feedback. I contacted an implementer I have the utmost of faith in who attends MISRA functions, so they could bring the issue up at a meeting. They sort of hashed it out. People for/against the code snippet above, whether 2 could be returned validly, and whether what TI’s Run-Time Support Library was doing was standards-blessed behavior (ignoring any “Freestanding” weaseling)…
there was divergence on whether or not the snippet was illegal.
It is a little concerning that the body responsible for figuring out the dusty corners of the C standard and guaranteeing portable behavior are not sure if (a) they like what the code snippet implies or (b) which direction of implication they’d like it to go in. But, on the other hand, they are at least united in that some clarity around the subject would be helpful and that we should make it clear what we mean in these functions and in the specification. They’re sort of on top of moving the needle to make sure we are writing high-quality code that can stand the test of time, and “fwrite may not portably do what you want and you need to write a wrapper function before using it every time” needs to be something they should be keen on agreeing on before we can move forward with using basic file abstractions for C. Of course, this is the human-based, common, and shared understanding I was being told about before that would lead us to Nirvana, and what I’m unfortunately finding is that it’s not actually all that bound together in harmony.
Via Hacker News, Conformance Should Mean Something - fputc, and Freestanding | The Pasture
It is a mess. The code from the blog post works on most systems, but most systems these days use 8-bit characters; the article is about systems where a character is defined as 16-bits (allowed by the C Standard) and where an integer is also 16-bits (again, allowed by the C Standard and is the minimum size an integer can be per the C specification). It's rare to have non-8-bit characters on desktop computers these days (or even tablet and smart phones) but it seems it's not quite that rare in the embedded space, where you have DSPs that have weird architectures and a character is most likely the same size as an integer. And that's where the trouble starts.
The main issue is with fputc(). The C Standard states:
The fputc function
Synopsis
#include <stdio.h>
int fputc(int c,FILE *stream);
Description
The fputc function writes the character specified by c (converted to an
unsigned char
) to the output stream pointed to by stream, at the position indicated by the associated file position indicator for the stream (if defined), and advances the indicator appropriately. If the file cannot support positioning requests, or if the stream was opened with append mode, the character is appended to the output stream.
Returns
The fputc function returns the character written. If a write error occurs, the error indicator for the stream is set and fputc returns
EOF
.
If both char
and int
are the same size, then
this function can't work as is. The function assumes that the size
of int
is larger than the size of a char
, thus any
value of a signed or unsigned char
can be converted into an
int
or an EOF
, (a value unrepresentable as a
char
).
If char
and int
are the same size … yikes!
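I don't have one of these DSPs handy, so here is a small model of the problem instead of the real thing. Everything in it is an assumption made for illustration: `MODEL_EOF` and `model_fputc_return` are made-up names, the “16-bit int” is simulated with ordinary arithmetic, and two's complement wraparound is assumed for the out-of-range conversion (which the C Standard leaves implementation-defined).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not real library code: pretend both unsigned char   */
/* and int are 16 bits wide, as on the DSPs the article describes.         */
#define MODEL_EOF (-1)

/* What a 16-bit two's complement int would hold when fputc() returns the  */
/* "character written" for the 16-bit unsigned char value c.               */
static int model_fputc_return(uint16_t c)
{
  int32_t v = c;

  if (v > INT16_MAX)
    v -= 65536;   /* wrap the way a 16-bit two's complement int would */

  assert((v >= INT16_MIN) && (v <= INT16_MAX));
  return (int)v;
}
```

In this model, successfully writing the character 0xFFFF produces the same return value as `MODEL_EOF`, so a caller checking the result against the end-of-file value can't tell a good write from a failed one.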
And from reading the blog post, it seems that most embedded systems will
clamp down on the values written by fputc() to be between 0 and
255, regardless of what you pass in, even when characters can be 16
bits in size. This is probably to remain interoperable with the rest of the
world where char
is 8-bits in size (Unicode
notwithstanding).
I'm also not sure about this bit from the blog post about fwrite(): “Okay, so it will loop and call through fputc. This is covered under the as-if wording, so it’s not like your standard library has to write exactly a loop of fputc.” I checked the standard, and it always mentions “as if” explicitly, like “this International Standard treats such an end-of-line indicator as if it were a single new-line character” (emphasis added) or “The implementation shall behave as if no library function calls the setlocale function.” (again, emphasis added). But nowhere is it mentioned in relation to fwrite().
Here's the C89 Standard on fwrite():
The fwrite function
Synopsis
#include <stdio.h>
size_t fwrite(const void * ptr,size_t size, size_t nmemb, FILE * stream);
Description
The fwrite function writes, from the array pointed to by ptr, up to nmemb elements whose size is specified by size, to the stream pointed to by stream. The file position indicator for the stream (if defined) is advanced by the number of characters successfully written. If an error occurs, the resulting value of the file position indicator for the stream is indeterminate.
Returns
The fwrite function returns the number of elements successfully written, which will be less than nmemb only if a write error is encountered.
It's the C99 standard that added the sentence about calling fputc() (which I highlighted below):
The fwrite function
Synopsis
#include <stdio.h>
size_t fwrite(const void * restrict ptr,size_t size, size_t nmemb, FILE * restrict stream);
Description
The fwrite function writes, from the array pointed to by ptr, up to nmemb elements whose size is specified by size, to the stream pointed to by stream. For each object, size calls are made to the fputc function, taking the values (in order) from an array of
unsigned char
exactly overlaying the object. The file position indicator for the stream (if defined) is advanced by the number of characters successfully written. If an error occurs, the resulting value of the file position indicator for the stream is indeterminate.
Returns
The fwrite function returns the number of elements successfully written, which will be less than nmemb only if a write error is encountered. If size or nmemb is zero, fwrite returns zero and the state of the stream remains unchanged.
And nary an “as-if” in sight.
I have to wonder why that sentence was added to C99, if not to force calls to fputc(). I suppose the C Standards Committee had a reason for it, and I don't think they would have omitted the “as if.” If they did, they failed to add it to the C11 and the proposed C2x standards. So I'm not sure if an implementation of fwrite() can avoid calling fputc().
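Read literally, the C99 wording describes something like the following. This is only my sketch of that reading, under the assumption that the sentence means what it says; `fwrite_via_fputc` is a made-up name and no real C library is claimed to be written this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function: fwrite() done exactly as the C99 text describes, */
/* with size calls to fputc() per element, taking the values in order      */
/* from an array of unsigned char overlaying the object.                   */
static size_t fwrite_via_fputc(
        void const *restrict ptr,
        size_t               size,
        size_t               nmemb,
        FILE       *restrict stream
)
{
  unsigned char const *p = ptr;

  assert(ptr    != NULL);
  assert(stream != NULL);

  if ((size == 0) || (nmemb == 0))
    return 0;                   /* the stream is left unchanged */

  for (size_t n = 0 ; n < nmemb ; n++)
    for (size_t i = 0 ; i < size ; i++)
      if (fputc(p[n * size + i],stream) == EOF)
        return n;               /* only fully written elements count */

  return nmemb;
}
```

On a system where a character and an integer are the same size, every problem with fputc() described above is inherited by this loop, which is presumably why the blog post cares whether an implementation is allowed to do something else.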
And unrelated to this post, I did come across this lovely footnote in the C99 standard:
Setting the file position indicator to end-of-file, as with
fseek(file, 0, SEEK_END)
, has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.
Seriously?
It's not even “implementation defined?” Because that sounds like an implementation detail (for example, on CP/M). But undefined? Come on!
Worse, it's not even listed in “Appendix J.2 Undefined behavior.”
Friday, Debtember 16, 2022
You know what would be refreshing? Vintage ads for Saturnalia
I like vintage Coca-Cola ads, and a lot of our modern view of Santa Claus comes from said vintage Coca-Cola ads, but I'm not sure what I make of this holiday display from the neighborhood:
I do think Coke has made bank on such advertising, because not only is someone in my neighborhood hawking Coke from a nearly century-old ad, but now I'm shilling for Coke by displaying my neighbor's display of a nearly century-old ad. Nice play, Coke.
Monday, Debtember 19, 2022
Santa Claus, Coca-Cola, Sprite, and vast amounts of cookies
I was wrong when I mentioned that Coca-Cola created our modern image of Santa Claus— it goes back into the mid-to-late 1800s. But Coca-Cola's Santa Claus advertising might have been the inspiration for Sprite.
And speaking of Santa and food products, here's a video that answers the question no one bothered to ask, how many cookies does Santa Claus consume on Christmas? It's amazing he even survives the trip.
Wednesday, Debtember 21, 2022
Unit test this
Or, “What is a ‘unit test,’ part II”
I saw a decent answer to my question which makes sense for C. Another decent (if a bit vague) answer was:
So to answer Sean's question, a unit test is that which requires the least amount of work to setup but is able to reduce the need for coverage of the larger system. Whatever you want to consider a "unit" is up to you and the language you're using.
I left off my previous entry pointing to a function that I would love to have seen someone else “unit test,” but alas, no one did. But I always had plans on going all “The Martian” on the code and “unit test the XXXX out of it.”
So here's the code in question:
/***********************************************
*
* Copyright 2021 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: sean@conman.org
*
************************************************/

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <assert.h>

#include <sys/types.h>
#include <sys/wait.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <syslog.h>
#include <sysexits.h>

/************************************************************************/

bool run_hook(char const *tag,char const *argv[])
{
  assert(tag     != NULL);
  assert(argv    != NULL);
  assert(argv[0] != NULL);

  pid_t child = fork();

  if (child == -1)
  {
    syslog(LOG_ERR,"%s='%s' fork()='%s'",tag,argv[0],strerror(errno));
    return false;
  }
  else if (child == 0)
  {
    extern char **environ;

    int devnull = open("/dev/null",O_RDWR);
    if (devnull == -1)
      _Exit(EX_UNAVAILABLE);
    if (dup2(devnull,STDIN_FILENO) == -1)
      _Exit(EX_OSERR);
    if (dup2(devnull,STDOUT_FILENO) == -1)
      _Exit(EX_OSERR);
    if (dup2(devnull,STDERR_FILENO) == -1)
      _Exit(EX_OSERR);
    for (int fh = STDERR_FILENO + 1 ; fh <= devnull ; fh++)
      if (close(fh) == -1)
        _Exit(EX_OSERR);
    execve((char *)argv[0],(char **)argv,environ);
    _Exit(EX_UNAVAILABLE);
  }
  else
  {
    int status;

    if (waitpid(child,&status,0) != child)
    {
      syslog(LOG_ERR,"%s='%s' waitpid()='%s'",tag,argv[0],strerror(errno));
      return false;
    }

    if (WIFEXITED(status))
    {
      if (WEXITSTATUS(status) != 0)
      {
        syslog(LOG_ERR,"%s='%s' status=%d",tag,argv[0],WEXITSTATUS(status));
        return false;
      }
    }
    else
    {
      syslog(LOG_ERR,"%s='%s' terminated='%s'",tag,argv[0],strsignal(WTERMSIG(status)));
      return false;
    }
  }

  return true;
}

/************************************************************************/
As you can see, it's one function, in one file, with the only dependencies
being the operating system. So this should be the “perfect unit” to write
some “unit tests” for. The code does replicate a bit of the standard C
function system(), so why not use system() in the first place? The answer
comes from the manual page for Linux:
Do not use system() from a privileged program (a set-user-ID or set-group-ID program, or a program with capabilities) because strange values for some environment variables might be used to subvert system integrity. For example, PATH could be manipulated so that an arbitrary program is executed with privilege. Use the exec(3) family of functions instead, but not execlp(3) or execvp(3) (which also use the PATH environment variable to search for an executable).
This function runs as part of a set-user-ID program (mod_blog in particular,
for reasons beyond the scope of this entry) so no system() for me. Also, I
avoid having to construct a command string that might have failed to
properly escape the filename to avoid complications with the shell's use of
certain characters. And it's not like the function was hard for me to
write. I've done functions like this before, and it worked the first time
without issue when I wrote it (and the small changes to it since have been a
simplification of the parameters, and changes to the logging messages). It's
also not a very long function (I'm sorry Ron Jeffries, but 14 lines of code
isn't “a lot of code”).
The reason I wanted some unit test proponents to look at this code is that it involves quite a bit of interaction with the operating system in C, a not-very-popular programming language these days, and I was curious as to the level of “unit testing” that would be done. No bites, but my gut feeling is that a “unit test proponent” faced with this code would just throw two scripts at it, one to return successfully:
int main(void) { return 0; }
and one to return failure:
int main(void) { return 1; }
and call it “battle tested.” The two test cases themselves are pretty easy to write:
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>

#include "tap14.h"

extern bool run_hook (char const *,char const **);

int main(void)
{
  tap_plan(2,NULL);
  tap_assert( run_hook("script",(char const *[]){ "./success" , NULL }),"success script");
  tap_assert(!run_hook("script",(char const *[]){ "./failure" , NULL }),"failure script");
  return tap_done();
}
(I'm using my own testing framework based on TAP. I wrote my own to be as minimal as possible to get the job done—other TAP frameworks for C I looked at were too overblown for my tastes.)
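For anyone who hasn't run into TAP before, the protocol itself is tiny: a version line, a plan line, then one “ok” or “not ok” line per test. Here is a minimal sketch of what a framework that small might boil down to. The names mini_plan(), mini_assert() and mini_done() are made up for illustration; this is not the API of tap14.h, just an assumption about the least amount of code that would produce TAP 14 output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names; this is NOT the tap14.h used in this entry. */
int mini_tests;   /* number of test points reported so far */
int mini_failed;  /* number of those that failed           */

/* Emit the TAP 14 version line and the plan line ("1..N"). */
void mini_plan(int count)
{
  assert(count >= 0);
  mini_tests  = 0;
  mini_failed = 0;
  printf("TAP version 14\n1..%d\n",count);
}

/* Emit one test point and hand the result back to the caller. */
bool mini_assert(bool result,char const *description)
{
  assert(description != NULL);
  mini_tests++;
  if (!result)
    mini_failed++;
  printf("%s %d - %s\n",result ? "ok" : "not ok",mini_tests,description);
  return result;
}

/* Value for main() to return: success only if nothing failed. */
int mini_done(void)
{
  return mini_failed == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A test program would call mini_plan() once, mini_assert() once per check, and finish with return mini_done();.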
An improvement would be to add a test script that terminates via a signal. It's again easy enough to write that script:
#include <signal.h>

int main(void)
{
  raise(SIGINT);
  return 1;
}
and the appropriate test:
tap_assert(!run_hook("script",(char const *[]){ "./terminate" , NULL }),"terminate script");
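The reason the signal case is worth its own test is that waitpid() reports it through a different set of macros than a normal exit. A small sketch of that distinction, assuming a hypothetical helper named child_fate() (not part of run_hook() or the test suite) that forks a child and reports how it ended:

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that either exits with `code` (when
 * sig == 0) or raises `sig`.  Returns the exit code (>= 0) for a normal
 * exit, -sig when the child was killed by a signal, and -1000 on error.
 */
int child_fate(int code,int sig)
{
  assert(sig >= 0);

  pid_t child = fork();
  if (child == -1)
    return -1000;

  if (child == 0)
  {
    if (sig != 0)
    {
      signal(sig,SIG_DFL);  /* make sure the default action applies */
      raise(sig);
    }
    _Exit(code);            /* reached when sig == 0 or the signal didn't kill us */
  }

  int status;
  if (waitpid(child,&status,0) != child)
    return -1000;
  if (WIFEXITED(status))
    return WEXITSTATUS(status);   /* normal exit: 0 to 255 */
  if (WIFSIGNALED(status))
    return -WTERMSIG(status);     /* killed by a signal    */
  return -1000;
}
```

With this, child_fate(1,0) should report an exit status of 1, while child_fate(1,SIGINT) should report -SIGINT even though the program “returns 1” on paper, which is the branch the ./terminate script is meant to reach.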
But that only tests half the code. How do I know I didn't mess up the codepath in the child process before I execute the script in question? At The Enterprise, it was expected our tests cover about 70% of the code at least. I'm short of that target here. And as I say, I'm aiming to “unit test the XXXX out of this” and get 100% code coverage, because shouldn't that be the goal of “unit testing?”
But to achieve that target, I'm going to have to deal with “failing” a bunch of existing functions, and according to my interpretation of “A Set of Unit Testing Rules,” if I'm not mocking, I don't have a “unit test.” So I have to mock some system calls.
And here is where I hit a problem: to do so will invoke the dreaded
“undefined behavior of C.” Seriously, if I provide my own function for, say,
dup2(), I am technically invoking undefined behavior of the C machine (this
incredibly long flame war on the Lua mailing list, of all places, goes into
the gory details behind that statement). Now granted, certain tool chains on
certain operating systems allow one to override functions, but you can't
rely upon this behavior in general. Given that I'm doing all of this on
Linux, and Linux in general allows this, I can proceed (carefully) with
mocking system functions.
That should be straightforward enough. The mocked open()
function:
static int err_open;
static int ret_open;

int open(char const *pathname,int flags)
{
  (void)pathname;
  (void)flags;

  if (err_open != 0)
    errno = err_open;
  return ret_open; // XXX had bug here
}
This should be fine for my purposes as I don't actually need to read from the file. If I really needed to call into the original function, this might work:
static int err_open;

int myopen(char const *pathname,int flags)
{
  if (err_open == 0)
    return open(pathname,flags,0);
  errno = err_open;
  return -1;
}

#define open myopen
But as the “A Set of Unit Testing Rules” article states, “A test is not a
unit test if: it touches the file system.” So the above isn't a “true mock,”
and I shall continue with my “true mocked” function instead. I can continue
with similar implementations for the functions dup2(), close() and
waitpid(). Unfortunately, there are three functions that may present some
interesting challenges: fork(), execve(), and _Exit(). The first returns
twice (kind of, if you squint and look sideways), the second only returns if
there's an error, and the third never returns.
Now looking over the implementation of the function I'm testing, and
thinking about things, I could do a similar implementation for fork(). The
returning twice thing is where it returns once in the parent process, and
once in the child process, but I can treat that as just a normal return, at
least for purposes of testing. For execve(), I can only test the error path
here as the script being “run” isn't being run. That just leaves _Exit() as
the final function to mock. And for that one, I wrap the entire call to
run_hook() (the function being “unit tested”) around setjmp() and longjmp()
to simulate the not-returning aspect of _Exit(). So a test of the close()
codepath would look like:
static bool X(char const *tag,char const *argv[])
{
  volatile int rc = setjmp(buf_exit);
  if (rc != 0) return false;
  return run_hook(tag,argv);
}

int main(void)
{
  /* ... */

  ret_open  = 4;
  err_dup2  = 0;
  ret_dup2  = 0;
  bad_dup2  = -1;
  err_close = EIO;
  ret_close = -1;
  tap_assert(!X("script",(char const *[]){ "./success" , NULL }),"close() fail");

  /* ... */

  return tap_done();
}
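The other half of that trick is the replacement for _Exit() itself, which has to jump back to the setjmp() instead of returning. The following is a sketch of the idea only. To keep it runnable without overriding the real _Exit(), the stand-in is named fake_exit(), and child_step(), guarded() and exit_status are likewise hypothetical names for illustration; only buf_exit is taken from the test above.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

jmp_buf buf_exit;     /* same name as in the test above                  */
int     exit_status;  /* hypothetical: last status given to fake_exit()  */

/* Stand-in for _Exit(): record the status, then unwind to the setjmp(). */
void fake_exit(int status)
{
  assert(status >= 0);
  exit_status = status;
  longjmp(buf_exit,1);          /* never returns to the caller */
}

/* Hypothetical code under test: bails out the way the child path does. */
bool child_step(bool fail)
{
  if (fail)
    fake_exit(71);              /* 71 is EX_OSERR, as used in run_hook() */
  return true;
}

/* Wrapper in the style of X(): false means the code "exited". */
bool guarded(bool fail)
{
  if (setjmp(buf_exit) != 0)
    return false;               /* arrived here via longjmp() */
  return child_step(fail);
}
```

The longjmp() is only valid because guarded() is still on the call stack when fake_exit() runs, which is why the whole call to the function under test has to sit inside the wrapper.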
I got all the test cases written up and all 11 tests pass:
TAP version 14
1..11
ok 1 - success script
ok 2 - failure script
ok 3 - terminate script
ok 4 - fork() fail
ok 5 - open() fail
ok 6 - dup2(stdin) fail
ok 7 - dup2(stdout) fail
ok 8 - dup2(stderr) fail
ok 9 - close() fail
ok 10 - execve() fail
ok 11 - waitpid() fail
A successful “unit test” with 100% code coverage. But I'm not happy with this. First off, I don't get the actual logging information for each test case. All I get is:
Dec 21 19:34:10 user err /dev/log test_run_hook script='./failure' status=1
Dec 21 19:34:10 user err /dev/log test_run_hook script='./terminate' terminated='Interrupt'
Dec 21 19:34:10 user err /dev/log test_run_hook script='./success' fork()='Cannot allocate memory'
Dec 21 19:34:10 user err /dev/log test_run_hook script='./success' waitpid()='No child processes'
and not
Dec 21 19:04:10 user err /dev/log test_run_hook script='./failure' status=1
Dec 21 19:04:10 user err /dev/log test_run_hook script='./terminate' terminated='Interrupt'
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' fork()='Cannot allocate memory'
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' status=69
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' status=71
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' status=71
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' status=71
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' status=71
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' status=69
Dec 21 19:04:10 user err /dev/log test_run_hook script='./success' waitpid()='No child processes'
(And no! I am not checking that
syslog()
got the right message in the test cases—been there,
done that and all I got was a stupid tee-shirt and emotional scars. It's easy
enough to just manually check after the test runs, at least for this
entry.)
It just doesn't feel right to me that I'm testing in a faked
environment. No, to get a better “unit test” I'm afraid I'm going to have to
keep invoking undefined C behavior that is allowed by Linux, and interpose
our functions by using LD_PRELOAD
to override the functions. And
I can set things up so that I can still call the original function when I
want it to succeed. So all that needs to be done is write a shared object
file with my versions of the functions, and include this function:
#define _GNU_SOURCE   /* needed for RTLD_NEXT with glibc */
#include <dlfcn.h>
#include <sys/types.h>

static pid_t (*___fork)   (void);
static int   (*___open)   (char const *,int,mode_t);
static int   (*___dup2)   (int,int);
static int   (*___close)  (int);
static int   (*___execve) (char const *,char *const [],char *const []);
static pid_t (*___waitpid)(pid_t,int *,int);

/* runs when the shared object is loaded; looks up the real functions */
__attribute__((constructor))
void init(void)
{
  ___fork    = dlsym(RTLD_NEXT,"fork");
  ___open    = dlsym(RTLD_NEXT,"open");
  ___dup2    = dlsym(RTLD_NEXT,"dup2");
  ___close   = dlsym(RTLD_NEXT,"close");
  ___execve  = dlsym(RTLD_NEXT,"execve");
  ___waitpid = dlsym(RTLD_NEXT,"waitpid");
}
(I include all three parameters to open()
even though the
last one is optional—I don't want to have to deal with the variable argument
machinery with C—this should work “just fine on my machine”—I'm already into
territory that C formally forbids. I'm using triple leading underscores
because single and double leading underscores are reserved to the C compiler
and implementation, but nothing is mentioned about three leading
underscores.)
Now, how to get information to my replacement functions about when to
fail. I thought about it, and while there is a way to do it with global
variables, it gets complicated and I'd rather do this as simply as possible.
I figured I could sneak variables through to my replacement functions via
putenv(), getenv() and unsetenv().
This will make the close()
failed test case look like:
putenv((char *)"SPC_CLOSE_FAIL=5"); /* EIO */
tap_assert(!run_hook("script",(char const *[]){ "./success" , NULL }),"close() fail");
unsetenv("SPC_CLOSE_FAIL");
And the corresponding close()
function is:
int close(int fd)
{
  char *fail = getenv("SPC_CLOSE_FAIL");
  if (fail == NULL)
    return (*___close)(fd);
  errno = (int)strtoul(fail,NULL,10);
  return -1;
}
The other functions work similarly, and when run:
TAP version 14
1..11
ok 1 - success script
ok 2 - failure script
ok 3 - terminate script
ok 4 - fork() fail
ok 5 - open() fail
ok 6 - dup2(stdin) fail
ok 7 - dup2(stdout) fail
ok 8 - dup2(stderr) fail
ok 9 - close() fail
ok 10 - execve() fail
ok 11 - waitpid() fail
More importantly, since the functions can actually function as intended when I don't want them to fail, I get the full output I expect in the system logs. But per the “A Set of Unit Testing Rules” article, this is no longer a “proper unit test.”
I don't know. The more I try to understand “unit testing,” the less sense it makes to me. There is no real consensus as to what a “unit” is, and it seems strange (or in my more honest opinion, stupid) that we as programmers are not trusted to write code without tests, yet we're trusted to write a ton of code untested as long as such code is testing code. As I kept trying to impart to my former manager at The Enterprise before I left, the test case results aren't to be trusted as gospel (and they always were by him) because I didn't fully understand what the code was supposed to do (because the business logic in “Project Lumbergh” has become a literal mess of contradictory logic and communication among the team seriously broke down).
So maybe we're not supposed to “unit test” functions that involve input, output, or system interactions. Maybe we're supposed to “unit test” more “pure functions” and leave messy real world details to, oh, I don't know, some other form of testing. Okay, I have one final function that should be perfect for “unit testing.”
We shall see …
Thursday, Debtember 22, 2022
It's kind of sad to think that the cheapest gifts are the milk maids
It wasn't until I read this article from the Transylvania Times that I thought about the price of all the gifts from “The Twelve Days of Christmas.” It was also the first time I learned about The Christmas Price Index, where all this is tracked every year. This year's index, if you were to buy all the gifts mentioned in the song, comes to a staggering $197,071.09. And for all that, the 40 maids a-milking will only cost you $290. Not mentioned are the cleanup costs of all the gifts.
Friday, Debtember 23, 2022
Notes on an overheard conversation as the radio was playing “Winter Wonderland”
“Oh, that's Wayne Newton!”
“Wayne Newton?”
“Yes.”
“Are you sure that's not Eartha Kitt?”
“Yes dear, Eartha Kitt has a much lower singing register.”
Extreme tiny house, Asheville edition
“That is not a tiny house,” said Bunny.
“But it is, it's only 480 square feet.” [45 square meters —Editor]
“It feels big.”
“It does, and the design is wonderful.”
We were talking about this $80,000 home in the Asheville, North Carolina area. While it's technically a tiny house, it manages to feel big (living room, kitchen, bathroom, bedroom and recording studio), while being one of the more beautiful examples of a home I've seen (although we were not fans of the alternating tread staircases, we do understand why they were used). You would never guess it was made from mostly recycled and unused materials. It's just gorgeous.
Some alternative do-it-yourself keyboards
I'm always fascinated by alternative keyboards, especially when they're hand made. Matthew Dockrey has made two of them. The first is based on old print technology, the two-thirds keyboard, which involved creating his own keycaps. And then there is his pocket typewriter, which is exactly what it is—a manual typewriter that fits in your pocket. It's mad stuff, but it's fantastic at the same time.
“Outdoors is currently not heated”
From my friend Tom, who posted this on MeLinkedMyInstaFaceSpaceBookWeGramIn, a TV sportscaster forced to report on the weather. He makes his opinion on the weather (in a live report) loud and clear. I can only hope he keeps his job.
Monday, Debtember 26, 2022
It's not a “security hole,” it's a “privacy hole” and I don't think it's anything to worry about
I found a reference to the following in my notes from May this year—I suppose better late than never. Anyway …
The Potential Security Hole
…
Imagine a scenario where Big Tech does a massive marketing campaign in an attempt to mainstream the protocol. As part of their marketing, they could try to sell the idea of a Big Proprietary browser, or even add Gemini support directly into their existing web browser. Then they start a disinformation campaign to demonize the wide range of existing clients. Normies, naturally, would buy that without question, as they do. At that point, Big Tech could simply have their browser automatically generate a client certificate for every user and attach it to every request.
Couple this with some server side analytics aggregators, and we have the same privacy problems on Gemini that the web has.
Security Hole in Gemini Protocol?
I feel this is more of a “privacy hole” than a “security hole” but that could be me being pedantic. Honestly, I don't feel like this is anything that needs to be worried about. Gemini is much too small to worry about. I suppose a Gemini server could generate client certificates and a compliant Gemini client could accept them for later use to reference a Gemini site, but that's not how client certificates are specified as working: it's the client that generates the certificate and the server can accept or reject it (odd, I know, and not how I would envision them working).
But it's not like there aren't other ways of tracking a user in Gemini. A Gemini server could conceivably generate unique links for a given client from a given IP address. It's not perfect, and it really only kind of tracks a single user. And let's not forget just logging every request and <gasp!> not anonymizing IP addresses! Oh the horror! But such “tracking” is limited to one server. It seems silly to think that such tracking could be done Internet wide, especially given that automatically displaying images is considered scandalous in the Gemini community.
Notice that all of these codes are described in way that implies that the server is already expecting a client certificate for that request. What if there is a certificate attached when not expected? Unless I have missed or misinterpreted something, the spec does not account for this.
61 comes close, but that implies that a cert was indeed expected, it's just the wrong one.
Proposed Solution
Add 4th certificate status code, let's call it 63, to be returned in this scenario. It would not stop malicious or corporate servers from refusing ever to return this code, but it would at least allow users to see which sites are not trying to stalk them, because someone using Flashy Surveillance Browser would be shown this error anytime they visit an indie capsule.
Could this itself be exploited, though? I think so. Proprietary browsers could show a 'security warning' that the capsule they are attempting to access is … insert scary corporate buzzword … and that proceeding would be 'dangerous'. This, of course, would be total horseshit but the normies wouldn't know any better.
Security Hole in Gemini Protocol?
I have two responses to this: One, just do it! Add the check to your Gemini server and return the undocumented response code 63. Yes, it's not part of the standard. Yes, it's extending the protocol (“The horror! The horror!”). But on the gripping hand, it just might help. My own Gemini server serves up a custom error code when it receives an empty request which is expressly not allowed by the specification! I used to serve up a response code of “59 Bad Request” but it never seemed to do anything. I then changed it to return “58 Not a gopher server!” and while it hasn't stopped such requests, they have been slowly going down over the past year or so. So go ahead, just do it! Add the “63 Why are you forcing an unwanted certificate on me?” response.
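To show how little code “just do it” involves, here is a sketch of the decision and of the response header, which in Gemini is a status code, a space, some meta text, and a CRLF. The function names pick_status() and format_header() are hypothetical, the request is assumed to have already had its trailing CRLF stripped, and the policy shown is only my reading of the proposal, not code from any actual server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policy: a status code, or 0 for "handle the request normally". */
int pick_status(char const *request,bool cert_present,bool cert_expected)
{
  assert(request != NULL);
  if (request[0] == '\0')
    return 58;                 /* empty request (non-standard code)       */
  if (cert_present && !cert_expected)
    return 63;                 /* unwanted certificate (proposed, not in spec) */
  return 0;
}

/* Format "<status> <meta>\r\n"; returns false if it does not fit in buf. */
bool format_header(char *buf,size_t size,int status,char const *meta)
{
  assert(buf  != NULL);
  assert(meta != NULL);
  int len = snprintf(buf,size,"%d %s\r\n",status,meta);
  return len >= 0 && (size_t)len < size;
}
```

A server would call pick_status() before doing anything else with the request and, on a non-zero result, send the formatted header and close the connection.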
My second response is: client certificates are dead on the web, what makes you think “proprietary Gemini browsers” will go to this trouble? If anything, I would think a “proprietary Gemini browser” would insist on using a real secure certificate, and not a self-signed one or one using a custom certificate authority, long before it would attempt to force known client certificates on users.
Tuesday, Debtember 27, 2022
And in another timeline, Google sold out to Yahoo for $10,000,000 …
I'm not quite sure what to make of “eπc 2014” (or “Epic 2014”). It's a “what-if” story that diverges from our own timeline in 2004 and goes to some really weird places (Googlezon anyone?). It's a history that never happened, and yet, it still feels like we're just a few years short of it actually happening.