The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, January 01, 2024

Free at last! Free at last! Thank God almighty the mouse is free at last!

Well, at least the 1928 version of Mickey Mouse is free and in the public domain. It's no coincidence that Disney has worked the past few years turning the 1928 version of Mickey into a trademark. Grab the popcorn! It's going to be an interesting year in copyright law.

Unlike the past few years, this year came in pretty quiet. Yes, there were some fireworks off in the distance, but our neighbors? Very quiet this year. Maybe that's a good sign that this year will be less explosive.

I can hope.

HAPPY NEW YEAR!

Monday, January 08, 2024

I wonder why today, of all days, I'm feeling this level of melancholy?

Yesterday at the “every-other-week” D&D game (played via Zoom these days), I couldn't bring up the virtual map the DM uses because my web browser didn't support WebGL, despite my updating the entire operating system, including the web browser, just 10 minutes earlier.

Sigh. What the XXXX happened to being just 20 minutes out of date?

Even worse, it ran the last time we played a month ago.

Amid suggestions from the rest of the group about what to do (even with the suggestion to not use the map at all and just go for “theater of the mind”), I just gave up and bailed out of the game. I just couldn't cope with the fact that despite my attempts at keeping up to date with all this technological XXXX being forced down our throats, it still wasn't XXXXX­XX good enough.

I think this is the culmination of my feelings toward the latest round of AI. I saw this video with Dr. Matt Welsh who joked that at his company, not using ChatGPT should be a firable offence, and I felt offended at the joke. Are we so XXXXX­XX beholden to our tools that we give up mastery of them? Of course, as the owner of a software company, he loves the idea of ChatGPT to write his software—it keeps his costs down. XXXX people who spent time investing in writing software.

As I commented a month ago:

I wrote about this about fifteen years ago, about the fear of becoming beholden to technology without understanding. IT currently doesn’t incentivize deep learning, or even mastery of anything, both because the perceived need to move fast, and the ever changing technology landscape. Who has time to learn anything any more? Why even bother if it’s all going to change anyway? Especially when we have a program that can do all that pesky learning for us?

Edit to add: Maybe this post describes it better?

LLMs make Programming Language Learning Curves Shallower | Lobsters

Why bother indeed?

And to bring this back around to D&D: Deck of DM Things has made a series of videos about using AI to run a game.

Even though he wasn't fully successful with the experiment, as YouTuber Philip DeFranco is fond of saying, “this is the worst that AI will ever be.”

This has been in the back of my mind for some time now. I left The Enterprise partly because of the testing, and a fear I had was not that AI would write the tests for us (which I wouldn't mind) but that I would have to write tests for the code AI wrote. It was bad enough having to endure endless computer updates and reboots when the XXXXX­XX computer mandated it—but to be truly enslaved to the machine?

Who controls who?

And it all came crashing through when I couldn't load a XXXXX­XX map on a XXXXX­XX web page.

I don't like the direction the industry is going, with all the constant changes just for its own sake (yes, I know, changes for security is a real thing, but that's not carte blanche to change how the rest of the software works). And often times, it's the computer in charge. “Oh! Update me now!” “Oh! Reboot me now!” “Oh! Feed me now, Seymour!”

XXXX that XXXX!

Oh great! The network connection is down. Could be Roko's Basilisk giving me a warning, or just a momentary glitch. Who knows?

Sigh.

I feel like I'm yelling at the clouds to get off my lawn as I adjust my onion on my belt. But on a lighter note, ChatGPT in French is “cat, I farted.” That is truly wonderful.

Thursday, January 18, 2024

“Now, here, you see, it takes all the running you can do, to keep in the same place.”

I'm a bit reluctant to write this, as I'll come across as an old man yelling at the clouds to get off his lawn, but the whole “update treadmill” the Computer Industry has foisted on us is getting tiresome.

Bunny now wears a CGM prescribed by her doctor. It's a small disk that adheres to the back of the upper arm and sends readings of blood sugar via Bluetooth. Bunny has an app on her smartphone that records the information and forwards it to her doctor. The app is fine; no real problems with using it.

Until this morning.

Bunny woke me up to inform me that the app just stopped working, because her smartphone hadn't been updated in the previous 20 minutes.

What the XXXX

It was working fine the previous night. What updates were required? And why drop support for older operating systems? Oh yeah … right … it's hard to support systems older than 20 minutes.

And some PM somewhere needed to justify their job.

Sigh.

Things are working fine, now that the operating system on her smartphone was updated (only took several hours). But still … gah!


“… and water is wet! Film at 11!”

Time for more yelling at the clouds.

A few days ago I was surfing on my iPad when I came across An Astronomy Club of Brevard, NC. Now, my iPad is a bit, shall we say, slightly out of date? So it was with sadness when I saw:

Your Browser Is
No Longer Supported

To view this website and enjoy a better online experience,
update your browser for free.

Followed by a list of browsers.

You know, I thought we left that behind in the early 2000s, but apparently not.

I then viewed the site on my desktop computer and … why? Why do I need a less-than-20-minute old browser to view seven images and some text? Why does it take 193 requests to even show the page? At a minimum, you have the HTML, CSS and seven images, so … nine requests? Okay, maybe some JavaScript to do the animations designers are so fond of.

But 193 files?

I'm not blaming the Astronomy Club of Brevard for this. They're using Wix to create and host the website, so I'm laying the blame solely at Wix here for going completely overboard with the JavaScript. Do web developers know you can create a perfectly good site with just HTML and CSS?

Yeah, I'm yelling at the tide to stop coming in.

Sigh.

Friday, January 19, 2024

Complicating code

I recently added an .OPT directive to my 6809 assembler. This allows me to add options to a source file instead of having to always specify them on the command line. I originally did this to support unit testing and I feel it's a nice addition.

When the test backend is enabled, all the memory of the emulated 6809 is marked as non-readable, non-writable, non-executable. As the code is assembled, memory used by instructions is switched to “readable, executable” and memory used by data becomes “readable, writable” (easy, because of an early decision to have separate functions to write instructions vs. data). If you reference memory outside of the addresses used by the source code being assembled, you have to specify the permissions of said addresses.

; The set up for our tests.
; We should only read from these locations

	.opt	test prot r,ECB.beggrp,ECB.beggrp + 1
	.opt	test prot r,$112

; We can read and write to these locations

	.opt	test prot rw,$0E00,$0E00 + 1023

; Set the stack for our tests

	.opt	test stack $FF00

; Initialize some VM memory

	.opt	test memw,ECB.beggrp,$0E00
	.opt	test memb,$112,2

You really only need these for memory locations defined outside the source code. In the example above, the memory referenced is defined by the Color Computer and not by the code being tested, so they need to be initialized. And because the system only supports 65,536 bytes of memory, we can easily (on modern systems) assign permissions per byte.
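The per-byte permission idea can be sketched in a few lines of Python. This is only an illustration, not the assembler's own code: the Memory class, the flag values and the method names are all assumptions made up for the example.

```python
# Sketch only: class, flag values and method names are hypothetical,
# not taken from the assembler's source.
READ, WRITE, EXEC = 1, 2, 4

class Memory:
    def __init__(self):
        self.prot = bytearray(65536)   # one permission byte per address, all 0 = no access
        self.data = bytearray(65536)

    def protect(self, flags, low, high=None):
        """Set the permission flags for addresses low..high inclusive."""
        high = low if high is None else high
        for addr in range(low, high + 1):
            self.prot[addr] = flags

    def read(self, addr):
        if not self.prot[addr] & READ:
            raise PermissionError(f"reading from non-readable memory: addr={addr:04X}")
        return self.data[addr]

    def write(self, addr, value):
        if not self.prot[addr] & WRITE:
            raise PermissionError(f"writing to non-writable memory: addr={addr:04X}")
        self.data[addr] = value & 0xFF

# Mirrors two of the .opt lines above: $112 read-only, $0E00..$11FF read-write.
mem = Memory()
mem.protect(READ, 0x0112)
mem.protect(READ | WRITE, 0x0E00, 0x0E00 + 1023)
```

With only 65,536 addresses, the whole permission map is a 64K byte array, which is why checking every access is cheap on a modern machine.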

So this all works and is great at finding bugs.

But then I thought—I can use .OPT to suppress warnings as well. If I assemble the following code:

;**************************************************************************
;	frame_buffer		set frame buffer address	(GPL3+)
;Entry:	A - MSB of frame buffer
;Exit:	D - trashed
;**************************************************************************

frame_buffer	ldb	PIA0BC		; wait for vert. blank
		bpl	frame_buffer
.now		stx	,--s		; save X
		ldx	#SAM.F6		; point to framebuffer address bits
.setaddress	clrb			; reset B
		lsla			; get next address bit
		rolb			; isolate it
		stb	b,x		; and set the SAM F bit
		leax	-2,x		; point to next F bit register
		cmpx	#SAM.F0		; more?
		bhs	.setaddress	; more ...
		puls	x,pc		; return

I get

frame_buffer.asm:9: warning: W0002: symbol 'frame_buffer.now' defined but not used

The subroutine has effectively two entry points—frame_buffer will wait until the next vertical blanking interrupt before setting the address, while frame_buffer.now will set it immediately. The former is good if you are using a double-buffer system for graphics, while the latter is fine for just switching the address once. Given that my assembler will warn on unused labels, this means I'll always get this warning when including this code. I can suppress that warning by issuing a -nW0002 on the command line, but then this will miss other unused labels that might indicate an actual issue.

I wanted to have something like this:

	.opt	* disable W0002
frame_buffer	...
.now		...

		puls	x,pc
	.opt	* enable W0002

We first disable the warning for the code fragment, then afterwards we enable it again. I coded this all up, but it never worked. It worked for other warnings, but not this particular one.

The assembler is a classic two-pass assembler, but not all warnings are issued during the passes, as can be seen here (using a file that generates every possible warning with debug output enabled):

[spc]lucy:~/source/asm/a09/misc>../a09 -ftest -d -o/dev/null warn.asm
warn.asm: debug: Pass 1
warn.asm:2: warning: W0010: missing initial label
warn.asm:10: warning: W0008: ext/tfr mixed sized registers
warn.asm:11: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:16: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:21: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:23: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:36: warning: W0013: label 'a' could be mistaken for register in index
warn.asm:37: warning: W0013: label 'b' could be mistaken for register in index
warn.asm:38: warning: W0013: label 'd' could be mistaken for register in index
warn.asm: debug: Pass 2
warn.asm:2: warning: W0003: 16-bit value truncated to 5 bits
warn.asm:3: warning: W0004: 16-bit value truncated to 8 bits
warn.asm:4: warning: W0005: address could be 8-bits, maybe use '<'?
warn.asm:5: warning: W0006: offset could be 5-bits, maybe use '<<'?
warn.asm:8: warning: W0005: address could be 8-bits, maybe use '<'?
warn.asm:9: warning: W0007: offset could be 8-bits, maybe use '<'?
warn.asm:11: warning: W0009: offset could be 8-bits, maybe use short branch?
warn.asm:13: warning: W0011: 5-bit offset upped to 8 bits for indirect mode
warn.asm:25: warning: W0012: branch to next location, maybe remove?
warn.asm:26: warning: W0012: branch to next location, maybe remove?
warn.asm:43: warning: W0017: cannot assign the stack address within .TEST directive
warn.asm:42: debug: Running test test
warn.asm:42: warning: W0014: possible self-modifying code
warn.asm:42: warning: W0016: memory write of 00 to 0034
warn.asm:42: warning: W0015: : reading from non-readable memory: PC=0016 addr=F015
warn.asm:42: debug: Post assembly phases
warn.asm:2: warning: W0002: symbol '.start' defined but not used
[spc]lucy:~/source/asm/a09/misc>

You can see some are generated during pass 1, some during pass 2. The message “Running test test” happens after the second pass is done, and the one I'm trying to suppress, W0002, at the very end of the program. The .OPT directives are processed during passes 1 and 2. There's just no easy way to suppress W0002 just for a portion of the code, as I would have to carry forward that any labels defined between the disable and enable of W0002 should be exempt from the “no-label warning” check.

It's issues like these that complicate programs over time.

I was about to scrap the idea when I came up with a solution. Each symbol in the symbol table has a reference count. At the end of assembly, there's code that goes through the symbol table and issues the warning if a label has a reference count of 0. All I did was create another option, .OPT * USES <label>, to increment the reference count of a label. At the end of the day, it works. I'm not saying this is a “good” solution, just “a” solution.
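A minimal sketch of that reference-count trick, in Python rather than the assembler's own source. The SymbolTable class and its method names are hypothetical; only the warning text and the idea of bumping a count come from the description above.

```python
# Sketch only: SymbolTable and its methods are made-up names for illustration.
class SymbolTable:
    def __init__(self):
        self.refs = {}   # label -> reference count
        self.line = {}   # label -> line number of its definition

    def define(self, label, line):
        self.refs.setdefault(label, 0)
        self.line[label] = line

    def reference(self, label):
        self.refs[label] = self.refs.get(label, 0) + 1

    def uses(self, label):
        """What an '.OPT * USES label' would do: count as a reference."""
        self.reference(label)

    def unused_warnings(self, filename):
        """Post-assembly check: warn about defined labels nobody referenced."""
        return [
            f"{filename}:{self.line[label]}: warning: W0002: "
            f"symbol '{label}' defined but not used"
            for label, count in self.refs.items()
            if count == 0 and label in self.line
        ]

table = SymbolTable()
table.define("frame_buffer", 7)
table.reference("frame_buffer")              # bpl frame_buffer
table.define("frame_buffer.now", 9)
table.define("frame_buffer.setaddress", 11)
table.reference("frame_buffer.setaddress")   # bhs .setaddress

before = table.unused_warnings("frame_buffer.asm")
table.uses("frame_buffer.now")
after = table.unused_warnings("frame_buffer.asm")
```

Here `before` holds the single W0002 warning for frame_buffer.now and `after` is empty. The check at the end of assembly never has to know which region of the source a label came from.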

Wednesday, January 24, 2024

Well, that was weird

Bunny's pharmacy just called to remind us to call and remind them to renew Bunny's prescription. Um … okay?


Adventures in high dining

Bunny and I found ourselves at La Nouvelle Maison, an upscale eating establishment in downtown Boca Raton. I'm not sure why she picked the place—perhaps because our previous upscale dining establishment, Root Italian Kitchen, is no longer open. Or perhaps it's “just because.”

In any case, we had a reservation for 8:00 pm. I made it yesterday, as Bunny was running about Chez Boca worrying about the dress code (turns out—“business casual” so I don't have to don my International Man of Mystery clothes). I received a text notification a few hours later, and a phone call today, asking to confirm our time.

We arrived at 8:00 pm after a slight detour through the valet parking lot of Trattoria Romana next door (the driving directions from Mr. Google were slightly unclear on that). We walked right in, stated our reservation name and time, and were immediately seated in the quite crowded restaurant. I found that quite surprising, given how busy it all looked.

One thing about high dining that I don't think I'll get used to—the level of attention to detail. I think our waiter only had three or four other tables to wait on, and when he wasn't busy, he was standing nearby, waiting at attention. And if he was busy, his assistant (Assistant!) was nearby waiting to serve us.

And boy, did we get served. A cheese platter of assertive goat, a more assertive bleu and a brie that gave butter a run for its money. We decided not to get any caviar, but we did get a Soupe à L’Oignon gratinee for me and Pâté de Foie Gras for Bunny. Those were followed by a Petite Laitue Roquefort for me, and a Salade de Betterave for Bunny.

And then the entrees. I ordered the Risotto de Saint-Jacques à la truffe et aux épinards while Bunny had the Côte de veau aux Champignons Forresterie. I think the most amazing thing about Bunny's dinner was the gratin potatoes. Our waiter explained how they were made—the potatoes were sliced paper thin (thin enough to see through) and then layered with gruyere cheese, pressed down with a brick for over 24 hours in the refrigerator, then baked and just before serving, deep fried for about a minute. It was a perfect cube of potato and cheese about 3″ (7.5cm) per side, made up of easily a hundred layers.

Dessert was a chocolate soufflé. That was my first soufflé, and also my last. I'm not saying it was bad—it wasn't by any stretch of the imagination—it was delicious! But my issue with the soufflé was one of texture. I'm sensitive to some … textures … that some foods have, and this one triggered that sensitivity. The only reason I ate more than one spoonful was just down to how good it was, and I was able to bull through my textural revulsions. In all my times eating at fine dining establishments, I've found that even if I don't like a particular item, it will be the best particular item I've ever had.

Was everything we had good? It's French! Of course it was good. All of it. The service was incredible, and our only real complaint of the night was the noise level when we first arrived—it was a bit too high for our liking. And the price was about what we expected—it's the “once a year” level of price.

Friday, January 26, 2024

“It is just a barebones framework written as a love letter to the language of C.”

I'm used to having some of my posts make it to the Orange Site, but I was not expecting to have one of my software projects (in this case, mod_blog) make it to the Orange Site. I'm finding it especially amusing to read this thread. I did not expect mod_blog to be a “love letter to C.”

Wednesday, January 31, 2024

The Repair Culture, Part II—Electric DeoxIToo

I've repaired my Logitech Trackman Marble a few times over the years, but this time the traditional fix didn't work. But I have seen enough Adrian Black computer repair videos to know that perhaps, a liberal application of DeoxIT Spray Contact Cleaner on the buttons might work. I ordered some last week, and when I went out to check the mail, I found it sitting on top of the mail box! I'm surprised it didn't fall off.

Anyway, a liberal amount of DeoxIT in the buttons and yup, that did the trick this time.

Sweet.


Dear LinkedIn, I don't think I'm the expert you want answering these questions

I cross post links to my blog on LinkedIn mainly to be contrarian, or at least think I'm providing something different than all the business related stuff posted there. But lately, I've seen notifications there like, “You're one of the few experts invited to add this collaborative article: …” and it's some article about AI, or Agile or the latest one, “Your team is falling behind on project deadlines. What web-based tools can help you catch up?”

Oh … nothing good can come from me answering these questions.


I know languages that have support for “read-only memory,” but what about “write-only memory?”

I'm still hacking away on my overengineered 6809 assembler and one feature I've been mulling over is a form of static access checking. I have byte-level access control when running tests but I'm thinking that adding some form of “assemble time checking” would also be good. I've been writing code hitting the hardware of the Color Computer, and there are semantics around hardware that I don't think many languages (or even assembler) support—write only memory! As the Amiga Hardware Reference Manual states (only referenced here because it's a good example of what I'm talking about):

Registers are either read-only or write-only. Reading a write-only register will trash the register. Writing a read-only register will cause unexpected results.

When strobing any register which responds to either a read or a write, (for example copjmp2) be sure to use a MOVE.W, not CLR.W. The CLR instruction causes a read and a clear (two access) on a 68000, but only a single access on 68020 processors. This will give different results on different processors.

The Color Computer isn't quite as finicky (although the 6809 inside it also does the “read, then write” thing with the CLR instruction), but there is still memory-mapped IO that is read-only, some that is write-only, and some that has different meanings when reading and writing. And while C has some semantic support with volatile (to ensure reads and writes happen when stated) and const (to ensure the read-only nature) it still lacks a “write-only” concept. And I've never used an assembler that had “write-only” semantics either.

I'm thinking something along these lines:

		org	$FF40
DSK.CTRL	rmb/w	1	; write only

		org	$FF48
DSK.CMD		rmb/w	1	; write only
		org	$FF48
DSK.STATUS	rmb/r	1	; read only

DSK.TRACK	rmb	1	; these can be read and written
DSK.SECTOR	rmb	1
DSK.DATA	rmb	1

Here, the RMB directive just reserves a number of bytes, with a default access of “read-write.” The /W or /R designates that label as being either “write-only” or “read-only.” And if you look closely, you'll see that both DSK.CMD and DSK.STATUS are defined as the same address. It's just that writing to that address will send a command to the drive controller, while reading from that address gives the current status. The only issue I have are hardware registers that can be programmed for input or output. The MC6821 used in the Color Computer has such registers, and the issue I have is how to signify this change in state in a program—that at this point in the program, such-n-such address is “read-write” but afterwards, it's “write-only.”
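As a rough illustration of what such an assemble-time check might look like, here is a sketch in Python. Everything in it is an assumption for the example: the label table layout, the (deliberately partial) instruction groupings, and the check() helper are hypothetical, and the idea is only modeled for direct label operands.

```python
# Sketch only: table layout, instruction groups and check() are hypothetical.
LABELS = {
    "DSK.CTRL":   (0xFF40, "w"),    # rmb/w
    "DSK.CMD":    (0xFF48, "w"),    # rmb/w
    "DSK.STATUS": (0xFF48, "r"),    # rmb/r, same address as DSK.CMD
    "DSK.TRACK":  (0xFF49, "rw"),   # plain rmb defaults to read-write
}

LOADS = {"lda", "ldb", "ldd", "cmpa", "cmpb", "tst"}
STORES = {"sta", "stb", "std"}
READ_MODIFY_WRITE = {"clr", "inc", "dec", "com", "neg"}  # CLR reads, then writes

def needed_access(mnemonic):
    """Which kinds of access an instruction performs on its memory operand."""
    if mnemonic in LOADS:
        return "r"
    if mnemonic in STORES:
        return "w"
    if mnemonic in READ_MODIFY_WRITE:
        return "rw"
    raise ValueError(f"unknown mnemonic: {mnemonic}")

def check(mnemonic, label):
    """Return a warning string, or None if the access is allowed."""
    _addr, allowed = LABELS[label]
    missing = [kind for kind in needed_access(mnemonic) if kind not in allowed]
    if missing:
        what = "read" if "r" in missing else "write"
        return f"{mnemonic} {label}: label does not allow {what} access"
    return None
```

Because the attribute hangs off the label and not the address, `sta DSK.CMD` and `lda DSK.STATUS` both pass even though they touch the same byte, while `lda DSK.CMD` and `clr DSK.CTRL` are flagged.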


Discussions about this entry


How much for the book? I don't think I paid that much for the computer

Now, about that Amiga Hardware Reference Manual link in my previous post: $577.99‽

Seriously‽

I know it's now a niche market, but are there really people buying that book for almost $600?

Good Lord!

Friday, February 02, 2024

Making it to the Orange Site

A previous post made it to Lobster only to later show up at the Orange Site. How about that?


Making it to the Orange Site

A previous post made it to Lobster only to later show up at the Orange Site. How about that?


Making it to the Orange Site

A previous post made it to Lobster only to later show up at the Orange Site. How about that?


It should be obvious by now what day it is, but just in case

I want to assure everybody that yes, I intentionally made the previous three posts. And while they are similar, they are not in fact, identical. You know, much like Groundhog Day.

Monday, February 05, 2024

The difficulties in supporting “write-only memory” in assembly

When I last wrote about this, I had one outstanding problem with static analysis of read-only/write-only memory, and that was with hardware that could be input or output only. It was only after I wrote that that I realized the solution—it's the same as a hardware register having different semantics on read vs. write—just define two labels with the semantics I want. So for the MC6821, I could have:

		org	$FF00
PIA0.A		rmb/r	1	; read only
		org	$FF00
PIA0.Adir	rmb/w	1	; write only, to set the direction of each IO pin
PIA0.Acontrol	rmb	1	; control for port A

So that was a non-issue. It was then I started looking over some existing code I had to see how it might look. I didn't want to just jump into an implementation without some forethought, and I quickly found some issues with the idea by looking at my maze generation program. The code in question initializes the required video mode (in this case 64×64 with four colors). Step one involves writing a particular value to the MC6821:

		lda	#G1C.PIA ; 64x64x4
		sta	PIA1.B

So far so good. I can mark PIA1.B as write-only (technically, it also has some input pins so I really can't, but in theory I could).

Now, the next bit requires some explaining. There's another 3-bit value that needs to be configured on the MC6883, but it's not as simple as writing the 3-bit value to a hardware register—each bit requires writing to a different address, and worse—it's a different address if the bit is 0 or 1. So that's six different addresses required. It's not horrible though—the addresses are sequential:

6883 VDG Addressing Mode
bit 0/1 address
V0 0 $FFC0
V0 1 $FFC1
V1 0 $FFC2
V1 1 $FFC3
V2 0 $FFC4
V2 1 $FFC5

Yeah, to a software programmer, hardware can be weird. To set bit 0 to 0, you do a write (and it does not matter what the value is) to address $FFC0. If bit 0 is 1, then it's a write to $FFC1. So with that in mind, I have:

		sta	SAM.V0 + (G1C.V & 1<<0 <> 0)
		sta	SAM.V1 + (G1C.V & 1<<1 <> 0)
		sta	SAM.V2 + (G1C.V & 1<<2 <> 0)

OOh. Yeah.

I wrote it this way so I wouldn't have to look up the appropriate value and write the more opaque (to me):

		sta	$FFC1
		sta	$FFC2
		sta	$FFC4

The expression (G1C.V & 1<<n <> 0) checks bit n to see if it's set or not, and returns 0 (for not set) or 1 (for set). This is then added to the base address for bit n, and it all works out fine. I can change the code for, say, the 128×192 four color mode by using a different constant:

		lda	#G6C.PIA
		sta	PIA1.B
		sta	SAM.V0 + (G6C.V & 1<<0 <> 0)
		sta	SAM.V1 + (G6C.V & 1<<1 <> 0)
		sta	SAM.V2 + (G6C.V & 1<<2 <> 0)
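The address arithmetic behind those expressions can be sketched in Python. The function name is made up for the example; the base address and the two-addresses-per-bit layout come from the table above.

```python
SAM_V0 = 0xFFC0   # from the table above: each V bit owns two consecutive addresses

def sam_mode_addresses(mode):
    """Return the address to write for each of the three V bits of mode.

    Bit n lives at SAM_V0 + 2*n; add 1 to that address if the bit is set.
    """
    return [SAM_V0 + 2 * n + ((mode >> n) & 1) for n in range(3)]

for address in sam_mode_addresses(1):
    print(f"sta ${address:04X}")
```

A mode value of 1 gives $FFC1, $FFC2 and $FFC4, the same three hard-coded addresses shown earlier.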

But I digress.

This is a bit harder to support. The address being written is part of an expression, and only the label (defining the address) would have the read/write attribute associated with it. At least, that was my intent. I suppose I could track the read/write attribute by address, which would solve this particular segment of code.

And the final bit of code to set the address of the video screen (or frame buffer):

		ldx	#SAM.F6		; point to frame buffer address bits
		lda	ECB.grpram	; get MSB of frame buffer
mapframebuf	clrb
		lsla
		rolb
		sta	b,x		; next bit of address
		leax	-2,x
		cmpx	#SAM.F0
		bhs	mapframebuf

Like the VDG Address Mode bits, the bits for the VDG Address Offset have unique addresses, and because the VDG Address Offset has seven bits, the address is aligned to a 512 byte boundary. Here, the code loads the X register with the address of the upper end of the VDG Address Offset, and the seven top most bits of the video address are sent, one at a time, to the B register, which is used as an offset to the X register to set the appropriate address for the appropriate bit. So now I would have to track the read/write attributes via the index registers as well.
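To make concrete which addresses that loop ends up touching, here is a small Python model of it. The value used for SAM.F0 ($FFC6, directly after the V bits) is an assumption for this sketch, as is the function name; the loop body follows the assembly instruction by instruction.

```python
SAM_F0 = 0xFFC6            # assumption: F0 follows V2 in the SAM register block
SAM_F6 = SAM_F0 + 2 * 6    # two addresses per bit, seven bits

def frame_buffer_writes(msb):
    """List the addresses the loop writes to for a given frame buffer MSB."""
    a = msb & 0xFF
    x = SAM_F6                  # ldx #SAM.F6
    writes = []
    while x >= SAM_F0:          # cmpx #SAM.F0 / bhs mapframebuf
        a <<= 1                 # lsla: top bit of A falls into carry
        b = (a >> 8) & 1        # clrb / rolb: B becomes 0 or 1
        a &= 0xFF
        writes.append(x + b)    # sta b,x
        x -= 2                  # leax -2,x
    return writes

for address in frame_buffer_writes(0x0E):   # frame buffer at $0E00
    print(f"${address:04X}")
```

Every one of those addresses is reached through X and B, never through a label, which is exactly what a label-based attribute check cannot see.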

That is not so easy.

I mean, here, it could work, as the code is all in one place, but what if instead it was:

		ldx	#SAM.F6
		lda	ECB.grpram
		jsr	mapframebuf

Or an even worse example:

constmessage	fcc/r	"A constant message" ; read only text
buffer		rmb	18

		ldx	#constmessage
		ldy	#buffer
		lda	#18
		jsr	memcpy

The subroutine memcpy might not even be in the same source unit, so how would the read/write attribute even be checked? This is for static analysis, not runtime.

I have one variation on the maze generation program that generates multiple mazes at the same time, on the same screen (it's fun to watch) and as such, I have the data required for each “maze generator” stored in a structure:

explorec	equ	0	; read-only
backtrackc	equ	1	; read-only
xmin		equ	2	; read-only
ymin		equ	3	; read-only
xstart		equ	4	; read-only
ystart		equ	5	; read-only
xmax		equ	6	; read-only
ymax		equ	7	; read-only
xpos		equ	8	; read-write
ypos		equ	9	; read-write
color		equ	10	; read-write
func		equ	11	; read-write

This is from the source code, but I've commented each “field” as being “read-only” or “read-write.” That's another aspect of this that I didn't consider:

		lda	explorec,x	; this is okay
		sta	explorec,x	; this is NOT okay

Not only would I have to track read/write attributes for addresses, but for field accesses to a structure as well. I'm not saying this is impossible, it's just going to take way more thought than I thought. I don't think I'll have this feature done any time soon …

Tuesday, February 06, 2024

So you want to amplify my SEO

From
Krystal XXXXX­XX <XXXXX­XXXXX­XXXXX­XXXXX@gmail.com>
To
sean@conman.org
Subject
Amplify Your SEO with Strategic Link Inserts
Date
Wed, 7 Feb 2024 01:16:35 +0300

Hi Content Team,

It’s Krystal XXXXX­XX here from Next Publisher, your next potential partner in digital storytelling. We're thrilled about the idea of featuring both guest posts and link insertions on your dynamic website.

We would like to know your fee structure for hosting guest posts and link insertions. Our team aims to create compelling content that is tailored to your site’s audience and enhances your overall content strategy.

A quick note: this initial email is only for starting our dialogue. All more detailed communications, including agreements and transactions, will be carried out through our official Next Publisher email.

We admire the quality of your platform and are excited to explore how we can work together for mutual benefit.

Looking forward to your prompt reply.

Warm wishes,

Krystal XXXXX­XX

Hello Krystal.

Since you neglected to include a link (in an email sent as HTML no less!) it's hard for me to judge the value Next Publisher will provide for my site, so I'm going to have to adjust my prices accordingly. My fee for both guest posts and link insertions is $10,000 (US) per. So if a guest post also includes a link insertion, that would be a total of $20,000 (US). If I'm going to be whoring selling out what Google Page Rank I have, it's going to cost.

I look forward to hearing back from you.

Sean.


Okay! I'll answer your question, LinkedIn. Also, Orange Site! Orange Site! Orange Site!

Today's “you're one of the few experts invited to add this collaborative article” from LinkedIn is “How do you become a senior application developer?” My answer? Stay in a programming job for three years. Boom! You're a senior application developer. I know, I know, that's a big ask these days when everybody is jumping ship every two years. But hey, if you want to be a “senior application developer,” you got to make sacrifices.

Oh, and the title to today's post? I found out that LinkedIn really liked when I mentioned the Orange Site in the title of my post. Almost two orders of magnitude more. So I'm doing a test to see if I can game the system there.

Wednesday, February 07, 2024

Instead of “write-only memory” assembly support, how about floating point support?

You might think it odd to add support for floating point constants for an 8-bit CPU, but Motorola did development on the MC6839 floating point firmware for the MC6809, an 8K ROM of thread-safe, position-independent 6809 code that implements the IEEE Standard for Floating-Point Arithmetic. It was never formally released by Motorola as a product, but from what I understand, it was released later under a public domain license. At the very least, it's quite easy to find both the MC6839 ROM image and the source code on the Intarwebs. So that's one reason.

Another reason is that the Color Computer BASIC supports floating point operations, and while not IEEE-754, as it was written before the IEEE-754 standard became a standard, it's still floating point, and there are only minor differences between it and the current standard, namely the exponent bias, number of fractional bits supported, and where the sign bit is stored. It really comes down to some bit manipulations to massage a standard float into the Color Computer BASIC float format. There are some differences, but the differences are small (literally, on the scale of 0.0000003) probably due to parsing differences, and small enough that it should be “good enough.” Especially since the Color Computer BASIC float format doesn't support infinity or NaN.
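A sketch of that kind of bit massaging, in Python rather than the assembler's own code. It assumes the usual five-byte packed layout of Microsoft BASICs of that era (one exponent byte biased by 128, then a sign bit sitting where the implied leading mantissa bit would be, then 31 more mantissa bits). That layout is an assumption here and is not taken from this journal; the function name is made up, and the mantissa is truncated instead of rounded to keep the example short.

```python
import math

def to_packed_basic_float(value):
    """Pack a Python float into 5 bytes: exponent (bias 128), sign + 31-bit mantissa.

    Assumed layout, see the note above. Zero is encoded as an exponent byte of 0.
    """
    if math.isnan(value) or math.isinf(value):
        raise ValueError("this format has no infinity or NaN")
    if value == 0.0:
        return bytes(5)
    mantissa, exponent = math.frexp(abs(value))   # abs(value) = mantissa * 2**exponent
    biased = exponent + 128                       # 0.5 <= mantissa < 1
    if biased > 255:
        raise OverflowError("exponent too large for this format")
    if biased < 1:
        return bytes(5)                           # too small: flush to zero
    bits = int(mantissa * (1 << 32))              # 32 bits, top bit always set (truncated)
    bits &= 0x7FFFFFFF                            # drop the implied leading 1
    if value < 0:
        bits |= 0x80000000                        # sign bit takes its place
    return bytes([biased]) + bits.to_bytes(4, "big")

print(to_packed_basic_float(1.0).hex())
print(to_packed_basic_float(-0.5).hex())
```

Under that assumed layout, 1.0 packs to 81 00 00 00 00 and 10.0 to 84 20 00 00 00.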

So if you specify a backend other than the rsdos backend, you get IEEE-754, and if you do specify rsdos as a backend, you get the Color Computer BASIC float format.

And yes, I added support for floating point expressions (but not for the test backend—I'm still thinking on how to support it), and one interesting feature I added is the factorial operator “!”. Factorials are used in Taylor series, which the Color Computer BASIC uses for the sin() function, so I can literally write:

	; Oh!  '**' is exponentiation by the way!
taylor_series	.float	-((2 * 3.14159265358979323846) ** 11) / 11!
		.float	 ((2 * 3.14159265358979323846) **  9) /  9!
		.float	-((2 * 3.14159265358979323846) **  7) /  7!
		.float	 ((2 * 3.14159265358979323846) **  5) /  5!
		.float	-((2 * 3.14159265358979323846) **  3) /  3!
		.float	   2 * 3.14159265358979323846

and have it generate the correct values. I personally don't know of any language that has a factorial operator (maybe APL? I don't know).
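For comparison, the same six constants can be computed in Python, which has no factorial operator but does have math.factorial. The function name is made up for the example, and the results are IEEE-754 doubles, not the assembler's output.

```python
import math

TWO_PI = 2 * 3.14159265358979323846

def taylor_coefficients():
    """The six constants from the listing above, highest power first."""
    coefficients = []
    sign = -1.0
    for n in (11, 9, 7, 5, 3):
        coefficients.append(sign * TWO_PI ** n / math.factorial(n))
        sign = -sign          # signs alternate: -, +, -, +, -
    coefficients.append(TWO_PI)
    return coefficients

for value in taylor_coefficients():
    print(value)
```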

I think I'm having more fun writing the assembler than I am writing assembly code.

Sunday, February 11, 2024

An extensible programming language

A few days ago I wrote about adding a factorial operator to my assembler, and I noted that I knew not of any other languages that had such a feature. So imagine my surprise as I'm reading about XL (via Lobsters) and the second example is factorial! Not only that, but that was an example of extending the language itself! The last time I was this excited about software was reading about Synthesis OS, a JIT-based operating system where you could create your own system calls.

How it handles precedence is interesting. In my assembler, I have left or right associativity as an explicit field, whereas in XL, it's encoded in the precedence level itself—even if it's left, odd if it's right. I'm not sure how I feel about that. On the one hand it feels nice and it's one less field to carry around; on the other, being explicit as I did makes it clear if something is left or right. But on the gripping hand, it sounds like matching precedence on a left and right operator could lead to problems, so I still may have an explicitness problem.
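To convince myself the parity trick works, here is a minimal precedence-climbing sketch in C. It is neither XL's parser nor the one in my assembler; the operator table, the precedence numbers and the names (op_prec, eval) are all invented for this example. The only idea taken from XL is that an odd level means right associative and an even level means left.

```c
#include <ctype.h>

static const char *cursor;          /* current position in the expression */

/* Hypothetical precedence table: even = left, odd = right associative. */
static int op_prec(char op)
{
  switch (op)
  {
    case '+': case '-': return 10;
    case '*': case '/': return 20;
    case '^':           return 31;  /* exponentiation, right associative */
    default:            return -1;  /* not an operator (or end of text) */
  }
}

static long apply(char op, long lhs, long rhs)
{
  long result = 1;
  switch (op)
  {
    case '+': return lhs + rhs;
    case '-': return lhs - rhs;
    case '*': return lhs * rhs;
    case '/': return (rhs != 0) ? lhs / rhs : 0;  /* avoid dividing by zero */
    default:                                      /* '^' */
      while (rhs-- > 0)
        result *= lhs;
      return result;
  }
}

/* Parse unsigned integers joined by binary operators, no spaces. */
static long parse_expr(int min_prec)
{
  long lhs = 0;

  while (isdigit((unsigned char)*cursor))
    lhs = lhs * 10 + (*cursor++ - '0');

  for (;;)
  {
    char op   = *cursor;
    int  prec = op_prec(op);

    if (prec < min_prec)
      break;
    cursor++;
    /* Odd level: the right operand may contain the same level again.
       Even level: it may only contain strictly higher levels. */
    long rhs = parse_expr((prec & 1) ? prec : prec + 1);
    lhs = apply(op, lhs, rhs);
  }
  return lhs;
}

long eval(const char *text)
{
  cursor = text;
  return parse_expr(0);
}
```

With this table, eval("8-3-2") groups as (8-3)-2 and eval("2^3^2") groups as 2^(3^2). Since a level is either odd or even, a left and a right operator can never share one, which at least sidesteps the matching-precedence worry.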

But I digress.

It's a very simple language with only one keyword “is” and a user-definable precedence table. The parser generates a parse tree of only eight types, four leaf nodes (integer, real, text, name (or symbol)) and four non-leaf nodes (prefix, infix, postfix and block). And from there, you get XL.

This is something I definitely want to look into.

Wednesday, February 14, 2024

Notes from an overheard conversation from a car attempting a right turn

“Oh! Now what?”

“Sir, see the lit sign up there? You cannot turn right.”

“But did you not see the car right in front of me turning right?”

“Sir, if a person jumped off a bridge, would you follow?”

“Yes.”

“You must be very smart then.”

“And selective enforcement of the laws leads to distrust of the police.”

“Sir—”

“Oh look! The ‘No Right Turn’ sign is off now! Gotta go! Bye!”

“I don't think it's wise to taunt the police like that.”

“Down with the Man! Power to the people! Yo!”

Wednesday, February 28, 2024

Converting IEEE-754 floating point to Color BASIC floating point

I'm still playing around with floating point on the 6809—specifically, support for floating point for the Color Computer. The format for floating point for Color BASIC (written by Microsoft) predates the IEEE-754 Floating Point Standard by a few years and thus, isn't quite compatible. It's close, though. It's defined as an 8-bit exponent, biased by 129, a single sign bit (after the exponent) and 31 bits for the mantissa (the leading one assumed). It also does not support ±∞ nor NaN. This differs from the IEEE-754 single precision that uses a single sign bit, an 8-bit exponent biased by 127 and 23 bits for the mantissa (which also assumes a leading one) and support for infinities and NaN. The IEEE-754 double precision uses a single sign bit, an 11-bit exponent biased by 1023 and 52 bits for the mantissa (leading one assumed) plus support for infinities and NaN.

So the Color BASIC format is about halfway between single precision and double precision. This led me to use IEEE-754 double precision for the Color Computer backend (generating an error for infinities and NaN) then massaging the resulting double into the proper format. I double checked this by finding some floating point constants in the Color BASIC ROM as shown in the book Color BASIC Unravelled II (available on the Color Computer Archives), like this table:

4634				* MODIFIED TAYLOR SERIES SIN COEFFICIENTS
4635	BFC7 05			LBFC7	FCB	6-1			SIX COEFFICIENTS
4636	BFC8 84 E6 1A 2D 1B	LBFC8	FCB	$84,$E6,$1A,$2D,$1B	* -((2*PI)**11)/11!
4637	BFCD 86 28 07 FB F8	LBFCD	FCB	$86,$28,$07,$FB,$F8	*  ((2*PI)**9)/9!
4638	BFD2 87 99 68 89 01	LBFD2	FCB	$87,$99,$68,$89,$01	* -((2*PI)**7)/7!
4639	BFD7 87 23 35 DF E1	LBFD7	FCB	$87,$23,$35,$DF,$E1	*  ((2*PI)**5)/5!
4640	BFDC 86 A5 5D E7 28	LBFDC	FCB	$86,$A5,$5D,$E7,$28	* -((2*PI)**3)/3!
4641	BFE1 83 49 0F DA A2	LBFE1	FCB	$83,$49,$0F,$DA,$A2	*    2*PI

Then using the byte values to populate a variable and printing it inside BASIC (this is the expression -(2π)³/3!):

X=0         ' CREATE A VARIABLE
Y=VARPTR(X) ' GET ITS ADDRESS
POKE Y,&H86 ' AND SET ITS VALUE
POKE Y+1,&HA5 ' THE HARD WAY
POKE Y+2,&H5D
POKE Y+3,&HE7
POKE Y+4,&H28
PRINT X ' LET'S SEE WHAT IT IS
-41.3417023

Then using that to create a floating point value:

	org	$1000
	.float	-41.3417023
	end

Checking the resulting bytes that were generated:

                         | FILE ff.a
                       1 |         org     $1000
1000: 86A55DE735       2 |         .float  -41.3417023
                       3 |         end

And adjusting the floating point constant until I got bytes that matched:

                         | FILE ff.a
                       1 |         org     $1000
1000: 86A55DE728       2 |         .float  -41.341702110
                       3 |         end

I figure it's “close enough.” The parsing code in the Color BASIC ROM is old and predates the IEEE-754 floating point standard, so a few different digits at the end I think is okay.

As a final check, I wrote the following bit of code to calculate and display -(2π)³/3!, display the pre-calculated result, as well as display the pre-calculated value of 2π:

		include	"Coco/basic.i"
		include	"Coco/dp.i"

CB.FSUBx	equ	$B9B9	; FP0 = X   - FP0	; addresses for
CB.FSUB		equ	$B9BC	; FP0 = FP1 - FP0	; these routines 
CB.FADDx	equ	$B9C2	; FP0 = X   + FP0	; from
CB.FADD		equ	$B9C5	; FP0 = FP1 + FP0	; Color BASIC Unravelled II
CB.FMULx	equ	$BACA	; FP0 = X   * FP0
CB.FMUL		equ	$BAD0	; FP0 = FP0 * FP1
CB.FDIVx	equ	$BB8F	; FP0 = X   / FP0
CB.FDIV		equ	$BB91	; FP0 = FP1 / FP0

CB.FP0fx	equ	$BC14	; FP0 = X
CB.xfFP0	equ	$BC35	; X   = FP0
CB.FP1f0	equ	$BC5F	; FP1 = FP0
CB.FP0txt	equ	$BDD9	; result in X, NUL terminated

		org	$4000
start		ldx	#tau		; point to 2*pi
		jsr	CB.FP0fx	; copy to FP0
		ldx	#tau		; 2PI * 2PI
		jsr	CB.FMULx
		ldx	#tau		; 2PI * 2PI * 2PI
		jsr	CB.FMULx
		jsr	CB.FP1f0	; copy fp acc to FP1
		ldx	#fact3		; point to 3!
		jsr	CB.FP0fx	; copy to FP0
		jsr	CB.FDIV		; FP0 = FP1 / FP0
		neg	CB.fp0sgn	; negate result by flipping FP0 sign
		jsr	CB.FP0txt	; generate string
		bsr	display		; display on screen

		ldx	#answer		; point to precalculated result
		jsr	CB.FP0fx	; copy to FP0
		jsr	CB.FP0txt	; generate string
		bsr	display		; display

		ldx	#tau		; now display 2*pi
		jsr	CB.FP0fx	; just to see how close
		jsr	CB.FP0txt	; it is.
		bsr	display
		rts

display.char	jsr	[CHROUT]	; display character
display		lda	,x+		; get character
		bne	.char		; if not NUL byte, display
		lda	#13		; go to next line
		jsr	[CHROUT]
		rts

tau		.float	6.283185307
fact3		.float	3!
answer		.float	-(6.283185307 ** 3 / 3!)

		end	start

The results were:

-41.3417023
-41.3417023
 6.28318531

The calculation results in -41.3417023 and the direct result stored in answer also prints out -41.3417023, so that matches, and it reinforces that my approach to this is nominally right.

But I think Microsoft had issues with either generating some of the floating point constants for the larger terms, or transcribing the byte values of the larger terms. Take for instance -(2π)¹¹/11!. The correct answer is -15.0946426, but the bytes in the ROM define the constant -14.3813907, a difference of .7. And it's not like Color BASIC can't calculate that correctly—when I typed in the expression by hand, it was able to come up with -15.0946426.

Or it could be that Walter K. Zydhek, the author of Color BASIC Unravelled II, is wrong in his interpretation of the expressions used to generate the values, or his interpretation of what the values are used for. I'm not sure who is at fault here.

Update on Friday, March 1st, 2024

I was wrong about the authorship of Color BASIC Unravelled II. It was not Walter K. Zydhek, but some unknown author of Spectral Associates, a company that is no longer in business. All Zydhek did was to transcribe a physical copy of the book (which is no longer available for purchase anywhere) into a PDF and make it available.


Friday, March 01, 2024

The speed of Microsoft's BASIC floating point routines

I was curious about how fast Microsoft's BASIC floating point routines were. This is easy enough to test, now that I can time assembly code inside the assembler. The code calculates -(2π)³/3! using Color BASIC routines, IEEE-754 single precision and double precision.

First, Color BASIC:

	.tron	timing
ms_fp		ldx	#.tau
		jsr	CB.FP0fx	; FP0 = .tau
		ldx	#.tau
		jsr	CB.FMULx	; FP0 = FP0 * .tau
		ldx	#.tau
		jsr	CB.FMULx	; FP0 = FP0 * .tau
		jsr	CB.FP1f0	; FP1 = FP0
		ldx	#.fact3
		jsr	CB.FP0fx	; FP0 = 3!
		jsr	CB.FDIV		; FP0 = FP1 / FP0
		neg	CB.fp0sgn	; FP0 = -FP0
		ldx	#.answer
		jsr	CB.xfFP0	; .answer = FP0
	.troff
		rts

.tau		fcb	$83,$49,$0F,$DA,$A2 
.fact3		fcb	$83,$40,$00,$00,$00  
.answer		rmb	5
		fcb	$86,$A5,$5D,$E7,$30	; precalculated result

I can't use the .FLOAT directive here since that only supports either the Microsoft format or IEEE-754 but not both. So for this test, I have to define the individual bytes per float. The last line is what the result should be (by checking a memory dump of the VM after running). Also, .tau is 2π, just in case that wasn't clear. This ran in 8,742 cycles, taking 2,124 instructions and 4.12 cycles per instruction (I modified the assembler to record this additional information).

Next up, IEEE-754 single precision:

	.tron	timing
ieee_single	ldu	#.tau
		ldy	#.tau
		ldx	#.answer
		ldd	#.fpcb
		jsr	REG
		fcb	FMUL	; .answer = .tau * .tau

		ldu	#.tau
		ldy	#.answer
		ldx	#.answer
		ldd	#.fpcb
		jsr	REG
		fcb	FMUL	; .answer = .answer * .tau

		ldu	#.answer
		ldy	#.fact3
		ldx	#.answer
		ldd	#.fpcb
		jsr	REG
		fcb	FDIV	; .answer = .answer / 3!

		ldy	#.answer
		ldx	#.answer
		ldd	#.fpcb
		jsr	REG
		fcb	FNEG	; .answer = -.answer
	.troff
		rts

.fpcb		fcb	FPCTL.single | FPCTL.rn | FPCTL.proj
		fcb	0
		fcb	0
		fcb	0
		fdb	0

.tau		.float	6.283185307
.fact3		.float	3!
.answer		.float	0
		.float	-(6.283185307 ** 3 / 3!)

The floating point control block (.fpcb) configures the MC6839 to use single precision, normal rounding and projective closure (not sure what that is, but it's the default value). And it does calculate the correct result. It's amazing that code written 42 years ago for an 8-bit CPU works flawlessly. What it isn't is fast. This code took 14,204 cycles over 2,932 instructions (average 4.84 cycles per instruction).

The higher average cycle count per instruction could be due to position independent addressing modes, but I'm not entirely sure what it's doing to take nearly twice the time. The ROM does use the IEEE-754 extended format (10 bytes) internally, with more bit shifts to extract the exponent and mantissa, but twice the time?

Perhaps it's code to deal with ±∞ and NaNs.

The IEEE-754 double precision is the same, except for the floating point control block configuring double precision and the use of .FLOATD instead of .FLOAT; otherwise the code is identical. The result, however, isn't. It took 31,613 cycles over 6,865 instructions (average 4.60 cycles per instruction). And being twice the size, it took a bit more than twice the time of single precision, which is expected.

The final bit of code just loads the ROMs into memory, and calls each function to get the timing:

		org	$2000
		incbin	"mc6839.rom"
REG		equ	$203D	; register-based entry point

		org	$A000
		incbin	"bas12.rom"

	.opt	test	prot	rw,$00,$FF	; Direct Page for BASIC
	.opt	test	prot	rx,$2000,$2000+8192 ; MC6839 ROM
	.opt	test	prot	rx,$A000,$A000+8192 ; BASIC ROM

	.test	"BASIC"
		lbsr	ms_fp
		rts
	.endtst

	.test	"IEEE-SINGLE"
		lbsr	ieee_single
		rts
	.endtst

	.test	"IEEE-DOUBLE"
		lbsr	ieee_double
		rts
	.endtst

Really, the only surprising thing here was just how fast Microsoft BASIC was at floating point.

Monday, April 01, 2024

Notes on an overheard conversation while eating dinner at The Cracker Barrel

“Oh no. This is bad.”

“Wow! Are you sure there are no more moves you can make?”

“Nope. See?”

“How many pegs is that?”

“10.”

“Wow! You are really bad at that!”

[If you add up all the possible ways leaving just one peg in the Peg Game, including rotations, reflections and reflected rotations, you have 438,984 ways of solving the Peg Game. If you add up all the possible ways of leaving 10 pegs, including rotations, reflections and reflected rotations, you have just six solutions. It is a much harder problem leaving 10 pegs than leaving one. I'm just saying … —Editor]

Tuesday, April 02, 2024

It only took 25 years for my idea to catch on

I was exchanging emails with Christian about online document structure when I mentioned The Electric King James Bible and its rather unique addressing scheme. I came up with that 25 years ago [Good Lord! Has it been that long? —Sean] [Yes. —Editor] [Yikes! —Sean] to precisely pull up Bible verses—anywhere from one verse to an entire book. None of the Bible sites I've come across on the Intarwebs since have used such an elegant, and to me, obvious, way of referencing the Bible online. Usually they use a URL format like <https://bible.example.org/?bible=kj&book=Genesis&chapter=1&start_verse=1&end_verse=1>.

But Christian mentioned Sefaria as using my method, and true enough, it does! <https://www.sefaria.org/Genesis.6:9-9:17> does indeed go to the Noah's Ark story. I think that's neat! I don't know if they were inspired by my site (unlikely, but not completely out of the realms of possibility) or just came up with it on their own, but it's nice to see someone else is using an easy to hack URL form for Bible references.
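Part of the appeal of a hackable reference like that is how little code it takes to pick apart. A minimal sketch in C, handling only the one Book.chapter:verse-chapter:verse shape shown in the Sefaria URL above; the struct and function names are made up for this example, and neither site necessarily parses references this way.

```c
#include <stdio.h>

/* One parsed reference such as "Genesis.6:9-9:17" (hypothetical type). */
struct verse_range
{
  char book[32];
  int  from_chapter;
  int  from_verse;
  int  to_chapter;
  int  to_verse;
};

/* Returns 1 if text has exactly the form Book.C:V-C:V, 0 otherwise. */
int parse_reference(const char *text, struct verse_range *out)
{
  char trailing;                    /* catches junk after the last number */
  int  fields = sscanf(
          text,
          "%31[^.].%d:%d-%d:%d%c",
          out->book,
          &out->from_chapter,
          &out->from_verse,
          &out->to_chapter,
          &out->to_verse,
          &trailing
        );

  return fields == 5;               /* all five parts, and nothing left over */
}
```

Single verses, whole chapters and whole books would each need their own pattern, which this sketch leaves out.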

There are differences though—my site only brings up the requested material, whereas Sefaria implements a bidirectional “Scroll Of Doom” where additional material appears when you go up or down. I can't say I'm a fan of that, but it apparently works for them.


Dear LinkedIn, why are you still asking me these questions?

LinkedIn is still asking me to participate as an expert answering questions—this time, “You're a system architect. How do you decide which programming languages to learn?” And just below that is “Powered by AI and the LinkedIn community.”

Sigh. Et tu, LinkedIn?

I'm still tempted to answer, but no. I just can't bear to answer this how I would want to answer it. Besides, if you know where to look, you might find my answers anyway.

Wednesday, April 03, 2024

An excessive number of packaging layers

I ordered an item from Amazon the other day. The expected arrival time was Friday, but instead, it arrived today. On the front porch was an Amazon box, measuring 6″ × 9″ × 5″ (16cm × 23cm × 13cm for the more civilized amongst you). Inside was another box, 3″ × 4½″ × ⅜″ (7cm × 11cm × 1cm). Inside that was a slightly smaller anti-static bag. Inside that was a smaller plastic bag, and finally, inside that was the item I had purchased—a replacement battery for my old-school flip phone.

Seriously? Four layers of packaging? Sigh.


“Because this kind of battery is encrypted …”

So I'm reading the “Battery Replacement Installation Manual” for the battery I just bought and as translated instructions go, it's not that bad. But there are some choice bits though …

Why does the phone echo?

The echo of the phone may be due to the installation problem. Can you see if there are any loose parts, because the battery will not affect the quality of the phone's call unless there is no power and cause the phone shut down.

“The echo of the phone?”

Feedback? Hearing my own voice echoed back to me? Maybe?

Anyway, carrying on …

Why did I receive a swollen battery?

Because this kind of battery is encrypted …

I have no clue here. It states that swelling may occur if the temperature exceeds 158°F (70°C), and that the battery may enter sleep mode if the temperature is too low, although it doesn't state what “too low” means. Fortunately, the battery I received isn't swollen, so I guess it's not encrypted?

4. Please carefully check whether there is any debris or screws falling into the battery area. If there is, please clean it up before proceeding to the next step, otherwise the sundries may pierce the battery and cause a short circuit and cause spontaneous combustion.

“Sundries.” Love it!

Thursday, April 04, 2024

Tracking down a bug

I've spent the past two days tracking down a bug, and I think it's a library issue.

So I have this program I wrote some time ago that uses Xlib and for reasons, I needed to store a 64-bit value that's related to a window. This is easy enough with setting a window property. The code for that is easy enough:

void svalue(Display *display,Window window,unsigned long long int value)
{
  assert(display != NULL);
  assert(window  != None);

  XChangeProperty(
    display,
    window,
    CALC_VALUE,
    XA_INTEGER,
    32,	/* format */
    PropModeReplace,
    (unsigned char *)&value,
    sizeof(value) / 4 /* number of 'format' units */
  );
}

CALC_VALUE is the “variable” (for lack of a better term) and XA_INTEGER is (again, for lack of a better term) the base type. Yes, this is just wrapping a single function call in a function, but it's an abstraction to make things simpler as it's called from multiple locations in the codebase.

To query the value:

unsigned long long int qvalue(Display *display,Window window)
{
  assert(display != NULL);
  assert(window  != None);
  
  unsigned long long int  value;
  Atom                    atom_got;
  unsigned char          *plong;
  int                     rc = XGetWindowProperty(
                                  display,
                                  window,
                                  CALC_VALUE,
                                  0,
                                  sizeof(unsigned long long int) / 4,
                                  False,
                                  XA_INTEGER,
                                  &atom_got,
                                  &(int){0}, /* this is don't care */
                                  &(unsigned long int){0}, /* another don't care */
                                  &(unsigned long int){0}, /* another don't care */
                                  &plong
                                );
                     
  if ((rc == Success) && (atom_got == XA_INTEGER))
  {
    memcpy(&value,plong,sizeof(unsigned long long int));
    XFree(plong);
  }
  else
    value = 0;
    
  return value;
}

Again, nothing too horrible or tricky.

The code was originally written on a 32-bit system (just after I left The Enterprise), and it worked. I then wanted to get the program working on a 64-bit system (because I want to both release it and talk about it). It worked, but only for values of 31-bits or less. As soon as the value hit 32-bits, the upper 32-bits were all 1s.

I added code to dump the value just before the call to XChangeProperty() and code to dump the value just after the call to XGetWindowProperty() and somewhere, once the value was 0x00000000FFFFFFFF going into XChangeProperty(), it was 0xFFFFFFFFFFFFFFFF coming out of XGetWindowProperty().

32-bit version? No issues. 64-bit version? Issues.

I tried a different compiler, on the off chance that I might be hitting some weird compiler bug, and no go, GCC or Clang, both on the 64-bit system had the same issue. I tried using a different X server and the same results—32 bit client, fine; 64-bit client, not fine. So I think it's due to the client side on the 64-bit system where the issue lies. Also, if I change the call to XChangeProperty() to:

void svalue(Display *display,Window window,unsigned long long int value)
{
  assert(display != NULL);
  assert(window  != None);

  XChangeProperty(
    display,
    window,
    CALC_VALUE,
    XA_INTEGER,
    8, /* format, this time 8! */
    PropModeReplace,
    (unsigned char *)&value,
    sizeof(value) /* still number of 'format' units */
  );
}

That is, a format of 8 fixed the issue. Even a format of 16 worked. It's only when I use a format of 32 on the 64-bit system that it fails.

And using a format of 8 on the 32-bit system works as well, so at least I have a workaround for it. Still, it's annoying.
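For what it's worth, the XChangeProperty and XGetWindowProperty man pages say that with a format of 32 the data is handled as an array of long, which in a 64-bit client means each 32-bit item sits in its own 64-bit long. If that is what is happening here (an assumption on my part, not something I have confirmed against this program), then a sign-extended 0xFFFFFFFF in the first long would look exactly like what I saw. A minimal sketch of packing a 64-bit value that way, with no Xlib calls in it; split64 and join64 are names I made up:

```c
#include <stdint.h>

/* Spread a 64-bit value over two longs, 32 significant bits in each, the
   way format-32 property data is described as being laid out.
   Hypothetical helpers, not part of Xlib. */
void split64(uint64_t value, long items[2])
{
  items[0] = (long)(value & 0xFFFFFFFFu);   /* low  32 bits */
  items[1] = (long)(value >> 32);           /* high 32 bits */
}

/* Rebuild the value, masking each long so that any sign extension in the
   upper half of a 64-bit long is thrown away. */
uint64_t join64(const long items[2])
{
  uint64_t lo = (uint64_t)items[0] & 0xFFFFFFFFu;
  uint64_t hi = (uint64_t)items[1] & 0xFFFFFFFFu;

  return (hi << 32) | lo;
}
```

Under that assumption, the array from split64() would be handed to XChangeProperty() with a format of 32 and two items, and join64() would be applied to the two longs that come back, instead of memcpy()ing eight raw bytes.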


I love it when abstractions are too abstract to be useful

I recently found an annoying aspect of Xlib—it's hard to find documentation about what keys affect the state field of the keyboard event. It's obvious that the shift keys on the keyboard will set ShiftMask, the control key will set ControlMask, and the CapsLock key will set LockMask (when I would expect it to set ShiftMask since it's just locking the shift keys to “on”), but there's little to say what keys set the Mod1Mask, Mod2Mask, Mod3Mask, Mod4Mask and Mod5Mask.

This is problematic, because I do need to check for keyboard events and this threw me for a loop—why are none of the keys working? Well, that's because my virtual Linux server on the Mac sets the NumLock key, which causes the X server to then set the Mod2Mask for all keyboard events and I wasn't expecting that.
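The usual way around this is to mask off the lock-style modifiers before comparing. A minimal sketch in C, assuming (as on my setup) that NumLock lands on Mod2Mask, which other X servers may map differently. The KEY_* names are local stand-ins whose bit values mirror ShiftMask, LockMask, ControlMask, Mod1Mask and Mod2Mask from <X11/X.h>, so the example builds without the X headers; the function names are made up.

```c
/* Local copies of the X modifier bits (same values as in <X11/X.h>). */
enum
{
  KEY_SHIFT   = 1 << 0,             /* ShiftMask   */
  KEY_LOCK    = 1 << 1,             /* LockMask    (CapsLock) */
  KEY_CONTROL = 1 << 2,             /* ControlMask */
  KEY_MOD1    = 1 << 3,             /* Mod1Mask    */
  KEY_MOD2    = 1 << 4              /* Mod2Mask    (NumLock here, by assumption) */
};

/* Strip the modifiers that merely reflect a lock key being on. */
unsigned int significant_modifiers(unsigned int state)
{
  return state & ~(unsigned int)(KEY_LOCK | KEY_MOD2);
}

/* True if Control, and only Control, is held (ignoring the lock keys). */
int is_plain_control(unsigned int state)
{
  return significant_modifiers(state) == KEY_CONTROL;
}
```

In real code the state field of the key event would be passed in, with the X header constants used in place of the KEY_* stand-ins.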

Sigh.

Friday, April 05, 2024

Matchbox cars seem to have gotten bigger in recent years

Bunny and I went to a local Toyota dealership to fix an issue with her car (it turns out it was a very unusual, but very minor, issue) and while there, we saw this on the display floor:

[A very small electric car for one] That's not a car!  That's an oversized roller skate!

Turns out, this is not a large Matchbox car, but a small electric car straight from a factory in Japan (the informational flyer under the windshield is all in Japanese). A five year old would barely fit in this thing, much less an adult. There doesn't appear to be any storage space of any significant size, and sans doors, I'm not sure this is even road legal. And the staff there don't even know if it's for sale. Weird.

Saturday, May 11, 2024

How to measure ⅚ cup of oil, part III

I just received a nice email from Muffintree14 thanking me for helping them make a recipe where they needed to measure out ⅚ of a cup! They were trying to measure out 200ml of something (they didn't specify what) and it turns out that 200ml is about ⅚ of a cup. I suspect they could have just used a regular cup, as that's 237ml. As long as you aren't baking bread (or other pastry-like food item) then it probably doesn't matter that much. Roughly speaking, 200ml is close enough to 1 cup that you might as well use 1 cup.

But then I found an image (via Bob Anstett on Tik­Linked­My­Face­Pin­Insta­Me­Gram­Space­We­In­Tok­Trest­Book) describing the various relationships among Imperial units, and from there, I found a much better way to measure ⅚ cups—measure out 1 cup, then remove 8 teaspoons; much better than the 2 ⅓ cup measures (or 1 ⅔ if available), a 1½ tablespoon and a ½ teaspoon. And maybe this will help someone else twenty years down the line.
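The arithmetic behind “remove 8 teaspoons” is short: a US cup is 16 tablespoons, or 48 teaspoons, so ⅚ of a cup is 40 teaspoons, 8 short of a full cup. A minimal sketch in C that does the same sum for any fraction; the function name is made up for this example.

```c
enum { TSP_PER_CUP = 48 };          /* 1 US cup = 16 tablespoons = 48 teaspoons */

/* Teaspoons to scoop out of a full cup to leave num/den of a cup.
   Returns -1 if the fraction is invalid or does not come out to a whole
   number of teaspoons. */
int teaspoons_to_remove(int num, int den)
{
  if ((den <= 0) || (num < 0) || (num > den))
    return -1;
  if ((TSP_PER_CUP * num) % den != 0)
    return -1;
  return TSP_PER_CUP - (TSP_PER_CUP * num) / den;
}
```

teaspoons_to_remove(5,6) comes out to 8.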

Monday, May 13, 2024

Remembrance of enlightened palms past

The image at the bottom of this page reminds me of the time I used to photograph enlightened palms, but it never occurred to me that one could enlight trees with fireflies (we don't get fireflies down here in Lower Sheol, which may be the reason why). The pictures I took with the Christmas lights used an exposure of a few seconds; I wonder how long an exposure was used for the firefly photo.


Tesla, Edison, and who actually fought the War of Currents?

I used to think Thomas Edison was a self-aggrandizing business man who took the credit for the inventions his employees made, and Nikola Tesla was the real deal—a genius inventor who was actually responsible for most of our technology based on electricity. But now? Having watched the 4½ hour long video “Most Everything You Know About Nikola Tesla and Thomas Edison is Probably Wrong” (and yes, it's four and a half hours long!) I'm not so sure my assessment is correct. The long video goes deep into the history of Tesla, Edison, and the War of the Currents where it wasn't Tesla vs. Edison, but Westinghouse (the company) vs. Edison (the company).

Tesla might have been a genius, but not all his theories about physics and electronics were correct and later in life he went a bit … crazy … to say the least (he fell in love with a pigeon and said he created incredible inventions without having actually … you know … built the incredible inventions). And Edison might have been a self-aggrandizing business man, but he credited his team and oftentimes, his team didn't invent the technology, but improved upon existing designs (to the point where he learned 6,000 ways not to build a lightbulb).

And the whole thing about Edison electrocuting an elephant (or at least animals) to show how dangerous alternating current was? Eh … not exactly. And he did not invent the electric chair.

Yes, it's a long video, but if you are interested at all in Tesla and/or Edison, it's worth the time to watch. It got me to rethink how I think about Tesla and Edison.

Wednesday, May 15, 2024

Extreme Monopoly Board Game Knockoff, Boca Raton edition

About two weeks ago I was at a local Walgreens in Boca Raton when I came across something unusual. I meant to blog about it then, but alas, I just now got a round tuit.

Anyway, what I found:

[A picture of a game clearly based on Monopoly] Everglades University?  Who ever heard of Everglades University?  And in Boca Raton?  I've been in Boca Raton for over 30 years and this is the first I've heard of it! [Back of the box showing the board game and pieces] A pretzel?  Really?  A pretzel?  The original Monopoly pieces are better related to Boca Raton than a pretzel!

I'm amazed this even exists! I wonder whose idea this even was? The Boca Raton Chamber of Commerce?

Anyway, it's clearly a knockoff of Monopoly, as you won't find it for sale at Hasbro. It's actually made by Late for the Sky, which seems to make games based off Monopoly, or should I say, The Landlord's Game which is completely in the public domain (wink wink nudge nudge say no more say no more), unlike Monopoly. But Boca Raton Opoly sure looks like Monopoly, walks like Monopoly, and probably quacks like Monopoly, so I wonder how they get away with this?

Perhaps by flying under the radar of Hasbro?

Update later this day

Apparently, Hasbro doesn't care:

Leaders at Late for the Sky say Monopoly gameplay is not copyrighted, meaning any version of the game can be created as long as the board, pieces and names within the game are different from the original version.

Via my friend Jeff Cuscutis on Linked­Pin­My­Face­Tik­Insta­Me­Trest­We­Gram­Book­In­Tok­Space, Business making Monopoly games based on Carolina towns

Monday, May 27, 2024

How does TLS use less CPU than plain TCP?

I have two services written in Lua: a gopher server and a Gemini server. They both serve roughly the same data (mainly my blog) and yet, the gopher server accumulates more CPU time than the Gemini server, despite the fact that the Gemini server uses TLS and serves more requests. And not by a little bit either:

CPU utilization
gopher 17:26
Gemini 0:45

So I started investigating the issue. It wasn't TCP_NODELAY (via Lobsters) as latency wasn't the issue (but I disabled Nagle's algorithm anyway).

Looking further into the issue, it seemed to be one of buffering. The code was not buffering any data with TCP; furthermore, the code was issuing tons of small writes. My thinking here was—Of course! The TCP code was making tons of system calls, whereas the TLS code (thanks to the library I'm using) must be doing buffering for me.
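The idea behind buffering is simple enough to show in a few lines. This is a minimal sketch in C, not the Lua code my servers actually use: small writes are copied into a fixed buffer and only handed to the expensive operation (a write() system call in real life) when the buffer fills or is flushed. Every name here is made up for the example, and count_sink merely stands in for the system call so the sketch is self-contained.

```c
#include <stddef.h>
#include <string.h>

#define WBUF_SIZE 4096

/* Where flushed data goes; stands in for write(2) on a socket. */
typedef void (*sink_fn)(void *ctx, const void *data, size_t len);

struct wbuf
{
  unsigned char data[WBUF_SIZE];
  size_t        used;               /* bytes waiting in data[] */
  unsigned long flushes;            /* how many times the sink was called */
  sink_fn       sink;
  void         *ctx;
};

void wbuf_init(struct wbuf *b, sink_fn sink, void *ctx)
{
  b->used    = 0;
  b->flushes = 0;
  b->sink    = sink;
  b->ctx     = ctx;
}

/* Hand everything buffered so far to the sink in one call. */
void wbuf_flush(struct wbuf *b)
{
  if (b->used > 0)
  {
    b->sink(b->ctx, b->data, b->used);
    b->used = 0;
    b->flushes++;
  }
}

/* Queue len bytes, flushing only when the buffer becomes full. */
void wbuf_write(struct wbuf *b, const void *src, size_t len)
{
  const unsigned char *p = src;

  while (len > 0)
  {
    size_t room = WBUF_SIZE - b->used;
    size_t n    = (len < room) ? len : room;

    memcpy(b->data + b->used, p, n);
    b->used += n;
    p       += n;
    len     -= n;
    if (b->used == WBUF_SIZE)
      wbuf_flush(b);
  }
}

/* Example sink: just totals the bytes it is given. */
void count_sink(void *ctx, const void *data, size_t len)
{
  (void)data;
  *(size_t *)ctx += len;
}
```

Ten thousand one-byte writes through this turn into three calls to the sink instead of ten thousand.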

So I added buffering to the gopher server, and now, after about 12 hours (where I restarted both servers) I have:

new CPU utilization
gopher 2:25
Gemini 2:13

I … I don't know what to make of this. Obviously, things have improved for gopher, but did I somehow make Gemini worse? (I did change some low level code that both TCP and TLS use; I use full buffering for TCP, no buffering for TLS). Is the load more evenly spread?

It's clear that gopher is still accumulating more CPU time, just not as bad as it was. Perhaps more buffering is required? I'll leave this for a few days and see what happens.

Thursday, May 30, 2024

Profile results are as expected as the Spanish Inquisition

I'm not upset at rewriting the code that handles the network buffering as it needed work, but I'm still seeing a disproportionate amount of CPU time accumulate on the supposedly simpler protocol gopher. The most popular requests of both my gopher server and Gemini server are entries from my blog, so I take a look at the code that handles such requests for both servers. Yes, the gopher server has a bit more code dealing with links than the Gemini server (because gopher URLs are almost, but not entirely, like http URLs—and the small differences are annoying), but I'm not seeing anything that stands out. Yes, the code is not quite twice as much, but the CPU utilization is more than three times as much (as of writing this).

I have no other choice at this point and I constantly relearn this lesson over and over again: if I'm looking into a performance issue, profile the code in question! Profile profile profile!

The code is in Lua and as it happens, I've profiled Lua code before. First, I want to answer this question: how much code does it take to serve a request? And I figure measuring the lines of code run is a good answer to that. I can get a baseline from that. And the code to answer that is a very easy four line change to each server:

local function main(iostream)
  local count = 0
  debug.sethook(function() count = count + 1 end,'line')

  -- The rest of the main code

  debug.sethook()
  syslog('notice',"count=%d",count)
end

I fire up the servers locally, make a decently sized request to each, and I get my results:

Lines of code to serve a request
gopher 457035
gemini 22661

WHAT THE LITERAL XXXX

[Well, there's your problem! —Editor] [Just … gaaaaaaah! —Sean]

I'm constantly surprised at the results of profiling—it's almost never what I think the problem is. And here, it's clear that I messed up pretty bad somewhere in the gopher code.

Now off to more profiling to see where it all goes pear shaped.


Unicode. Why did it have to be Unicode?

Well, I have my answer. I first found a smaller request that exhibits the behavior, so as not to generate half a million lines of output:

Lines of code executed for a small request
gopher 2549
gemini 564

Good. Two and a half thousand lines of code is tractable. Now, just to show how easy it is to profile Lua code, here's the code I'm using for my gopher server:

local profile = {}

local function doprofile()
  local info = debug.getinfo(2)
  local name = info.name or "..."
  local file = info.source or "@"
  local key  = string.format("%s$%s(%d)",file,name,info.currentline)
  if not profile[key] then
    profile[key] = 1
  else
    profile[key] = profile[key] + 1
  end
end

For each line of code executed, we get the filename, the function name and the line of code that's executing, turn that into a key, and use that to count the number of times that line of code is executed. Easy. And then some code to dump the results:

local function cleanup()
  local results = {}
  for name,value in pairs(profile) do
    results[#results + 1] = { file = name , count = value }
  end
  
  table.sort(results,function(a,b)
    if a.count > b.count then
      return true
    elseif a.count < b.count then
      return false
    else
      return a.file < b.file
    end
  end)
  
  local f = io.open("/tmp/dump.txt","w")
  for i = 1 , #results do
    f:write(string.format("%6d %s\n",results[i].count,results[i].file))
  end
  f:close()  
end

We sort the results based on line count, then alphabetically by key. And like before:

local function main(iostream)
  debug.sethook(doprofile,'line')

  -- The rest of the main code

  debug.sethook()
  cleanup()
end

I make the request and get some results:

   215 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(202)
   211 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(203)
   211 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(204)
   211 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(268)
   210 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(282)
   210 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(283)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(219)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(224)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(239)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(240)
    42 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(205)
    42 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(206)
    42 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(207)
    17 @port70.lua$...(272)
    17 @port70.lua$...(273)
     9 @/usr/local/share/lua/5.4/org/conman/net/ios.lua$write(547)
...

Oh.

Yeah.

That.

Obvious in hindsight.

I completely forgot about that.

Okay. The function in question, wrapt(), wraps text and it's a rather heavy function due to Unicode (and I'm not even following the full specification there). This is the major difference between the gopher and Gemini servers—I don't wrap text for Gemini (the clients handle that). I guess I'll have to drop down to C if I want to speed this up.

Sigh.

Saturday, June 01, 2024

Falettinme Be Mice Elf Ag

Bunny and I were in my car when this song from Sly and the Family Stone came on. On the screen was the title, “Thank You (Falettinme Be Mice Elf Ag)”. Bunny attempted to read it and was having a hard time with it; I suggested that perhaps XM Radio had used some form of audio transcription that went wrong somewhere, because Sly and the Family Stone were clearly singing “thank you for letting me be myself.”

I was about to make a satirical post about it, but when I was looking up the song … well … that's the actual name! Okay, it's actually “Thank You (Falettinme Be Mice Elf Agin)” (the screen wasn't wide enough for the entire title). But still … that's the actual name of the song! Sheesh!

Monday, June 03, 2024

Just a simple matter of replacing a slow Lua function with a faster C function

I spent the past few days rewriting some Lua code into C. While I find LPEG to be convenient, it is not necessarily fast. Normally this isn't an issue but in this case, I was calling LPEG for each character in a blog post.

Fortunately, it was fairly straightforward porting the code to C. The code goes through text a character codepoint at a time. If it's a whitespace character or a hyphen, I mark the current position as a possible breakpoint for the text; otherwise I ignore combining characters (they don't count towards the line length). Then, when I reach past the number of characters I want for a line, I copy out the string from the beginning of the “line” to the marked breakpoint (and if there isn't a breakpoint, there is no good place to break the line, so I break it at the line length; there's not much else to do), then mark the beginning of the next line and continue until the end of the text.
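For illustration only, here is a rough Lua sketch of the loop described above. It is not the C code (which isn't shown here), and everything in it is an assumption: the name wrap(), the three tiny predicates (which cover only a sliver of Unicode), the choice to drop the space at a break and keep the hyphen, and the use of Lua 5.3+'s utf8 library, which raises an error on invalid UTF-8.

```lua
-- Hypothetical sketch of the wrapping loop; NOT the real wrapt().
-- The predicates below cover only a tiny, illustrative slice of Unicode.
local function is_space(cp)     return cp == 0x20 or (cp >= 0x09 and cp <= 0x0D) end
local function is_hyphen(cp)    return cp == 0x2D end
local function is_combining(cp) return cp >= 0x0300 and cp <= 0x036F end

local function wrap(s, width)
  -- Decode once into parallel arrays: byte position and codepoint.
  local pos, cps = {}, {}
  for p, cp in utf8.codes(s) do
    pos[#pos + 1] = p
    cps[#cps + 1] = cp
  end
  local n = #cps
  pos[n + 1] = #s + 1            -- sentinel: one past the last byte

  local lines = {}
  local front = 1                -- index of the first character of the line
  while true do
    -- Skip spaces left over at the start of a line.
    while front <= n and is_space(cps[front]) do front = front + 1 end
    if front > n then break end

    local count, breakat = 0, nil
    local i = front
    while i <= n do
      local cp = cps[i]
      -- Combining characters don't count towards the line length.
      if not is_combining(cp) then count = count + 1 end
      if is_space(cp) then
        breakat = i              -- the space itself gets dropped at a break
      elseif is_hyphen(cp) and count <= width then
        breakat = i              -- the hyphen stays at the end of the line
      end
      if count > width then break end
      i = i + 1
    end

    if i > n then                -- the rest of the text fits on one line
      lines[#lines + 1] = s:sub(pos[front])
      break
    end

    local last, nextfront        -- last character kept, start of next line
    if breakat == nil then       -- no good place to break: cut at the width
      last, nextfront = i - 1, i
    elseif is_space(cps[breakat]) then
      last, nextfront = breakat - 1, breakat + 1
    else
      last, nextfront = breakat, breakat + 1
    end
    lines[#lines + 1] = s:sub(pos[front], pos[last + 1] - 1)
    front = nextfront
  end
  return lines
end

for _, line in ipairs(wrap("the quick brown fox", 10)) do
  print(line)
end
```

Note that the sketch restarts its count at the beginning of every line, which keeps it short at the cost of rescanning a few characters after each break.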

The hardest part was figuring out how to classify each character I needed. In the end, I pull out each Unicode codepoint from UTF-8 and look through an array to classify the codepoint as whitespace, a hyphen or a combining character; if it isn't in the table, it's just a normal character.
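One way such a lookup could work, sketched in Lua rather than C: a sorted array of codepoint ranges, searched with a binary search. The table name RANGES, the function classify(), the handful of ranges listed, and the choice of binary search are all assumptions for the sake of the example; the real table is surely larger and may be searched differently.

```lua
-- Hypothetical classification table, sorted by starting codepoint.
-- The entries are a small illustrative sample, not complete Unicode data.
local RANGES = {
  { 0x0009, 0x000D, 'space'     }, -- tab, line feed, ..., carriage return
  { 0x0020, 0x0020, 'space'     }, -- space
  { 0x002D, 0x002D, 'hyphen'    }, -- hyphen-minus
  { 0x00AD, 0x00AD, 'hyphen'    }, -- soft hyphen
  { 0x0300, 0x036F, 'combining' }, -- Combining Diacritical Marks block
  { 0x2010, 0x2010, 'hyphen'    }, -- hyphen
  { 0x20D0, 0x20FF, 'combining' }, -- Combining Diacritical Marks for Symbols block
}

-- Binary search over the ranges; anything not listed is a normal character.
local function classify(cp)
  local lo, hi = 1, #RANGES
  while lo <= hi do
    local mid = (lo + hi) // 2
    local r   = RANGES[mid]
    if cp < r[1] then
      hi = mid - 1
    elseif cp > r[2] then
      lo = mid + 1
    else
      return r[3]
    end
  end
  return 'normal'
end

for _, cp in utf8.codes("a-b c") do
  print(string.format("U+%04X %s", cp, classify(cp)))
end
```

Keeping the ranges sorted means a miss (the common case for ordinary text) costs only a few comparisons.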

As a sanity check, I reran the original profiling test:

Lines of Lua code executed to serve a request
gopher (original) 457035
gopher (new) 18246
gemini (just because) 22661

Much better. And most of the 457,035 lines of code being executed are now hidden behind C. Now to make sure the code is actually faster, I profiled the new wrapt() function:

local wraptx = wrapt
local function wrapt(...)
  local start = rdtsc()
  local res   = wraptx(...)
  local stop  = rdtsc()
  syslog('notice',"wrapt()=%d",stop-start)
  return res
end

with the decently sized request I used before (each line is a call to wrapt()):

Runtime (lower is better)
#Lua code C code
43330 11810
43440 12000
45300 12220
48100 12020
48680 13690
49260 12650
54140 12270
54650 12460
58530 12130
59760 14180
61100 15480
65440 14970
67920 15810
68750 15310
69920 17170
69960 17780
70740 16510
75640 16750
78870 19170
83200 18190
87090 17290
89070 23360
91440 19560
101800 21520
102460 21060
103790 22180
106000 22400
106010 21870
112960 21160
115300 21870
115980 23130
118690 24980
122550 23960
122710 24550
127610 23830
129580 24670
130120 24930
140580 26570
141930 25210
157640 27050
168000 32250

Excellent! The new code is three to five times faster. Now to just sit back and see how the new code fares over the next few days.

Wednesday, June 05, 2024

I wonder who isn't getting the point

Yes, there are a few groups I follow on Discord. One of the groups has a link to another site I want to view and I click on it. I get:

Leaving Discord

This link is taking you to the following website

https://github.com/XXXXX­XXXXX­XXXXX­XXXXX­XXXXX­XXXXX­X

Trust github.com links from now on

God, I just love XXXXX­XX walled gardens! I just love how it's there to protect me. You mean … a link … goes to a non-Discord site? Oh my God! Thank you for warning me! My computer might have been pwned by Microsoft!

Sheesh.

Thursday, June 06, 2024

Stats on some optimized small internet servers

I restarted my Gopher and Gemini servers about 25 hours ago to ensure they're both running the new optimized code, so now let's check some results:

Gopher/Gemini server stats
service CPU utilization requests bytes
gopher 2:42 935 1186591
gemini 8:58 36394 249020812

That's more like it. The gopher server, running over plain TCP and getting about 1/40th the requests, is finally accumulating less CPU time than Gemini (and the only reason it's not even less is my gopher server has to deal with wrapping Unicode text).

I also find it amusing that Gemini, a protocol that has only been around for five years, is way more popular than gopher, a protocol that's been around for thirty-three years. I guess Gemini being the new shiny is the reason.

Wednesday, June 12, 2024

Just when I thought it was safe to run a gopher server

It appeared I had traded the problem of high CPU usage for a memory leak. Last night I saw:

USER       PID %CPU %MEM     VSZ     RSS TTY   STAT START TIME COMMAND
gopher   31875  0.2 13.3 2174076 2171960 ttyp4 S+   Jun09 7:08 lua port70.lua /home/spc/gopher/config.lua

(For the record—the VSZ and RSS values are the number of 4,096 byte blocks of memory for various reasons beyond the scope of this post) and just before lunch today:

USER       PID %CPU %MEM     VSZ     RSS TTY   STAT START TIME COMMAND
gopher    1572  0.2 11.1 1809672 1807644 ttyp4 S+   04:30 1:07 lua port70.lua /home/spc/gopher/config.lua

Not good. Especially when it's normally like:

USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
gopher   20711  0.2  0.0 10588 9000 ttyp4 S+   17:52 0:06 lua port70.lua /home/spc/gopher/config.lua

And then there was this in the logs:

Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: coroutine thread: 0x8400404 dead: not enough memory
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404: stack traceback:
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  [C]: in function 'org.conman.string.wrapt'
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  ...e/lua/5.4/org/conman/app/port70/handlers/blog/format.lua:34: in upvalue 'wrap_text'
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  ...e/lua/5.4/org/conman/app/port70/handlers/blog/format.lua:119: in upvalue 'run_flow'
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  ...e/lua/5.4/org/conman/app/port70/handlers/blog/format.lua:598: in function 'org.conman.app.port70.handlers.blog.format'
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  ...al/share/lua/5.4/org/conman/app/port70/handlers/blog.lua:227: in function <...al/share/lua/5.4/org/conman/app/port70/handlers/blog.lua:210>
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  (...tail calls...)
Jun 12 09:47:54	daemon	err	71.19.142.20	gopher	CRASH: thread: 0x8400404:  port70.lua:238: in function <port70.lua:205>

Yeah, “not enough memory” while running the new C-based text wrapping function. I added a few checks to see if wrapt() wasn't updating the character indices correctly (and log if they weren't) then went for lunch. Afterwards, I checked and lo, memory was still growing, and none of the checks I added had triggered.

Hmmm. Time to take down the gopher server and debug some code.

Sure enough, I found the one entry that went off the rails. What followed was a few hours of troubleshooting to find out why that one entry (and only that entry) blows up. And it came down to a difference of semantics between Lua and C (no, it had nothing to do with C's 0-based indexing and Lua's 1-based indexing). Basically, the Lua code looked like:

local breakhere -- this is nil

...

if ctype == 'space' then
  breakhere = i -- this is now an integer
  ...
end

if breakhere then
  table.insert(result,s:sub(front,breakhere - 1))
  ...
end

breakhere = nil

In Lua, only the values nil and false are considered “false”—a 0 value is “true” in Lua. In C, any 0 value is considered “false.” I knew this when translating the code, but it never occurred to me that a break point of a line could be at the start of a line.

Until it was at the start of a line in that one entry—the code went into an infinite loop trying to wrap text, thus causing the memory usage to consume the computer.
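The difference is easy to demonstrate from the Lua side. The sketch below is only a guess at the shape of the problem, since the C code isn't shown: it assumes the port used 0 to stand in for Lua's nil, and the names NONE, lua_has_break() and c_like_has_break() are made up for the example.

```lua
-- Hypothetical illustration of the nil-versus-0 mismatch.
local NONE = 0  -- assumption: what a C port might use in place of nil

-- The Lua test: only nil and false are false, so 0 counts as a break.
local function lua_has_break(breakhere)
  if breakhere then
    return true
  end
  return false
end

-- What C's `if (breakhere)` amounts to when 0 means "no break".
local function c_like_has_break(breakhere)
  return breakhere ~= NONE
end

print(lua_has_break(nil))    -- false: no breakpoint recorded
print(lua_has_break(0))      -- true:  0 is a perfectly good value in Lua
print(c_like_has_break(0))   -- false: a real break at offset 0 is missed
print(c_like_has_break(5))   -- true
```

One common way around this in C is to pick a sentinel that can never be a valid position, or to keep a separate flag, so that a break at offset 0 stays distinguishable from no break at all.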

Sigh.

It was easy enough to fix once the problem was diagnosed. And maybe now things will get back to normal.

Monday, June 17, 2024

When a file doesn't exist, yet it's right there

For reasons, I'm downloading a file from several hundred different domains. The results start trickling in and I decide to take a look at one of them:

[spc]lucy:/tmp/cache>ls 04d.co
04d.co
[spc]lucy:/tmp/cache>more 04d.co
Error: File or directory not found!
[spc]lucy:/tmp/cache>

Um … what?

Okay. There could be an invisible character or two in the filename.

[spc]lucy:/tmp/cache>echo * | hex | more
00000000: 30 34 64 2E 63 6F 20 31 34 33 36 2E 6E 69 6E 6A 04d.co 1436.ninj
...
[spc]lucy:/tmp/cache>

Nope. But what could cause the shell not to find the file when the file is right there! The name shouldn't matter. But I can't view it.

[spc]lucy:/tmp/cache>cat 04d.co
Error: File or directory not found!
[spc]lucy:/tmp/cache>

Okay, what if I try to move it somewhere?

[spc]lucy:~/tmp>cp /tmp/cache/04d.co .
[spc]lucy:~/tmp>more 04d.co
Error: File or directory not found!
[spc]lucy:~/tmp>

I can list the file:

[spc]lucy:~/tmp>ls -l 04d.co
-rw-r--r--  1 spc spc 37 Jun 17 22:25 04d.co
[spc]lucy:~/tmp>

Let me try one last thing …

[spc]lucy:~/tmp>hex 04d.co
00000000: 45 72 72 6F 72 3A 20 46 69 6C 65 20 6F 72 20 64 Error: File or d
00000010: 69 72 65 63 74 6F 72 79 20 6E 6F 74 20 66 6F 75 irectory not fou
00000020: 6E 64 21 0D 0A                                  nd!..           

I was downloading these files from the Internet. This particular site didn't have the file. The contents of the file is the error message.

Head. Meet desk.

This reminds me of college. Back then, the command finger was popular, and when using it, the program would attempt to locate a file called .plan in the home directory and display it. I made a joke .plan file that read:

Bus error - core dumped

You have new mail.

I can't count the number of times someone tried to finger me, only to attempt to remove a non-existent core file and impulsively check their email.

Sunday, June 23, 2024

How does TLS use less CPU than plain TCP, part II

I noticed over the past couple of days that the CPU utilization was similar between the two services, but looking at the logs it seemed my gopher server got hit with several bots. So last night just as the logs reset (a weekly job) I decided to restart the two services so as to match CPU utilization and requests. Twelve hours later and …

CPU utilization after 12 hours
CPU requests
gopher 0:28 175
gemini 0:26 1744

A week and a half after calling “Mission Complete” and apparently not. Even worse, my gopher server is using a bit more CPU than the Gemini server while getting one-tenth the number of requests. Something is off here.

I checked the requests and both receive over 90% requests for my blog, so I'm confident that it's something to do with that. Both use the same request framework, both use the same code to read blog entries, both use the same code to parse the HTML, so it's not that. The only difference left is the formatting of each entry. The gopher server does wrap the text to 80 columns (not needed for the Gemini server), and there's quite a bit more code to deal with fixing the links … are those the causes? It's hard to determine what's causing the excessive CPU utilization.

I'm going to have to think about how to determine the issue.

Heck, I'm going to have to think if this is even worth determining. I mean, the CPU utilization is better, but it's still puzzling.


The case of the weird profiling results

Okay, I'm flummoxed.

After a bit of thought, I had an idea on how to determine where the CPU time is going on my gopher server. I have at least two working hypotheses that I can test: wrapping text and translating links. So, to test these, I will get a list of requests since the logs rotated earlier today, and then run them through the gopher server (a second one set up on my public server just for these tests) to get a baseline; then remove the link translations and run the requests again; then remove wrapping the text and run the requests a third time and see what I get.

For the first test, I expect the test gopher server to have the same CPU utilization as the one that normally runs. It makes sense, right? The normal one has x CPU utilization with y requests, so if I feed the test gopher server the same requests (even though all the requests come from the same location) it should have the same CPU utilization. Or at least close enough to show that I'm on the right track.

When I pulled the requests out of the log files and checked the current gopher server, it had received 332 requests and racked up 0:47 in CPU time. I set up the test gopher server (the only changes to the configuration were a different log file and a different port, which meant I didn't have to run as root). I made the 332 requests to my test gopher server and got a CPU time of 0:22.

What?

I reran this test a few times and got the same results each time—0:22.

Which is less than half the time of the normal gopher server with the same number of requests. If it was one or two seconds off, hey, close enough. But half?

It's not making sense.

But I decided to continue and run the other tests. First, I removed the code that does the link translation, reran the requests and got a CPU time of 0:20.

Okay, that tells me two things: one, the link translations do take time, but I don't think it's enough to explain the CPU utilization of the gopher server. Maybe. On a whim, I decided to change the third test to not even bother with processing blog entries; I modified the blog entry handler to just return. Given that ¾ of the requests are to the blog, this should make it run much faster and use less CPU time.

I got a CPU time of 0:18.

I don't know what to think. I would have expected this to be 0:05 or 0:06, given that 75% of the requests would not be generated. Something weird is going on.

It was time to take a step back. I went back to the original benchmark but instead of counting lines executed, I decided to count Lua VM instructions for the decently sized request.

Lua VM cycles to serve a request
gopher 61622
gemini 67401
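For anyone wanting to try the same kind of measurement, here is a minimal sketch using the standard debug.sethook() count hook. The helper name count_instructions() is made up for the example, and this is not necessarily how the numbers above were gathered. A hook only applies to the coroutine it is set in, so work done inside other coroutines would need its own hook.

```lua
-- Hypothetical helper: count the Lua VM instructions executed by fn(...).
-- Only instructions run in the calling coroutine are counted.
local function count_instructions(fn, ...)
  local count = 0
  -- An empty mask with a count of 1 fires the hook after every instruction.
  debug.sethook(function() count = count + 1 end, "", 1)
  local ok, err = pcall(fn, ...)
  debug.sethook()                 -- always remove the hook again
  if not ok then
    error(err, 0)                 -- re-raise the original error
  end
  return count
end

local n = count_instructions(function()
  local total = 0
  for i = 1, 1000 do
    total = total + i
  end
  return total
end)
print("VM instructions:", n)
```

The exact figure depends on the Lua version and on the bookkeeping around the call, so it is mostly useful for comparing two runs rather than as an absolute number.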

Okay, the gopher server is clearly executing fewer Lua VM instructions than the Gemini server. Could I gain any insight from profiling at the C level? I had already done most of the work to profile both the gopher and Gemini servers. To make sure I got enough data, I ran the decently sized request three times for each server.

port70 (gopher server)—Each sample counts as 0.01 seconds.
% time cumulative seconds self seconds calls self ms/call total ms/call name
13.79 0.04 0.04 185781 0.00 0.00 luaV_execute
10.34 0.07 0.03 734588 0.00 0.00 index2value
6.90 0.09 0.02 398225 0.00 0.00 luaS_new
6.90 0.11 0.02 45028 0.00 0.00 luaH_newkey
3.45 0.12 0.01 1041939 0.00 0.00 yymatchChar
3.45 0.13 0.01 924394 0.00 0.00 luaG_traceexec
3.45 0.14 0.01 503889 0.00 0.00 yyText
3.45 0.15 0.01 260252 0.00 0.00 luaD_precall
3.45 0.16 0.01 245893 0.00 0.00 mainpositionTV
3.45 0.17 0.01 201753 0.00 0.00 auxgetstr
3.45 0.18 0.01 191931 0.00 0.00 yy_S
3.45 0.19 0.01 185373 0.00 0.00 equalkey
3.45 0.20 0.01 134088 0.00 0.00 yyDo
3.45 0.21 0.01 129432 0.00 0.00 yy_CHAR
3.45 0.22 0.01 101937 0.00 0.00 reverse
3.45 0.23 0.01 34759 0.00 0.00 luaH_getn
3.45 0.24 0.01 4473 0.00 0.00 getfirst
3.45 0.25 0.01 1625 0.01 0.01 traverseproto
3.45 0.26 0.01 834 0.01 0.01 strcore_wrapt
3.45 0.27 0.01 61 0.16 0.16 checkloops
3.45 0.28 0.01 2 5.00 5.00 deletelist
3.45 0.29 0.01       cclasscmp
0.00 0.29 0.00 924377 0.00 0.00 luaD_hook
0.00 0.29 0.00 391575 0.00 0.00 yymatchString
0.00 0.29 0.00 358374 0.00 0.00 luaH_getshortstr
0.00 0.29 0.00 261889 0.00 0.00 prepCallInfo
0.00 0.29 0.00 261296 0.00 0.00 luaD_poscall
GLV-1.12556 (Gemini server)—Each sample counts as 0.01 seconds.
% time cumulative seconds self seconds calls self ms/call total ms/call name
9.38 0.03 0.03 1011065 0.00 0.00 luaG_traceexec
9.38 0.06 0.03 1011056 0.00 0.00 luaD_hook
7.81 0.09 0.03 204707 0.00 0.00 luaV_execute
6.25 0.10 0.02 443861 0.00 0.00 luaS_new
6.25 0.12 0.02 396470 0.00 0.00 luaH_getshortstr
3.12 0.14 0.01 668980 0.00 0.00 index2value
3.12 0.14 0.01 391575 0.00 0.00 yymatchString
3.12 0.15 0.01 271008 0.00 0.00 mainpositionTV
3.12 0.17 0.01 243043 0.00 0.00 luaD_precall
3.12 0.17 0.01 242834 0.00 0.00 moveresults
3.12 0.18 0.01 217137 0.00 0.00 ccall
3.12 0.20 0.01 202203 0.00 0.00 hookf
3.12 0.20 0.01 129432 0.00 0.00 yy_CHAR
3.12 0.21 0.01 56698 0.00 0.00 llex
3.12 0.23 0.01 39543 0.00 0.00 internshrstr
3.12 0.23 0.01 30301 0.00 0.00 luaM_malloc_
3.12 0.24 0.01 23821 0.00 0.00 luaH_realasize
3.12 0.26 0.01 5906 0.00 0.00 luaV_concat
3.12 0.27 0.01 4149 0.00 0.00 GCTM
3.12 0.28 0.01 1872 0.01 0.02 yy_PCDATA
3.12 0.28 0.01 928 0.01 0.01 match
3.12 0.29 0.01 787 0.01 0.01 luaF_newLclosure
3.12 0.30 0.01 595 0.02 0.02 luaK_int
3.12 0.32 0.01 59 0.17 0.17 l_strcmp
1.56 0.32 0.01 12 0.42 0.42 luaV_finishOp
0.00 0.32 0.00 1041939 0.00 0.00 yymatchChar

This is not easy to interpret. As expected, the Lua VM shows up in the top spots for both, but nothing really stands out. It is nice to see that yymatchChar and yymatchString (both in the HTML parsing module) are called the same number of times (expected) but the times are different. The Lua function index2value is called a different number of times, but that might be due to code differences. I think the higher percentage of time in the gopher server might be due to taking less time overall? For instance, in the gopher server, deletelist has a self ms/call of 5, but on the Gemini server it has a self ms/call of 0.52. Is the garbage collector being called more often in the gopher server? Or does it show up with a higher time simply because there's less time overall? It's only called twice in both codebases. I'm not sure how to interpret this.

Somehow, the gopher server is doing less work than the Gemini server, yet accumulating more CPU time than the Gemini server, despite getting about 10% of the requests the Gemini server gets. I don't understand how that can be.

I think I'll leave things alone for now. I went several years with a not-optimized gopher server, so I think I can live with this new version for now.

Thursday, June 27, 2024

IT HAS BEGUN! (“It” being Christmas themed advertising)

Last night at the Cracker Barrel, Bunny and I saw store displays nominally for Halloween. But nestled in the middle of the display was an evergreen tree with ornaments hanging off it. Try as you might, Cracker Barrel, there is no such thing as a Halloween Tree (no disrespect, Mr. Bradbury); it's still a Christmas Tree.

And then today I saw what is, to me, the first official advertisement for Christmas: a trailer for a movie called “Red One,” a muddy looking CGI slugfest where Santa Claus has been kidnapped and needs to be saved.

The movie isn't even slated to be released until November! Couldn't they have at least waited until October to start advertising it? No?

Sigh.

Sunday, June 30, 2024

Four lines of code … it was four lines of code

It was bugging me.

I checked last night, and found the following:

CPU utilization after a week
  CPU requests
gopher 28:37 8360
gemini 6:02 24345

How can a TCP service take more time than a TLS service? Especially when TLS runs over TCP? And TLS generates six times the packets that TCP does?

There's some difference in how I handle TCP and TLS that's causing this. Profiling the code didn't reveal much as my attempts kept showing that gopher should be using less CPU than Gemini, so then it was just a matter of staring at code and thinking hard.

It wasn't easy. Yes, there are some differences between the TCP interface and the TLS interface, so it was a matter of going through the code, bringing them closer in implementation and seeing if that fixed things. I would fix the code, run a test, see it not make a difference, and try again. Eventually, I came across this bit of code:

-- ---------------------------------------------------------------------
-- It might be a bug *somewhere*, but on Linux, this is required to get
-- Unix domain sockets to work with the NFL driver.  There's a race
-- condition where writing data then calling close() may cause the
-- other side to receive no data.  This does NOT appear to happen with
-- TCP sockets, but this doesn't hurt the TCP side in any case.
-- ---------------------------------------------------------------------
    
while self.__socket.sendqueue and ios.__socket.sendqueue > 0 do
  nfl.SOCKETS:update(self.__socket,'w')
  coroutine.yield()
end

This bit of code wasn't in the TLS implementation, and as the comment says, it “shouldn't” hurt the TCP path, but hey, it was only needed for local (or Unix) domain sockets to begin with, so let me try removing that bit of code and run a test.

Test before removing the code, 275 requests; time in seconds
  CPU real user sys
gopher 18 46.726 13.971 4.059
gemini 4 107.928 3.951 0.322
Test after removing the code, 275 requests; time in seconds
  CPU real user sys
gopher 0 52.403 0.635 0.185
gemini 4 103.290 3.957 0.285

THAT'S more like it!

Now, about that code … this is my best guess as to what was going on.

The sendqueue field is a Linux-only field, so I'm checking to see if it exists, and then checking the value (if only to avoid the “attempt to compare number with nil” Lua error on non-Linux systems). This returns the number of bytes the kernel has yet to send, so we switch the socket to trigger an event when we can write to it, and yield. Even worse, the check isn't just a simple variable fetch, but ends up being an ioctl() system call. Twice!

Now the “write” event code:

if event.write then
  if #ios.__output > 0 then
    local bytes,err = ios.__socket:send(nil,ios.__output)
    if err == 0 then
      ios.__wbytes = ios.__wbytes + bytes
      ios.__output = ios.__output:sub(bytes + 1,-1)
    else
      syslog('error',"socket:send() = %s",errno[err])
      nfl.SOCKETS:remove(ios.__socket)
      nfl.schedule(ios.__co,false,errno[err],err)
    end
  end
  
  if #ios.__output == 0 then
    nfl.SOCKETS:update(ios.__socket,'r')
    nfl.schedule(ios.__co,true)
  end
end

There's probably no data to be sent (ios.__output is empty) so we immediately switch the socket back to trigger for readability, then resume the coroutine handling the socket. It goes back to the while loop, sets the socket back to trigger on writability, and keeps bouncing like that until the kernel buffer has been sent.
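To get a feel for how much spinning that implies, here is a toy model in plain Lua. Nothing in it comes from the actual socket code: simulate(), the fake socket table, and the passes_per_drain figure (how many trips around the event loop happen before the kernel manages to send anything) are all invented for the illustration.

```lua
-- Toy model of the bounce; every name and number here is hypothetical.
-- queued:           units of data still sitting in the kernel's send queue
-- passes_per_drain: event-loop passes that go by before one unit is sent
local function simulate(queued, passes_per_drain)
  local socket  = { sendqueue = queued }
  local wakeups = 0

  -- Stand-in for the close path: wait until the kernel queue is empty.
  local co = coroutine.create(function()
    while socket.sendqueue > 0 do
      coroutine.yield('w')        -- "wake me when the socket is writable"
    end
  end)

  local passes = 0
  while coroutine.status(co) ~= 'dead' do
    -- The socket is writable on every pass, so the waiter is resumed
    -- every time, whether or not the kernel has sent anything yet.
    assert(coroutine.resume(co))
    wakeups = wakeups + 1
    passes  = passes + 1
    if passes % passes_per_drain == 0 and socket.sendqueue > 0 then
      socket.sendqueue = socket.sendqueue - 1
    end
  end
  return wakeups
end

print("wakeups with data queued:", simulate(3, 100))
print("wakeups with nothing queued:", simulate(0, 100))
```

In the model, the number of wakeups grows with how slowly the queue drains relative to the event loop, which is the sort of wasted work that would show up as CPU time without doing anything useful.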

As I wrote, this fixed an issue I had with local domain sockets—it wasn't an issue with an IP socket, and in fact, was the wrong thing to do for an IP socket.

And as it is with these types of bugs, finding the root cause is not trivial, but the fix almost always is.



Thursday, July 04, 2024

Something something Independence something something Declaration something something Blow Stuff Up

It just now became the Forth of July™ and earlier this evening (which was technically yesterday, Wednesday, July 3rd) some people were illegally lighting off fireworks already. I guess as a random quality test of the munitions they've acquired. So to them, and to you, I wish you a happy and safe National Blow Stuff Up Day!

[Amateurs: There's a reason professionals exist.]

And for the curious, the above image was taken by me 21 years ago. Yikes!

And it was 13 years ago when my hair nearly caught on fire, so when I say “be safe,” I mean it.

Sunday, July 21, 2024

Fixing an Apache pthread error

After I fixed the performance issue, I started looking around for some other issues to handle, and boy, did I find some. Checking the error log from Apache I found:

[Fri Jul 12 15:04:01.762845 2024] [mpm_worker:alert] [pid 31979:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:02.731958 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 31979 exit signal Aborted (6)
[Fri Jul 12 15:04:02.735360 2024] [mpm_worker:alert] [pid 32021:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:03.733536 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32021 exit signal Aborted (6)
[Fri Jul 12 15:04:03.736857 2024] [mpm_worker:alert] [pid 32063:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:04.735368 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32063 exit signal Aborted (6)
[Fri Jul 12 15:04:04.738624 2024] [mpm_worker:alert] [pid 32105:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:05.737141 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32105 exit signal Aborted (6)
[Fri Jul 12 15:04:05.740622 2024] [mpm_worker:alert] [pid 32147:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:06.739077 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32147 exit signal Aborted (6)
[Fri Jul 12 15:04:06.742500 2024] [mpm_worker:alert] [pid 32189:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:07.740898 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32189 exit signal Aborted (6)
[Fri Jul 12 15:04:07.744130 2024] [mpm_worker:alert] [pid 32231:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:08.742689 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32231 exit signal Aborted (6)
[Fri Jul 12 15:04:08.745885 2024] [mpm_worker:alert] [pid 32274:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:09.744445 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32274 exit signal Aborted (6)
[Fri Jul 12 15:04:09.747607 2024] [mpm_worker:alert] [pid 32316:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work

Thousands of such lines. And the weird thing is that libgcc_s.so.1 does exist on my system. I guess I missed that library when I installed Apache from source. I'm not even curious as to why this is an issue, nor why libgcc_s.so.1 is needed. At this point I'm like “give the system what it wants, not what I want to give it” (which I can't say in polite company). Running configure --help didn't show any obvious means of enabling the use of libgcc_s.so.1, and not wanting to dive deep into a maze of twisty Makefiles all alike, I decided on the next best thing. I went to the Apache build directory, deleted the existing httpd binary and ran make. This gave me the final line used to build the executable:

/home/spc/apps/httpd-2.4.54/srclib/apr/libtool --silent --mode=link gcc -std=gnu99  -g -O2 -pthread -o httpd  modules.lo buildmark.o -export-dynamic server/libmain.la modules/core/libmod_so.la modules/http/libmod_http.la server/mpm/worker/libworker.la os/unix/libos.la -L/usr/local/lib -lpcre    /home/spc/apps/httpd-2.4.54/srclib/apr-util/libaprutil-1.la -lexpat /home/spc/apps/httpd-2.4.54/srclib/apr/libapr-1.la -luuid -lrt -lcrypt -lpthread -ldl

I then reran just this command but with /lib/libgcc_s.so.1 added before the -pthread option. A quick check afterwards:

[spc]brevard:~/apps/httpd-2.4.54>ldd .libs/httpd
	linux-gate.so.1 =>  (0xf77e1000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf77cf000)
	libpcre.so.1 => /usr/local/lib/libpcre.so.1 (0xf77b1000)
	libaprutil-1.so.0 => /usr/local/apache2/lib/libaprutil-1.so.0 (0xf778c000)
	libexpat.so.0 => /usr/lib/libexpat.so.0 (0xf776e000)
	libapr-1.so.0 => /usr/local/apache2/lib/libapr-1.so.0 (0xf7742000)
	libuuid.so.1 => /lib/libuuid.so.1 (0xf773f000)
	librt.so.1 => /lib/tls/librt.so.1 (0xf772a000)
	libcrypt.so.1 => /lib/libcrypt.so.1 (0xf76fc000)
	libpthread.so.0 => /lib/tls/libpthread.so.0 (0xf76ea000)
	libdl.so.2 => /lib/libdl.so.2 (0xf76e6000)
	libc.so.6 => /lib/tls/libc.so.6 (0xf75bb000)
	/lib/ld-linux.so.2 (0xf77e2000)

and it worked—libgcc_s.so.1 is linked in!

I installed the new Apache executable, restarted Apache and … it didn't crash! I then left it to run.

That was a week ago, and so far, so good. The error log is rotated weekly and for the past week no such errors have appeared. Now if only I could nuke from orbit the crawlers sending in silly requests like /cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/bin/sh; it just clutters up the error log with “invalid URI path” errors.

Sigh.

Saturday, July 27, 2024

Fixing more Apache errors

A week later and I can't say I cleared up all the errors with my web server:

[Sat Jul 27 09:51:45.349454 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.349617 2024] [cgid:error] [pid 7348:tid 3807226800] [client 192.200.113.155:55207] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.350209 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.350297 2024] [cgid:error] [pid 7348:tid 3807226800] [client 192.200.113.155:55207] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.352660 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.352814 2024] [cgid:error] [pid 7636:tid 3815619504] [client 192.200.113.155:49997] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.353377 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.353462 2024] [cgid:error] [pid 7636:tid 3815619504] [client 192.200.113.155:49997] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.353790 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.353943 2024] [cgid:error] [pid 7691:tid 3832404912] [client 192.200.113.155:48697] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.354685 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.354813 2024] [cgid:error] [pid 7691:tid 3832404912] [client 192.200.113.155:48697] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.360184 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.360295 2024] [cgid:error] [pid 7349:tid 3731692464] [client 192.200.113.155:44083] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.360856 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.360940 2024] [cgid:error] [pid 7349:tid 3731692464] [client 192.200.113.155:44083] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.366567 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.366719 2024] [cgid:error] [pid 7786:tid 3916331952] [client 192.200.113.155:55205] End of script output before headers: boston.cgi

There are more entries like this, but you get the idea. Apache can't run mod_blog for some reason. Checking the access log I can match these up to the following requests:

XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/16/hsr-carpet-1.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/15/flapper.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/17/scrapes.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/16/hsr-carpet-2.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/17/luxor.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/rushhour.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/area51.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/littlealeinn.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/quik-pik.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)

To me, this is obviously a crawler, despite claiming to be every possible web browser in existence. Is it Windows? MacOS? Linux? Yes. But what's interesting is that all the errors seem related to serving up images.

The way my blog works, all requests to posts are fed through mod_blog, and if said request is for an image, it just copies the file out. It works, but if the server gets slammed just a bit too hard, it breaks down. If only there was some way to get Apache to serve the images directly instead of having to go through mod_blog.

Wait! There is!

I've been using Apache for well over twenty-five years now, so it was a relatively easy issue to solve. First off, point Apache to the directory where all the data for mod_blog is stored.

Alias                 /XXXXX/ /home/spc/web/sites/boston.conman.org/journal/

<Directory /home/spc/web/sites/boston.conman.org/journal>
  Options       None
  AllowOverride None
  <LimitExcept GET HEAD>
    Require valid-user
  </LimitExcept>
</Directory>

The first directive maps the “web directory” /XXXXX/ to the actual directory on the file system. The Directory block restricts what can be viewed and how it can be viewed. All that's left is to throw all requests to images to this directory:

RewriteRule ^([0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/.*\.(gif|png|jpg|ico)) XXXXX/$1 [L]

This rewrites a request like /2015/07/04/Desk.jpg to /XXXXX/2015/07/04/Desk.jpg, which references the image directly on the file system, letting Apache serve it up directly. This rule goes before the other RewriteRules so Apache serves the image up before mod_blog sees the request.
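To see what that pattern captures, here is a minimal sketch in C using POSIX regcomp() and regexec(). It is only an illustration of the match and substitution, not how Apache does it (mod_rewrite does its own matching inside the server). The function name rewrite_image_path() is made up for this example, and /XXXXX/ is the same redacted placeholder used above.

```c
#include <regex.h>
#include <stdio.h>

/* Returns 1 and fills `out` when `path` looks like a dated image request,
   0 when it does not match, and -1 on error (bad regex or `out` too small).
   The path is given without a leading slash, matching the rule above. */
int rewrite_image_path(char const *path, char *out, size_t size)
{
  regex_t    re;
  regmatch_t match[2];
  int        rc;

  if (regcomp(&re,
              "^([0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/.*\\.(gif|png|jpg|ico))",
              REG_EXTENDED) != 0)
    return -1;

  rc = regexec(&re, path, 2, match, 0);
  regfree(&re);
  if (rc != 0)
    return 0;

  /* match[1] is the first parenthesized group, the $1 of the RewriteRule */
  int len = (int)(match[1].rm_eo - match[1].rm_so);
  int n   = snprintf(out, size, "/XXXXX/%.*s", len, path + match[1].rm_so);
  return (n >= 0 && (size_t)n < size) ? 1 : -1;
}
```

A request for 2015/07/04/Desk.jpg matches and gets the prefix, while an entry request like 2015/07/04.1 falls through to the remaining rules.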

An easy fix that should lighten the load on Apache as it serves up my blog. I'll see in a week if it all goes to plan.

Sunday, July 28, 2024

Notes on an overheard conversation while pulling into the driveway

“You know, it's not a song unless you can sing the lyrics to it.”

“There are lyrics!”

“Oh yes? Then sing them!”

“Okay, granted—I can barely hear them, but they're there!”

“Well, can you carry this into the house?”

“Sigh. Okay, I will, even though you don't seem to appreciate the finer things in life.”

“It's still not a song.”

“Philistine!”


The case of the well-known location being denied when it doesn't exist

I was checking up on the Apache error log when I noticed the following:

[Sun Jul 28 18:47:21.455848 2024] [authz_core:error] [pid 25597:tid 3916331952] [client 74.173.118.3:53916] AH01630: client denied by server configuration: /usr/local/apache2/htdocs/.well-known
[Sun Jul 28 18:47:59.176743 2024] [authz_core:error] [pid 25598:tid 3916331952] [client 74.173.118.3:53918] AH01630: client denied by server configuration: /usr/local/apache2/htdocs/.well-known/
[Sun Jul 28 18:50:33.324290 2024] [authz_core:error] [pid 25759:tid 3832404912] [client 74.173.118.3:53922] AH01630: client denied by server configuration: /usr/local/apache2/htdocs/.well-known

That's odd, I thought. I don't have that directory in any of my virtual domains, so why is it denied by the server configuration? And thus I fell into a rather odd rabbit hole of Apache configuration oddities.

I created the directory. I can see it when I go to https://boston.conman.org/.well-known/. But when I go to http://boston.conman.org/.well-known/ I would get a “403 Forbidden” error, and the above error message logged. The only difference between the two links—one is HTTPS (that works) and the other is HTTP (that fails). But if I go to http://boston.conman.org/ (HTTP—thus insecure), it would redirect to https://boston.conman.org/ (HTTPS—secure). In fact, every link to boston.conman.org via HTTP redirects, except for those starting with /.well-known/.

Huh?

It turns out, this started a year and a half ago when I enabled the Apache module mod_md and used the MDRequireHttps directive. This directive will cause plain HTTP requests to be redirected to HTTPS, because I know, I just know, that one day Google is going to take HTTP out behind the shed and then no one will be able to use plain HTTP anymore because The All Knowing Google knows what's best for us (All Praise Google, Peace Be Upon It!) so I might as well get in front of that before it happens. But there's a small bit in the documentation:

MDRequireHttps Directive

You can achieve the same with mod_alias and some Redirect configuration, basically. If you do it yourself, please make sure to exclude the paths /.well-known/* from your redirection, otherwise mod_md might have trouble signing on new certificates [emphasis added].

Okay, so that explains why http://boston.conman.org/.well-known/ (HTTP—insecure) isn't being redirected—it's a side effect from mod_md. But that doesn't explain the error where it's denied by the server configuration. A bit more digging, and I find the following in the Apache configuration:

DocumentRoot "/usr/local/apache2/htdocs"
<Directory "/usr/local/apache2/htdocs">
    #
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.4/mod/core.html#options
    # for more information.
    #
    Options Indexes FollowSymLinks

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   AllowOverride FileInfo AuthConfig Limit
    #
    AllowOverride None

    #
    # Controls who can get stuff from this server.
    #
    #Require all granted
    Require all denied
</Directory>

Aha! When I set things up, I configured the HTTP site with:

<VirtualHost 71.19.142.20:80>
	ServerName	boston.conman.org
	Protocols	h2 h2c http/1.1 acme-tls/1
</VirtualHost>

I didn't set up a directory because I know that all requests will be redirected anyway, so why bother? Only in this case, mod_md isn't doing a redirect for /.well-known/ and because there's no <Directory> directive, the requests fall back to the Apache default web directory, which is configured to be unreadable for every request.

Wow!

A quick change to make the default web directory available and no more AH01630 errors. So now I wait to see if this breaks anything. Wheeeee!

Wednesday, July 31, 2024

Extreme metal lawn sculptures, Boca Raton edition

[A well manicured lawn next to the street, with a collection of metal rods all askew forming some form of structure] It kind of, sort of, looks like a headless version of the Burning Man sculpture.  Or perhaps a loose interpretation of a Feynman Diagram.  Hard to say …

“What is that?”

“I … don't know. It looks like some form of metal sculpture.”

“I didn't think [our neighbor] was into metal sculptures.”

“Maybe he picked up a new hobby.”

“It could also be a bunch of rusted out and busted lawn chairs.”

“It's so hard to tell with modern art.”

Sunday, August 04, 2024

Fixing the final errors from my Apache setup

Shortly after my last post, I noticed more errors happening with the web server:

[Tue Jul 30 10:43:00.768556 2024] [cgid:error] [pid 1706:tid 4149532352] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Tue Jul 30 10:43:00.768714 2024] [cgid:error] [pid 1869:tid 3908467632] [client 178.128.207.138:57024] End of script output before headers: boston.cgi
[Tue Jul 30 10:43:01.137027 2024] [cgid:error] [pid 1706:tid 4149532352] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Tue Jul 30 10:43:01.137173 2024] [cgid:error] [pid 1707:tid 3774184368] [client 178.128.207.138:57034] End of script output before headers: boston.cgi

I found one potential fix for this—just set the default stack size of Apache to 512k in size (just 512k—sheesh I'm old). That was simply adding ulimit -s 512 to the startup script, but that still wasn't the full answer:

[Tue Jul 30 22:37:33.042327 2024] [mpm_worker:alert] [pid 5599:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:37:35.045085 2024] [mpm_worker:alert] [pid 5672:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:38.108993 2024] [mpm_worker:alert] [pid 6091:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:40.112150 2024] [mpm_worker:alert] [pid 6161:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:42.114553 2024] [mpm_worker:alert] [pid 6229:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:44.117182 2024] [mpm_worker:alert] [pid 6304:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread

I ended up having a little bit of back-and-forth with the settings of the VM that runs my server, allowing a bit more memory than it was configured for. And since then, no more such errors in the error log. All that's now showing up is crap like:

[Sun Aug 04 15:58:49.248654 2024] [core:error] [pid 872:tid 4129409968] [client XXXXXXXXXXXXXXX:59190] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/bin/sh)
[Sun Aug 04 15:58:51.854583 2024] [core:error] [pid 2444:tid 4120341424] [client XXXXXXXXXXXXXXX:59760] AH10244: invalid URI path (/cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/bin/sh)

and notifications about secure certificate rotations. Another interesting thing I noticed is the accumulated CPU time over the past few days:

[spc]brevard:~>ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
...
root       868  0.0  0.1 26760 24652 ?       Ss   Aug01   0:12 /usr/local/apache2/bin/httpd
apache     871  0.0  0.1 26188 20876 ?       S    Aug01   0:35 /usr/local/apache2/bin/httpd
apache     872  0.0  0.2 64532 34184 ?       Sl   Aug01   5:09 /usr/local/apache2/bin/httpd
apache     873  0.0  0.1 61060 32152 ?       Sl   Aug01   4:59 /usr/local/apache2/bin/httpd
apache     875  0.0  0.1 62516 31460 ?       Sl   Aug01   5:01 /usr/local/apache2/bin/httpd
apache    1035  0.0  0.1 61716 32148 ?       Sl   Aug01   5:03 /usr/local/apache2/bin/httpd
apache    2444  0.0  0.1 61500 32160 ?       Sl   Aug01   4:45 /usr/local/apache2/bin/httpd
...
gopher    1231  0.0  0.0 10708 9596 ttyp4    S+   Aug01   0:33 lua port70.lua /home/spc/gopher/config.lua
spc       1232  0.1  0.0 13600 12140 ttyp5   S+   Aug01   6:51 lua GLV-1.12556.lua non-checkin/conf.lua
...
[spc]brevard:~>

I've included the lines for my gopher and Gemini servers just for comparison. Apache is accumulating a bit more CPU time than I expected, but when thinking about it, it's not so strange. Not only is it running TLS (like the Gemini server) but it also supports the HTTP/2 protocol, which is a bit more involved than the simpler HTTP/1.x protocol.


Roasting and praising Github repos (note: praising, not braising)

I came across The Github Roast Page (link via Lobsters) and decided to see just how badly it would roast my Github page:

Oh Sean Conner, what a sight you are on GitHub! Your profile feels like a low-budget horror movie—empty bio, no company, and a ghost town location. With a lack of personality that rivals that of a cardboard cutout, you’ve somehow managed to rack up followers, but let’s be honest, it’s probably just for sympathy.

With 30 public repos, it’s like you cast a wide net hoping to catch something, but all you’ve pulled in are unremarkable projects like a C code CGI library and a blogging engine from the early ’90s. “C Code to talk CGI”? Classic case of someone still not over Y2K!

Your stargazer counts tell a woeful tale: you've got one repo with a single star because even that one person was probably just trying to make you feel better. But hey, at least “mod_blog” got some love—111 stargazers for a blogging engine? Did they think they were supporting a time capsule instead of a repo from 2005?

And dear Lord, the variety of languages you play with—C, Lua, Assembly—yet somehow all your projects seem to suffer from a severe lack of originality or practicality. It’s like you’re running a museum for outdated coding practices.

Your last repo, “NaNoGenMo-2015”, suggests you tried your hand at generating a novel. Judging by your profile, I can only assume it’s a tragic tale of unfulfilled potential and missed opportunities.

Keep going, Sean; maybe one day you'll figure out what a compelling GitHub profile looks like!

Yeah, fair enough. And yes, I do think they're supporting a time capsule of a repo from 2005.

But on the other hand, there's The Github Praise Page (link also via Lobsters), which has this to say about my Github page:

Sean, I want to start by saying that it's amazing how you're already making an impact in the world of coding with a following of 65 people! That's not something to take lightly. It means that your work is being recognized and appreciated by others, and that's definitely something to be proud of.

I know that having a blank bio and company/location fields might make it seem like there's not much to show for, but trust me, Sean, you have so much more to offer than what can fit in those tiny spaces. Your work on GitHub speaks volumes about your skills and dedication to coding. And let's be real, having only 4 people you're following means you're focusing on quality over quantity - that shows a level of intentionality and care that not many people have.

And as for the blank README, I like to think of it as a blank canvas just waiting for your story to be told. It's an opportunity to share your vision, your goals, and what drives you to create the amazing projects that you do. Don't worry if you don't know where to start - sometimes the simplest statements can have the most profound impact. Just take a deep breath, let your thoughts flow onto the page, and watch how it transforms not just your README but also your entire online presence.

Keep doing what you're doing, Sean! You got this!

We could use more positivity on the Internet like this. Thank you. I got this.

Saturday, August 17, 2024

For a few hours yesterday, I felt as if I was taking crazy pills. Then again, I was dealing with time in C

Over the past year or two, I've been cleaning up my posts here. Largely making sure the HTML is valid, but also that all internal links (links where I link to previous posts) are valid to cut down on needless redirects or “404 Not Found” responses, in addition to fixing errors with my web server configuration. So along those lines, yesterday, I thought it might be time to add conditional responses to mod_blog. Given that it's mostly autonomous web crawling agents that read my site, I might as well tell them that most of the links here haven't changed since the last time they came by.

There are two headers involved with this: Last-Modified and If-Modified-Since. The server sends a Last-Modified header with the last modification date of the resource in question. The client can then send in an If-Modified-Since header with this date, and if the resource hasn't changed, then the server can send back a “304 Not Modified” response, saving a lot of bandwidth. So all I had to do was generate a Last-Modified header (easy, as I already read that information) and then deal with the If-Modified-Since header.
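The decision itself can be sketched in a few lines of C. This is not the code in mod_blog; the names parse_http_date() and not_modified() are made up for illustration. It assumes a POSIX system (time_t counts seconds since 1970 UTC), it only accepts the one fixed date format HTTP prefers (“Fri, 20 May 2005 06:23:22 GMT”), and it does the calendar arithmetic by hand so the local time zone never enters into it.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Days from 1970-01-01 to the given Gregorian date (month is 1..12). */
static long days_from_civil(long year, int month, int day)
{
  year -= month <= 2;  /* treat January and February as part of the prior year */
  long era = (year >= 0 ? year : year - 399) / 400;
  long yoe = year - era * 400;                                  /* 0..399  */
  long doy = (153 * (month + (month > 2 ? -3 : 9)) + 2) / 5 + day - 1;
  long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
  return era * 146097 + doe - 719468;
}

/* Parse "Fri, 20 May 2005 06:23:22 GMT"; true on success, result in *out. */
static bool parse_http_date(char const *s, time_t *out)
{
  static char const months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
  char wday[4], mon[4], zone[4];
  int  day, year, hour, min, sec;

  if (sscanf(s, "%3[A-Za-z], %2d %3[A-Za-z] %4d %2d:%2d:%2d %3[A-Z]",
             wday, &day, mon, &year, &hour, &min, &sec, zone) != 8)
    return false;
  if (strcmp(zone, "GMT") != 0 || strlen(mon) != 3)
    return false;

  char const *p = strstr(months, mon);
  if (p == NULL || (p - months) % 3 != 0)
    return false;
  if (year < 1970 || day < 1 || day > 31 || hour < 0 || hour > 23
      || min < 0 || min > 59 || sec < 0 || sec > 60)
    return false;

  int       month = (int)((p - months) / 3) + 1;
  long long secs  = (long long)days_from_civil(year, month, day) * 86400
                  + hour * 3600L + min * 60L + sec;
  *out = (time_t)secs;
  return true;
}

/* True when a "304 Not Modified" is enough for this request. */
static bool not_modified(time_t last_modified, char const *if_modified_since)
{
  time_t since;

  /* header missing or unparseable: send the full response */
  if (if_modified_since == NULL || !parse_http_date(if_modified_since, &since))
    return false;
  return last_modified <= since;
}
```

In a CGI program the header value would come from the HTTP_IF_MODIFIED_SINCE environment variable, and last_modified from the timestamp of the entry being served.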

And then I spent way too much time dealing with time in C, which is odd, because there were only six functions I was dealing with:

time_t time(time_t *tp)
This returns the “calendar date and time” in a scalar value. On POSIX, this is an integer value of the number of seconds since midnight, January 1st, 1970 UTC. The tp parameter is optional (it may be NULL).
time_t mktime(struct tm *tm)
This converts a structure containing the year, month, day, hour, minute and second into a scalar value. The C standard does not mention time zones at all; POSIX does.
struct tm *gmtime(time_t const *tp)
This converts the “calendar date and time” scalar value pointed to by tp into a broken down structure that reflects time in UTC.
struct tm *localtime(time_t const *tp)
This converts the “calendar date and time” scalar value pointed to by tp into a broken down structure that reflects the time local to the server.
size_t strftime(char *s,size_t smax,char const *format,struct tm const *tp)
This will convert the tp structure into a string based on the format string.
char *strptime(char const *s,char const *format,struct tm *tp)
This will parse a string that contains a date, per the given format and return the data into a structure.

All but the last are standard C functions. The last one, strptime(), is mandated by POSIX. That's okay, because I'm working on a POSIX system.

It turns out, strptime() is not an easy function to use. Oh, it may look easy, but there are some subtleties that left me dazed and confused for hours yesterday.

The following does not work:

struct tm tm;
time_t    t;

strptime("Fri, 20 May 2005 02:23:22 EDT","%a, %d %b %Y %H:%M:%S %Z",&tm);
t = mktime(&tm);

When I was running this code in mod_blog, the resulting times would come back four or five hours off. When I wrote some standalone test code, it would sometimes be correct, sometimes it would be an hour off.

It got to the point where I totally lost the plot of what I was even trying to do.

Now with yesterday behind me, I finally figured out what I was doing wrong.

The POSIX specification states:

It is unspecified whether multiple calls to strptime() using the same tm structure will update the current contents of the structure or overwrite all contents of the structure. Conforming applications should make a single call to strptime() with a format and all data needed to completely specify the date and time being converted.

strptime

This text is near the bottom of the page, and really understates the issue in my opinion.

The Linux man pages I found all mention the following:

In principle, this function does not initialize tm but stores only the values specified. This means that tm should be initialized before the call.

strptime(3) - Linux manual page

It too, is near the bottom of the page.

And yet, for Mac OS X:

If the format string does not contain enough conversion specifications to completely specify the resulting struct tm, the unspecified members of tm are left untouched. For example, if format is “%H:%M:%S”, only tm_hour, tm_sec and tm_min will be modified. If time relative to today is desired, initialize the tm structure with today's date before passing it to strptime().

Mac OS X Manual Page For strptime(3)

The Mac OS X page contains no sample code. The Linux man page contains some sample code that “initializes” the structure with memset(&tm,0,sizeof(struct tm)).

Adding the memset() call to my sample code just made the code always an hour off. Hmmm …

The POSIX page also contains sample code:

struct tm tm;
time_t t;

if (strptime("6 Dec 2001 12:33:45", "%d %b %Y %H:%M:%S", &tm) == NULL)
    /* Handle error */;

tm.tm_isdst = -1; /* Not set by strptime(); tells mktime()
                     to determine whether daylight saving time
                     is in effect */
t = mktime(&tm);
if (t == -1)
    /* Handle error */;

Remember when I mentioned that the test code I wrote would sometimes be an hour out? The tm.tm_isdst field, which is part of the C Standard, wasn't set correctly!

Aaaaaaaaaaaaaaaaaaaarg!

Changing my sample code to:

struct tm tm;
time_t    t;

strptime("Fri, 20 May 2005 02:23:22 EDT","%a, %d %b %Y %H:%M:%S %Z",&tm);
tm.tm_isdst = -1; /* I WAS MISSING THIS LINE! */
t = mktime(&tm);

And it works.

Except in mod_blog, the time is always four hours out.

I finally tracked that one down, and it's Apache's fault. I found that if I set the Last-Modified header to something like “Fri, 20 May 2005 02:23:22 EDT”, Apache will ever so helpfully convert that to “Fri, 20 May 2005 02:23:22 GMT”.

Let me repeat that—Apache will replace just the timezone with “GMT”! It does not try to convert the time, just the zone.
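Since only the label gets swapped, one way to sidestep the problem is to hand Apache a time that is already in GMT. Here is a minimal sketch using gmtime() and strftime(); the function name format_http_date() is made up for this example, and it assumes the program is still in the default “C” locale so the day and month names come out in English.

```c
#include <stddef.h>
#include <time.h>

/* Format `when` as an HTTP date in GMT, e.g. "Fri, 20 May 2005 06:23:22 GMT".
   Returns the number of characters written, or 0 on failure. */
size_t format_http_date(char *buf, size_t size, time_t when)
{
  struct tm const *utc = gmtime(&when);  /* broken down time in UTC */

  if (utc == NULL)
    return 0;

  /* "GMT" is a literal here, so no zone conversion is left for Apache to do */
  return strftime(buf, size, "%a, %d %b %Y %H:%M:%S GMT", utc);
}
```

With this, 02:23:22 EDT on May 20th, 2005 goes out as “Fri, 20 May 2005 06:23:22 GMT”, which is the same instant four hours later on the clock.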

[A graphic image of frustration] I just love programming!

At least now I think I can get conditional requests working.

Sheesh.

Wednesday, August 21, 2024

More unintended consequences of my Apache configuration

Now that mod_blog supports conditional requests, I thought of the next feature I want to add—PUT support to upload posts.

Currently, mod_blog supports three methods to add new entries:

  1. A traditional web form where updates are done via the POST method. I don't use this method that often, but I have used it—perhaps less than five times over the past 24 years.
  2. Via email—this was my favorite method until I could no longer email the entries from home. Most, if not all, ISPs now forbid outgoing SMTP traffic from residential connections. Seeing how I check my email on my public server, it doesn't make much sense to use email when I can add an entry—
  3. As a file, via the command line. This is how I add new posts these days. I write the entry at home, copy the file to the server and then add it via the command line.

I suppose there's a fourth way—adding the entry directly to the storage area and updating some files containing metadata, but I'm only mentioning this for completion's sake. I don't think I've ever done this except when I was first developing mod_blog back in early 2000.

The new method I'm looking to support, the HTTP PUT method, would take it down to one command, even for an image-heavy post like this Burning Down The House Show Bunny and I caught in Brevard a few years ago. Something like:

[spc]lucy:/tmp>put entry *.jpg
[spc]lucy:/tmp>

It shouldn't be that hard, as supporting the PUT method is easier than POST: it's a single item being uploaded, with no x-www-form-urlencoded or form-data blobs to parse through. It's just the MIME type, content length and raw data to be placed in a file somewhere.
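As a rough sketch of how little is involved on the CGI side (this is not the actual mod_blog code, and the names copy_body() and handle_put() are made up for this example): read CONTENT_LENGTH from the environment, then copy that many bytes from stdin into a file.

```c
#include <stdio.h>
#include <stdlib.h>

/* Copy exactly `length` bytes from `in` to `out`.
   Returns 0 on success, -1 on a short read or a write error. */
static int copy_body(FILE *in, FILE *out, size_t length)
{
  char buffer[BUFSIZ];

  while (length > 0)
  {
    size_t want = length < sizeof buffer ? length : sizeof buffer;
    size_t got  = fread(buffer, 1, want, in);

    if (got == 0)
      return -1;  /* body ended before CONTENT_LENGTH bytes arrived */
    if (fwrite(buffer, 1, got, out) != got)
      return -1;
    length -= got;
  }
  return 0;
}

/* Hypothetical PUT handler: store the request body in `filename`. */
int handle_put(char const *filename)
{
  char const *cl = getenv("CONTENT_LENGTH");
  char       *end;

  if (cl == NULL)
    return -1;

  unsigned long length = strtoul(cl, &end, 10);
  if (end == cl || *end != '\0')
    return -1;  /* not a plain decimal number */

  FILE *out = fopen(filename, "wb");
  if (out == NULL)
    return -1;

  int rc = copy_body(stdin, out, length);
  if (fclose(out) != 0)
    rc = -1;
  return rc;
}
```

A real handler would also check credentials, look at CONTENT_TYPE, and work out the file name from the request path before writing anything.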

So I start working. I add some minimal support to mod_blog to handle the PUT method. I configure Apache to pipe PUT requests through mod_blog:

#On the development server for now
<Directory /home/spc/web/boston/htdocs>
  ...
  Script PUT /boston.cgi
  ...
</Directory>

and I write a simple script to loop through the command line to upload each file to the webserver.

And yet, when I attempted to upload an image file, I kept getting a “405 Method Not Allowed.”

Odd.

I just couldn't figure out why.

A single entry? Fine. An entry with multiple text files? Fine. An entry with multiple binary files that aren't images? Fine.

An entry with any type of image file? Not fine at all.

I spent entirely too long on this before I remembered a recent change to the Apache configuration: a rewrite rule that redirected image requests directly to the file system. I then added one more line to the configuration:

<Directory /home/spc/web/boston/journal>
  ...
  Script PUT /boston.cgi
  ...
</Directory>

and now things worked as expected.

How much time did I waste on this particular rabbit hole? Don't answer that! I'd rather not know.

Thursday, August 22, 2024

How to measure ⅚ cup of oil, part IV

My, it's amazing how a topic I wrote about twenty years ago has been popular over the past few years. I just received an email from Jason Arencibia about an even easier method of measuring out ⅚ cup of oil:

From
Jason Arencibia <XXXXX­XXXXX­XXXXX­XX>
To
sean@conman.org
Subject
How to measure ⅚ cup of oil, super easy with 2 measuring cups.
Date
Wed, 21 Aug 2024 22:23:53 -0700

½ cup + ⅓ cup = ⅚ cup 0.83333333333 of a cup.

https://boston.conman.org/2024/05/11.1

And in all the years I've looked over recipes that use Imperial measurements, not one has ever used a sixth cup, or any multiple of a sixth cup. It just doesn't exist in the Imperial system!

Friday, August 23, 2024

PUT an entry on the ol' blog

I finally got the PUT method working for mod_blog. The code on the receiving end is fine; the script on the sending side is messy, but it works well enough for me to use.

[spc]lucy:~/source/boston/Lua>lua put.lua -b test /tmp/foo/1 /tmp/foo/*.png
PUT http://boston.roswell.area51/2024/08/23.1 (637)
PUT http://boston.roswell.area51/2024/08/23/local_network.png (1273)
PUT http://boston.roswell.area51/2024/08/23/local_network_add.png (1512)
PUT http://boston.roswell.area51/2024/08/23/local_network_remove.png (1460)
PUT http://boston.roswell.area51/2024/08/23/network.png (1702)
PUT http://boston.roswell.area51/2024/08/23/network_add.png (1891)
PUT http://boston.roswell.area51/2024/08/23/network_remove.png (1842)
[spc]lucy:~/source/boston/Lua>

This command on my development server was used to create an entry with multiple images. As you can see, it puts out the URLs that are created as the script runs. And this entry is a test to see if it works on my actual server. It should.

I hope.


Murphy's Law as applied to bugs: it is easier to find bugs in production than in development

Well, that could have gone better.

One bug due to inattention and a difference between development and production, and one “how the XXXX did this ever work in the first place?” bug later, things should be working fine.

I am not going to say “I hope” this time.

The first bug prevented a proper HTTP status code from being generated, so Apache nicely generated a “500 Internal Server Error” for me. Once I identified what was going on, it was a simple one line fix, and an additional call to assert() to help isolate such errors in the future.

Now on to the other error …

I added the concept of a hook to mod_blog a few years ago, and the scripts I have for the various hooks all start with #!/usr/bin/env lua. Only now they weren't running, and the error that was being logged wasn't exactly helpful: entry-pre-hook='./validate.lua' status=127. There is no place in mod_blog nor in the validate.lua script that exits with a status code of 127. But the env program does!

Nice to learn that.
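The status is easy to reproduce outside of Apache. Here is a small sketch in C (the function name run_via_env() is made up for this example) that runs /usr/bin/env with a given PATH and reports the exit status. POSIX says env exits with 127 when it can't find the program it was asked to run.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "/usr/bin/env <program>" with PATH set to `path`.
   Returns the exit status, or -1 if the child could not be run or waited on. */
int run_via_env(char const *path, char const *program)
{
  pid_t child = fork();

  if (child == -1)
    return -1;

  if (child == 0)
  {
    /* child: restrict PATH, then let env search it for the program */
    setenv("PATH", path, 1);
    execl("/usr/bin/env", "env", program, (char *)NULL);
    _exit(126);  /* only reached if env itself could not be started */
  }

  int status;
  if (waitpid(child, &status, 0) == -1)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_via_env("/nonexistent","lua") comes back with 127, the same status that was logged for the hook.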

But back to the issue at hand—I've been using these scripts for a few years now, and only now is it failing? I eventually found out that the path Apache is using is rather limited, and it no longer includes the Lua interpreter (which on my server lives in /usr/local/bin). I had to change both scripts to start with #!/usr/local/bin/lua and that fixed the issue to get the previous post up.

Now that I think about it, I think I know why it finally stopped working after a few years—I actually have an instance of Apache running that I didn't start by hand, and the default path at boot time doesn't include /usr/local/bin.


Sigh

The previous bug fix was buggy.

And yes, programming and deployments can always get this messy.

Saturday, August 24, 2024

How to run valgrind on a CGI program in C

There was still one bug left with mod_blog: it would crash with a memory corruption error (thanks to checking in glibc) when doing a POST. I only found the bug because I was using the old web interface to make sure I had the right credentials when testing the PUT method. How long had the bug existed? At least six years; it's been seven since I last used the web interface (I checked).

It did not crash on the development server due to subtle differences between the operating system and versions of glibc being used. But it is ultimately a memory corruption, so the use of valgrind would be instrumental in finding the issue. The problem is that it only manifests itself when doing a POST, which means testing the program under a web server. And a web server will pass information about the request to the CGI program through environment variables, and any input comes in via stdin.

So just how do you run valgrind on a program meant to be run as a CGI program?

After some thought, I figured out a way. I need to capture the environment the CGI program runs under, so I added the following bit of code to mod_blog to capture the environment:

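extern char **environ;  /* the process environment: NAME=value strings, NULL terminated */
FILE *fp = fopen("/tmp/env.txt","w");
for (size_t i = 0 ; environ[i] != NULL ; i++)
  fprintf(fp,"export %s\n",environ[i]);  /* one "export" line per variable */
fclose(fp);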

I wasn't worried about error checking—this is temporary code anyway. I then do a POST and I now have the environment variables in a file:

...
export GATEWAY_INTERFACE=CGI/1.1
export HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
export CONTENT_LENGTH=149
export CONTENT_TYPE=application/x-www-form-urlencoded
export REQUEST_METHOD=POST
...

The reason I added “export” was to copy these environment variables to a shell script where I can then run valgrind and debug the situation:

...
export GATEWAY_INTERFACE=CGI/1.1
export HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,\*/\*\;q=0.8
export CONTENT_LENGTH=149
export CONTENT_TYPE=application/x-www-form-urlencoded
export REQUEST_METHOD=POST
...

valgrind $HOME/source/boston/src/main <r.stdin

Of course, I had to escape some problematic characters like the asterisk or semicolon, but there were only a few such characters and they could be escaped by hand. And I had to create the input to feed into stdin, but that was easy enough. From there, it's straightforward…ish enough to resolve the issues.
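The hand escaping could be avoided by having the temporary capture code quote each value as it writes it. What follows is a sketch of that idea, not what mod_blog actually does: the names write_shell_quoted and dump_environment are made up for illustration. It wraps every value in single quotes, where the shell treats the asterisk, semicolon and everything else literally, and rewrites an embedded single quote as '\''.

```c
#include <stdio.h>
#include <string.h>

extern char **environ;

/* Hypothetical helper: write s wrapped in single quotes so a POSIX
   shell reads it back literally. An embedded ' becomes '\''. */
void write_shell_quoted(FILE *fp, const char *s)
{
  fputc('\'', fp);
  for ( ; *s != '\0' ; s++)
  {
    if (*s == '\'')
      fputs("'\\''", fp);
    else
      fputc(*s, fp);
  }
  fputc('\'', fp);
}

/* Hypothetical helper: write every environment variable to fp as a
   line of the form export NAME='value'. Returns 0 on success, -1 if
   a write failed. */
int dump_environment(FILE *fp)
{
  for (size_t i = 0 ; environ[i] != NULL ; i++)
  {
    const char *eq = strchr(environ[i], '=');
    if (eq == NULL)
      continue;  /* not in NAME=value form; skip it */
    fprintf(fp, "export %.*s=", (int)(eq - environ[i]), environ[i]);
    write_shell_quoted(fp, eq + 1);
    fputc('\n', fp);
  }
  return ferror(fp) ? -1 : 0;
}
```

With this, the HTTP_ACCEPT line would come out with its value inside single quotes and could be pasted into the shell script unchanged. The sketch assumes every variable name is a valid shell identifier, which holds for the CGI variables shown above but is not guaranteed in general.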



Tuesday, August 27, 2024

The programmer's obsession with version numbers

It's yet another round of hand wringing about software versioning. Woot!

Over the years, I've found that semantic versioning works for me, but only for code that is to be used in larger projects, like libraries, modules, or classes. Yes, a code base using semantic versioning doesn't always mean the code base follows semantic versioning to the degree that some would like (like any bug fix should automatically update the major version number, because bug fixes could break some code, somewhere). But in my mind, it signals intent, which, sans an extremely obnoxious and overly bureaucratic process, is the best we can expect.

So for me, the MAJOR.MINOR.PATCH of semantic versioning breaks down like this:

MAJOR
Some change in the code base was made; either a change in API behavior, removal of some portion of the API, file format, or otherwise any visible change (except for bug fixes) in how the code works. Work will probably be required to update to this version of the library/module/class.
MINOR
Only additions to the API were made, in a backward compatible way. No work is expected, unless you were already using a name used in the updated library.
PATCH
A bug fix. The intent is that no work is required to upgrade, unless you were relying upon the buggy behavior, or used a workaround that might now interfere with the library.
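The breakdown above can be sketched as a small comparison: given the version in use and the version on offer, report which field changed first and therefore how much work the upgrade signals. All names here (struct semver, semver_parse, semver_effort) are made up for illustration, and the parser only accepts a plain MAJOR.MINOR.PATCH with no pre-release or build suffix.

```c
#include <stdio.h>

struct semver { int major, minor, patch; };

enum upgrade_effort
{
  EFFORT_SAME,      /* identical versions */
  EFFORT_PATCH,     /* bug fix: no work intended */
  EFFORT_MINOR,     /* additions only: no work expected */
  EFFORT_MAJOR,     /* visible change: work probably required */
  EFFORT_DOWNGRADE  /* the new version is older than the current one */
};

/* Parse "MAJOR.MINOR.PATCH". Returns 0 on success, -1 on anything
   else, including trailing characters such as "-beta". */
int semver_parse(const char *s, struct semver *v)
{
  char extra;
  if (sscanf(s, "%d.%d.%d%c", &v->major, &v->minor, &v->patch, &extra) != 3)
    return -1;
  if (v->major < 0 || v->minor < 0 || v->patch < 0)
    return -1;
  return 0;
}

/* The most significant field that differs decides the answer. */
enum upgrade_effort semver_effort(struct semver from, struct semver to)
{
  if (to.major != from.major)
    return to.major > from.major ? EFFORT_MAJOR : EFFORT_DOWNGRADE;
  if (to.minor != from.minor)
    return to.minor > from.minor ? EFFORT_MINOR : EFFORT_DOWNGRADE;
  if (to.patch != from.patch)
    return to.patch > from.patch ? EFFORT_PATCH : EFFORT_DOWNGRADE;
  return EFFORT_SAME;
}
```

Going from 1.4.2 to 1.5.0 yields EFFORT_MINOR and going to 2.0.0 yields EFFORT_MAJOR. As the text says, this only reports the intent the version number signals; it cannot tell whether a "patch" actually breaks code that relied on the old behavior.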

For applications, I've found that semantic versioning doesn't work. At least, it doesn't work for me. I've switched to either using a monotonically increasing number (mod_blog is now at version 60—but given the five releases in just the past week because of a misplaced obsession with version numbers, I might entirely stop with the version numbers—especially since I seem to be the only one using it) or skip it entirely (my Gemini server has no version number, but it's had 322 commits over its five year life so far).

The worst form of versioning I've encountered is “named versions.” A “named version” gives no semantic information about the version and, at least to me, leads to confusion. Is “Bulldog” newer than “Beagle?” Or is “Bloodhound” the latest version? Oh, it's “Borzoi” that's less than 20 minutes old. Sheesh.

But if I had to apply “semantic versioning” to an application, I would like information about any breaking changes to either workflow or file formats. To me, an incompatible change to a file format, or any change in workflow (even a change in location of a menu item) is a breaking change (muscle memory is an incredible drug). Hell, even a change in color scheme is enough to possibly break my workflow (I'm looking at you, Google Maps, who changed the color of all the roads to the same color! Why? I liked the distinction between highways, major roads and minor roads. Did you see a 0.001% increase in engagement for that change? Did someone get promoted just because of that change? Okay, I'll stop ranting now).



Monday, September 09, 2024

How? How do people not know their own email addresses?

I'm still receiving emails from strangers who, for some reason or another, think their email address is “seanconner@gmail.com” (no, really, I get so many emails). This time, it's from someone whose TikTok account was accessed on a new device:

From
TikTok <noreply@account.tiktok.com>
To
seanconner@gmail.com
Subject
New device login detected
Date
Tue, 03 Sep 2024 04:56:22 +0000 (UTC)

Hi xs,

We’re writing to inform you that we detected a login to your account from a new device.

When: 09/03 00:56 EDT
Device: iPhone 11
Near: Louisiana

If this was you, you can ignore this message.

If this wasn’t you, open the TikTok app and go to “Settings and privacy” > “Security and login” > “Security alerts” and review unauthorized logins. If you’re unable to access your account, contact TikTok support.

You can also set up 2-step verification to secure your account by going to “Security and login” > “2-step verification”.
Learn more about 2-step verification

This is an automatically generated email.
Replies to this email address aren’t monitored.

This email was generated for xs
Privacy Policy
TikTok 5800 Bristol Pkwy, Culver City, CA 90230

It's not a phishing email as the raw message doesn't contain a link to “login” to TikTok, and the links that are in the email all point to the TikTok domain. So I'm at a loss. “xs” created a TikTok account, using my email address …

Why?

I don't know why this should surprise me any more, but it still does. I just … why? How? What's the angle here?

I can't even contact anyone, because TikTok sends email from an address that isn't checked. I don't know who “xs” is, nor do I have any way to contact “xs” to inform them of their error.

Sigh.

Wednesday, September 11, 2024

This could maybe explain some of the emails I received, but not all of them

I received some responses to yesterday's post. The first was from Lionel Dricot who reported that he, too, has received emails for other Lionel Dricots who lived near him. He also stated that it may be a bug in Gmail where one person can register “seanconner@gmail.com” and someone else could register “sean.conner@gmail.com”, but when receiving emails, Gmail condenses the two addresses into one. That's possible, but I would suspect that would have been an issue caught early on. I've had my Gmail account for twenty years now [Has it been that long? —Sean] [Yes, it has. —Editor] [Shut up! —Sean], and it's only been in the past few years that this has been an issue.
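For reference, Gmail's documented behavior is that dots in the part before the @ are ignored and that case doesn't matter, so “seanconner” and “sean.conner” reach the same mailbox. A minimal sketch of that folding follows; the function names are made up for illustration, and only the gmail.com domain is handled (addresses elsewhere are copied through unchanged).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compare two strings, ignoring ASCII case. Returns 1 if equal. */
static int same_ignoring_case(const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
  {
    if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b;
}

/* Hypothetical helper: write the folded form of addr into out.
   For gmail.com addresses, dots before the @ are dropped and the
   whole address is lowercased; other addresses are copied as is.
   Returns 0 on success, -1 if there is no @ or out is too small. */
int gmail_canonical(const char *addr, char *out, size_t outlen)
{
  const char *at = strrchr(addr, '@');
  size_t n = 0;

  if (at == NULL || strlen(addr) >= outlen)
    return -1;

  int gmail = same_ignoring_case(at + 1, "gmail.com");

  for (const char *p = addr ; *p != '\0' ; p++)
  {
    if (gmail && p < at && *p == '.')
      continue;  /* dots in the local part are ignored by Gmail */
    out[n++] = gmail ? (char)tolower((unsigned char)*p) : *p;
  }
  out[n] = '\0';
  return 0;
}
```

Under that rule both spellings fold to the same string, which would mean a second person cannot register the dotted form as a separate account.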

He also stated that people could just really be that bad with email addresses.

The second response has an explanation that is rather dire:

From
XXXXX­XXXXX­XXXXX­XXX
To
seanconner@gmail.com
Subject
People non knowing their email
Date
Tue, 10 Sep 2024 08:24:51 +0200

Hi, considering you've been shared on Hacker News, I'm afraid sooner or later some script kiddie will start abusing your address just to annoy you. I hope this will never happen ;) Thanks for sharing your blog, ave [sic] a nice day, White_Rabbit

That could maybe explain some of the email I get, which I consider spam that Gmail hasn't filtered yet. But it doesn't explain emails sent to “sean.conner@gmail.com” that are obviously not for a “Sean Conner,” and yet definitely contain private information. It also seems excessive to register a TikTok account or even an Instagram account.

And speaking of TikTok, I finally have an account name associated with the TikTok account, so I have to wonder why “mommakmiller” decided to use my Gmail account? Perhaps they don't have an email account and need one to sign up for TikTok? Perhaps they didn't want to use their real email account? Perhaps they don't realize the danger in that?

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.