Monday, January 01, 2024
Free at last! Free at last! Thank God almighty the mouse is free at last!
Well, at least the 1928 version of Mickey Mouse is free and in the public domain. It's no coincidence that Disney has spent the past few years turning the 1928 version of Mickey into a trademark. Grab the popcorn! It's going to be an interesting year in copyright law.
Unlike the past few years, this year came in pretty quiet. Yes, there were some fireworks off in the distance, but our neighbors? Very quiet this year. Maybe that's a good sign that this year will be less explosive.
I can hope.
HAPPY NEW YEAR!
Monday, January 08, 2024
I wonder why today, of all days, I'm feeling this level of melancholy?
Yesterday at the “every-other-week” D&D game (played via Zoom these days), I couldn't bring up the virtual map the DM uses because my web browser didn't support WebGL, despite my updating the entire operating system, including the web browser, just 10 minutes earlier.
Sigh. What the XXXX happened to being just 20 minutes out of date?
Even worse, it ran the last time we played a month ago.
Amid suggestions from the rest of the group about what to do (even the suggestion to not use the map at all and just go for “theater of the mind”), I just gave up and bailed out of the game. I just couldn't cope with the fact that despite my attempts at keeping up to date with all this technological XXXX being forced down our throats, it still wasn't XXXXXXX good enough.
I think this is the culmination of my feelings toward the latest round of AI. I saw this video with Dr. Matt Welsh, who joked that at his company, not using ChatGPT should be a fireable offense, and I felt offended at the joke. Are we so XXXXXXX beholden to our tools that we give up mastery of them? Of course, as the owner of a software company, he loves the idea of ChatGPT writing his software—it keeps his costs down. XXXX people who spent time investing in writing software.
As I commented a month ago:
I wrote about this about fifteen years ago, about the fear of becoming beholden to technology without understanding. IT currently doesn’t incentivize deep learning, or even mastery of anything, both because the perceived need to move fast, and the ever changing technology landscape. Who has time to learn anything any more? Why even bother if it’s all going to change anyway? Especially when we have a program that can do all that pesky learning for us?
Edit to add: Maybe this post describes it better?
LLMs make Programming Language Learning Curves Shallower | Lobsters
Why bother indeed?
And to bring this back around to D&D—Deck of DM Things has made series of videos about using AI to run a game:
- The Future of D&D: ChatGPT as a Dungeon Master
- Experiment: Can AI RUN a Full D&D Session?
- Experiment: Can ChatGPT Run 5e Combat?
Even though he wasn't fully successful with the experiment, as YouTuber Philip DeFranco is fond of saying, “this is the worst that AI will ever be.”
This has been in the back of my mind for some time now. I left The Enterprise partly because of the testing, and a fear I had was that not that AI would write the tests for us (which I wouldn't mind) but that I would have to write tests for the code AI wrote. It was bad enough having to endure endless computer updates and reboots when the XXXXXXX computer mandated it—but to be truly enslaved to the machine?
Who controls who?
And it all came crashing through when I couldn't load a XXXXXXX map on a XXXXXXX web page.
I don't like the direction the industry is going, with all the constant changes just for their own sake (yes, I know, changes for security are a real thing, but that's not carte blanche to change how the rest of the software works). And oftentimes, it's the computer in charge. “Oh! Update me now!” “Oh! Reboot me now!” “Oh! Feed me now, Seymour!”
XXXX that XXXX!
Oh great! The network connection is down. Could be Roko's Basilisk giving me a warning, or just a momentary glitch. Who knows?
Sigh.
I feel like I'm yelling at the clouds to get off my lawn as I adjust my onion on my belt. But on a lighter note, ChatGPT in French is “cat, I farted.” That is truly wonderful.
Thursday, January 18, 2024
“Now, here, you see, it takes all the running you can do, to keep in the same place.”
I'm a bit reluctant to write this, as I'll come across as an old man yelling at the clouds to get off his lawn, but the whole “update treadmill” the Computer Industry has foisted on us is getting tiresome.
Bunny now wears a CGM (continuous glucose monitor) prescribed by her doctor. It's a small disk that adheres to the back of the upper arm and sends readings of blood sugar via Bluetooth. Bunny has an app on her smartphone that records the information and forwards it to her doctor. The app is fine; no real problems with using it.
Until this morning.
Bunny woke me up to inform me that the app just stopped working, because her smartphone hadn't been updated in the previous 20 minutes.
…
What the XXXX‽
It was working fine the previous night. What updates were required? And why drop support for older operating systems? Oh yeah … right … it's hard to support systems older than 20 minutes.
And some PM somewhere needed to justify their job.
Sigh.
Things are working fine, now that the operating system on her smartphone has been updated (it only took several hours). But still … gah!
“… and water is wet! Film at 11!”
Time for more yelling at the clouds.
A few days ago I was surfing on my iPad when I came across An Astronomy Club of Brevard, NC. Now, my iPad is a bit, shall we say, slightly out of date? So it was with sadness when I saw:
Your Browser Is
No Longer Supported
To view this website and enjoy a better online experience,
update your browser for free.
Followed by a list of browsers.
You know, I thought we left that behind in the early 2000s, but apparently not.
I then viewed the site on my desktop computer and … why? Why do I need a less-than-20-minute-old browser to view seven images and some text? Why does it take 193 requests to even show the page? At a minimum, you have the HTML, CSS and seven images, so … nine requests? Okay, maybe some JavaScript to do the animations designers are so fond of.
But 193 files?
I'm not blaming the Astronomy Club of Brevard for this. They're using Wix to create and host the website, so I'm laying the blame solely at Wix here for going completely overboard with the JavaScript. Do web developers know you can create a perfectly good site with just HTML and CSS?
Yeah, I'm yelling at the tide to stop coming in.
Sigh.
Friday, January 19, 2024
Complicating code
I recently added an .OPT
directive to my 6809 assembler.
This allows me to add options to a source file instead of having to always specify them on the command line.
I originally did this to support unit testing and I feel it's a nice addition.
When the test backend is enabled, all the memory of the emulated 6809 is marked as non-readable, non-writable, and non-executable. As the code is assembled, memory used by instructions is switched to “readable, executable” and memory used by data becomes “readable, writable” (easy, because of an early decision to have separate functions for writing instructions vs. data). If you reference memory outside of the addresses used by the source code being assembled, you have to specify the permissions of said addresses.
; The set up for our tests.

; We should only read from these locations
                .opt    test    prot    r,ECB.beggrp,ECB.beggrp + 1
                .opt    test    prot    r,$112

; We can read and write to these locations
                .opt    test    prot    rw,$0E00,$0E00 + 1023

; Set the stack for our tests
                .opt    test    stack   $FF00

; Initialize some VM memory
                .opt    test    memw,ECB.beggrp,$0E00
                .opt    test    memb,$112,2
You really only need these for memory locations defined outside the source code. In the example above, the memory referenced is defined by the Color Computer and not by the code being tested, so they need to be initialized. And because the system only supports 65,536 bytes of memory, we can easily (on modern systems) assign permissions per byte.
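The per-byte permission scheme is simple enough to model in a few lines. Here's a rough Python sketch of the idea—my own illustration, not the actual a09 code, and all the names (`TestMemory`, `mark`, etc.) are made up:

```python
# Sketch of the per-byte permission model described above. With only
# 65,536 addresses, a flat array of permission bytes is affordable.

R, W, X = 1, 2, 4  # permission bits: readable, writable, executable

class TestMemory:
    def __init__(self):
        # Everything starts out inaccessible, as when the test backend
        # is enabled.
        self.perm = bytearray(65536)   # one permission byte per address
        self.mem  = bytearray(65536)

    def mark(self, lo, hi, bits):
        # Instructions get R|X and data gets R|W as code is assembled;
        # ".opt test prot" lines would mark external addresses like this.
        for addr in range(lo, hi + 1):
            self.perm[addr] |= bits

    def read(self, addr):
        if not self.perm[addr] & R:
            raise PermissionError(f"reading non-readable memory at {addr:04X}")
        return self.mem[addr]

    def write(self, addr, value):
        if not self.perm[addr] & W:
            raise PermissionError(f"writing non-writable memory at {addr:04X}")
        self.mem[addr] = value & 0xFF

mem = TestMemory()
mem.mark(0x0E00, 0x0E00 + 1023, R | W)  # ".opt test prot rw,$0E00,$0E00 + 1023"
mem.write(0x0E00, 42)
assert mem.read(0x0E00) == 42
```

Any stray read or write outside the marked ranges raises immediately, which is exactly what makes this good at finding bugs.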
So this all works and is great at finding bugs.
But then I thought—I can use .OPT
to suppress warnings as well.
If I assemble the following code:
;**********************************************************************
;       frame_buffer    set frame buffer address (GPL3+)
;Entry: A - MSB of frame buffer
;Exit:  D - trashed
;**********************************************************************

frame_buffer    ldb     PIA0BC          ; wait for vert. blank
                bpl     frame_buffer
.now            stx     ,--s            ; save X
                ldx     #SAM.F6         ; point to framebuffer address bits
.setaddress     clrb                    ; reset B
                lsla                    ; get next address bit
                rolb                    ; isolate it
                stb     b,x             ; and set the SAM F bit
                leax    -2,x            ; point to next F bit register
                cmpx    #SAM.F0         ; more?
                bhs     .setaddress     ; more ...
                puls    x,pc            ; return
I get
frame_buffer.asm:9: warning: W0002: symbol 'frame_buffer.now' defined but not used
The subroutine has effectively two entry points—frame_buffer
will wait until the next vertical blanking interrupt before setting the address,
while frame_buffer.now
will set it immediately.
The former is good if you are using a double-buffer system for graphics,
while the latter is fine for just switching the address once.
Given that my assembler will warn on unused labels,
this means I'll always get this warning when including this code.
I can suppress that warning by issuing a -nW0002
on the command line,
but then this will miss other unused labels that might indicate an actual issue.
I wanted to have something like this:
                .opt    *       disable W0002
frame_buffer    ...
.now            ...
                puls    x,pc
                .opt    *       enable W0002
We first disable the warning for the code fragment, then afterwards we enable it again. I coded this all up, but it never worked. It worked for other warnings, but not this particular one.
The assembler is a classic two-pass assembler, but not all warnings are issued during the passes, as can be seen here (using a file that generates every possible warning with debug output enabled):
[spc]lucy:~/source/asm/a09/misc>../a09 -ftest -d -o/dev/null warn.asm
warn.asm: debug: Pass 1
warn.asm:2: warning: W0010: missing initial label
warn.asm:10: warning: W0008: ext/tfr mixed sized registers
warn.asm:11: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:16: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:21: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:23: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:36: warning: W0013: label 'a' could be mistaken for register in index
warn.asm:37: warning: W0013: label 'b' could be mistaken for register in index
warn.asm:38: warning: W0013: label 'd' could be mistaken for register in index
warn.asm: debug: Pass 2
warn.asm:2: warning: W0003: 16-bit value truncated to 5 bits
warn.asm:3: warning: W0004: 16-bit value truncated to 8 bits
warn.asm:4: warning: W0005: address could be 8-bits, maybe use '<'?
warn.asm:5: warning: W0006: offset could be 5-bits, maybe use '<<'?
warn.asm:8: warning: W0005: address could be 8-bits, maybe use '<'?
warn.asm:9: warning: W0007: offset could be 8-bits, maybe use '<'?
warn.asm:11: warning: W0009: offset could be 8-bits, maybe use short branch?
warn.asm:13: warning: W0011: 5-bit offset upped to 8 bits for indirect mode
warn.asm:25: warning: W0012: branch to next location, maybe remove?
warn.asm:26: warning: W0012: branch to next location, maybe remove?
warn.asm:43: warning: W0017: cannot assign the stack address within .TEST directive
warn.asm:42: debug: Running test test
warn.asm:42: warning: W0014: possible self-modifying code
warn.asm:42: warning: W0016: memory write of 00 to 0034
warn.asm:42: warning: W0015: : reading from non-readable memory: PC=0016 addr=F015
warn.asm:42: debug: Post assembly phases
warn.asm:2: warning: W0002: symbol '.start' defined but not used
[spc]lucy:~/source/asm/a09/misc>
You can see some are generated during pass 1,
some during pass 2.
The message “Running test test” happens after the second pass is done,
and the one I'm trying to suppress,
W0002,
is issued at the very end of the program.
The .OPT
directives are processed during passes 1 and 2.
There's just no easy way to suppress W0002
just for a portion of the code,
as I would have to carry forward that any labels defined between the disable and enable of W0002
should be exempt from the “no-label warning” check.
It's issues like these that complicate programs over time.
I was about to scrap the idea when I came up with a solution.
Each symbol in the symbol table has a reference count.
At the end of assembly,
there's code that goes through the symbol table and issues the warning if a label has a reference count of 0.
All I did was create another option,
.OPT * USES <label>,
to increment the reference count of a label.
At the end of the day,
it works.
I'm not saying this is a “good” solution,
just “a” solution.
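The reference-count trick can be sketched in a few lines. This is a Python illustration of the idea only—the real assembler is not written like this, and all the names here are invented:

```python
# Sketch of the refcount approach: each symbol carries a reference count;
# at the end of assembly, any label still at zero triggers W0002. The
# ".opt * USES <label>" option simply bumps the count so that label is
# never reported as unused.

class SymbolTable:
    def __init__(self):
        self.refs = {}

    def define(self, label):
        # Defining a label gives it a count of zero (unless already referenced)
        self.refs.setdefault(label, 0)

    def reference(self, label):
        self.refs[label] = self.refs.get(label, 0) + 1

    def opt_uses(self, label):
        # ".opt * USES <label>" — pretend the label was referenced
        self.reference(label)

    def unused(self):
        # Candidates for "W0002: symbol defined but not used"
        return [lab for lab, n in self.refs.items() if n == 0]

syms = SymbolTable()
syms.define("frame_buffer")
syms.define("frame_buffer.now")    # second entry point, unreferenced here
syms.reference("frame_buffer")
syms.opt_uses("frame_buffer.now")  # suppress W0002 for just this label
assert syms.unused() == []
```

The nice property is that the suppression is per-label, so genuinely dead labels elsewhere still get flagged.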
Wednesday, January 24, 2024
Well, that was weird
Bunny's pharmacy just called to remind us to call and remind them to renew Bunny's prescription. Um … okay?
Adventures in high dining
Bunny and I found ourselves at La Nouvelle Maison, an upscale eating establishment in downtown Boca Raton. I'm not sure why she picked the place—perhaps because our previous upscale dining establishment, Root Italian Kitchen, is no longer open. Or perhaps it's “just because.”
In any case, we had a reservation for 8:00 pm. I made it yesterday, as Bunny was running about Chez Boca worrying about the dress code (turns out—“business casual” so I don't have to don my International Man of Mystery clothes). I received a text notification a few hours later, and a phone call today, asking to confirm our time.
We arrived at 8:00 pm after a slight detour through the valet parking lot of Trattoria Romana next door (the driving directions from Mr. Google were slightly unclear on that). We walked right in, stated our reservation name and time, and were immediately seated in the quite crowded restaurant. I found that quite surprising, given how busy it all looked.
One thing about high dining that I don't think I'll get used to—the level of attention to detail. I think our waiter only had three or four other tables to wait on, and when he wasn't busy, he was standing nearby, waiting at attention. And if he was busy, his assistant (Assistant!) was nearby waiting to serve us.
And boy, did we get served. A cheese platter of assertive goat, a more assertive bleu and a brie that gave butter a run for its money. We decided not to get any caviar, but we did get a Soupe à L’Oignon gratinee for me and Pâté de Foie Gras for Bunny. Those were followed by a Petite Laitue Roquefort for me, and a Salade de Betterave for Bunny.
And then the entrees. I ordered the Risotto de Saint-Jacques à la truffe et aux épinards while Bunny had the Côte de veau aux Champignons Forresterie. I think the most amazing thing about Bunny's dinner was the gratin potatoes. Our waiter explained how they were made—the potatoes were sliced paper thin (thin enough to see through), layered with gruyere cheese, pressed down with a brick for over 24 hours in the refrigerator, then baked and, just before serving, deep fried for about a minute. It was a perfect cube of potato and cheese about 3″ (7.5cm) per side, made up of easily a hundred layers.
Dessert was a chocolate soufflé. That was my first soufflé, and also my last. I'm not saying it was bad—it wasn't by any stretch of the imagination—it was delicious! But my issue with the soufflé was one of texture. I'm sensitive to some … textures … that some foods have, and this one triggered that sensitivity. The only reason I ate more than one spoonful was just down to how good it was, and I was able to bull through my textural revulsions. In all my times eating at fine dining establishments, I've found that even if I don't like a particular item, it will be the best particular item I've ever had.
Was everything we had good? It's French! Of course it was good. All of it. The service was incredible, and our only real complaint of the night was the noise level when we first arrived—it was a bit too high for our liking. And the price was about what we expected—it's the “once a year” level of price.
Friday, January 26, 2024
“It is just a barebones framework written as a love letter to the language of C.”
I'm used to having some of my posts make it to the Orange Site,
but I was not expecting to have one of my software projects
(in this case, mod_blog)
make it to the Orange Site.
I'm finding it especially amusing to read this thread.
I did not expect mod_blog
to be a “love letter to C.”
Wednesday, January 31, 2024
The Repair Culture, Part II—Electric DeoxIToo
I've repaired my Logitech Trackman Marble a few times over the years, but this time the traditional fix didn't work. But I have seen enough Adrian Black computer repair videos to know that perhaps, a liberal application of DeoxIT Spray Contact Cleaner on the buttons might work. I ordered some last week, and when I went out to check the mail, I found it sitting on top of the mail box! I'm surprised it didn't fall off.
Anyway, a liberal amount of DeoxIT in the buttons and yup, that did the trick this time.
Sweet.
Dear LinkedIn, I don't think I'm the expert you want answering these questions
I cross post links to my blog on LinkedIn mainly to be contrarian, or at least think I'm providing something different than all the business related stuff posted there. But lately, I've seen notifications there like, “You're one of the few experts invited to add this collaborative article: …” and it's some article about AI, or Agile or the latest one, “Your team is falling behind on project deadlines. What web-based tools can help you catch up?”
Oh … nothing good can come from me answering these questions.
I know languages that have support for “read-only memory,” but what about “write-only memory?”
I'm still hacking away on my overengineered 6809 assembler, and one feature I've been mulling over is a form of static access checking. I have byte-level access control when running tests, but I'm thinking that adding some form of “assemble time checking” would also be good. I've been writing code hitting the hardware of the Color Computer, and there are semantics around hardware that I don't think many languages (or even assemblers) support—write-only memory! As the Amiga Hardware Reference Manual states (only referenced here because it's a good example of what I'm talking about):
Registers are either read-only or write-only. Reading a write-only register will trash the register. Writing a read-only register will cause unexpected results.
…
When strobing any register which responds to either a read or a write, (for example copjmp2) be sure to use a MOVE.W, not CLR.W. The CLR instruction causes a read and a clear (two access) on a 68000, but only a single access on 68020 processors. This will give different results on different processors.
The Color Computer isn't quite as finicky
(although the 6809 inside it also does the “read, then write” thing with the CLR
instruction),
but there is still memory-mapped IO that is read-only, some that is write-only,
and some that has different meanings when reading and writing.
And while C has some semantic support with volatile
(to ensure reads and writes happen when stated)
and const
(to ensure the read-only nature),
it still lacks a “write-only” concept.
And I've never used an assembler that had “write-only” semantics either.
I'm thinking something along these lines:
                org     $FF40
DSK.CTRL        rmb/w   1       ; write only

                org     $FF48
DSK.CMD         rmb/w   1       ; write only

                org     $FF48
DSK.STATUS      rmb/r   1       ; read only
DSK.TRACK       rmb     1       ; these can be read and written
DSK.SECTOR      rmb     1
DSK.DATA        rmb     1
Here, the RMB
directive just reserves a number of bytes, with a default access of “read-write.”
The /W
or /R
designates that label as being either “write-only” or “read-only.”
And if you look closely,
you'll see that both DSK.CMD
and DSK.STATUS
are defined as the same address.
It's just that writing to that address will send a command to the drive controller,
while reading from that address gives the current status.
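The static check being contemplated here can be modeled in miniature: each label carries an access attribute (from RMB/R, RMB/W, or plain RMB), and loads and stores are checked against it. This is a Python sketch of the idea only—the instruction classification is drastically simplified and the warning text is made up:

```python
# Toy model of "assemble time" access checking for the labels above.

ACCESS = {
    "DSK.CTRL":   "w",    # rmb/w — write only
    "DSK.CMD":    "w",    # rmb/w — write only
    "DSK.STATUS": "r",    # rmb/r — read only
    "DSK.TRACK":  "rw",   # rmb   — read-write default
}

def check(mnemonic, operand):
    """Return a warning string, or None if the access is fine."""
    access = ACCESS.get(operand, "rw")
    reads  = mnemonic in ("lda", "ldb", "ldd")   # simplified classification
    writes = mnemonic in ("sta", "stb", "std")
    if reads and "r" not in access:
        return f"read of write-only location '{operand}'"
    if writes and "w" not in access:
        return f"write to read-only location '{operand}'"
    return None

assert check("sta", "DSK.CMD") is None          # writing a command: fine
assert check("lda", "DSK.STATUS") is None       # reading status: fine
assert check("lda", "DSK.CMD") is not None      # reading write-only: warn
assert check("sta", "DSK.STATUS") is not None   # writing read-only: warn
```

Because DSK.CMD and DSK.STATUS are separate labels for the same address, the check falls out naturally: which semantics you get depends on which name you use.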
The only issue I have are hardware registers that can be programmed for input or output.
The MC6821 used in the Color Computer has such registers,
and the issue I have is how to signify this change in state in a program—that at this point in the program,
such-n-such address is “read-write” but afterwards, it's “write-only.”
Discussions about this entry
How much for the book? I don't think I paid that much for the computer
Now, about that Amiga Hardware Reference Manual link in my previous post—$577.99‽
Seriously‽
I know it's now a niche market, but are there really people buying that book for almost $600?
Good Lord!
Friday, February 02, 2024
Making it to the Orange Site
A previous post made it to Lobster only to later show up at the Orange Site. How about that?
Making it to the Orange Site
A previous post made it to Lobster only to later show up at the Orange Site. How about that?
Making it to the Orange Site
A previous post made it to Lobster only to later show up at the Orange Site. How about that?
It should be obvious by now what day it is, but just in case
I want to assure everybody that yes, I intentionally made the previous three posts. And while they are similar, they are not, in fact, identical. You know, much like Groundhog Day.
Monday, February 05, 2024
The difficulties in supporting “write-only memory” in assembly
When I last wrote about this, I had one outstanding problem with static analysis of read-only/write-only memory, and that was with hardware that could be input or output only. It was only after I wrote that that I realized the solution—it's the same as a hardware register having different semantics on read vs. write—just define two labels with the semantics I want. So for the MC6821, I could have:
                org     $FF00
PIA0.A          rmb/r   1       ; read only
                org     $FF00
PIA0.Adir       rmb/w   1       ; write only, to set the direction of each IO pin
PIA0.Acontrol   rmb     1       ; control for port A
So that was a non-issue. It was then I started looking over some existing code I had to see how it might look. I didn't want to just jump into an implementation without some forethought, and I quickly found some issues with the idea by looking at my maze generation program. The code in question initializes the required video mode (in this case 64×64 with four colors). Step one involves writing a particular value to the MC6821:
                lda     #G1C.PIA        ; 64x64x4
                sta     PIA1.B
So far so good. I can mark PIA1.B
as write-only
(technically, it also has some input pins so I really can't,
but in theory I could).
Now, the next bit requires some explaining. There's another 3-bit value that needs to be configured on the MC6883, but it's not as simple as writing the 3-bit value to a hardware register—each bit requires writing to a different address, and worse—it's a different address if the bit is 0 or 1. So that's six different addresses required. It's not horrible though—the addresses are sequential:
| bit | 0/1 | address |
|-----|-----|---------|
| V0  | 0   | $FFC0   |
| V0  | 1   | $FFC1   |
| V1  | 0   | $FFC2   |
| V1  | 1   | $FFC3   |
| V2  | 0   | $FFC4   |
| V2  | 1   | $FFC5   |
Yeah,
to a software programmer,
hardware can be weird.
To set bit 0 to 0,
you do a write
(and it does not matter what the value is)
to address $FFC0.
If bit 0 is 1,
then it's a write to $FFC1.
So with that in mind,
I have:
                sta     SAM.V0 + (G1C.V & 1<<0 <> 0)
                sta     SAM.V1 + (G1C.V & 1<<1 <> 0)
                sta     SAM.V2 + (G1C.V & 1<<2 <> 0)
Ooh. Yeah.
I wrote it this way so I wouldn't have to look up the appropriate value and write the more opaque (to me):
                sta     $FFC1
                sta     $FFC2
                sta     $FFC4
The expression (G1C.V & 1<<n <> 0)
checks bit n to see if it's set or not,
and returns 0 (for not set) or 1 (for set).
This is then added to the base address for bit n,
and it all works out fine. I can change the code for,
say, the 128×192 four color mode by using a different constant:
                lda     #G6C.PIA
                sta     PIA1.B
                sta     SAM.V0 + (G6C.V & 1<<0 <> 0)
                sta     SAM.V1 + (G6C.V & 1<<1 <> 0)
                sta     SAM.V2 + (G6C.V & 1<<2 <> 0)
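The address arithmetic the assembler evaluates here is easy to reproduce. A quick Python sketch—the SAM.Vn base addresses come from the table above, and the assumption that G1C.V has only bit 0 set is inferred from the hand-written addresses, not from any official constant:

```python
# What "SAM.Vn + (v & 1<<n <> 0)" computes at assembly time: the "clear"
# address for bit n, plus one if bit n of the mode value is set.

SAM_V = [0xFFC0, 0xFFC2, 0xFFC4]   # "0" address for bits V0, V1, V2

def sam_store_addresses(v):
    # For each bit n, pick $FFC0+2n (bit clear) or $FFC0+2n+1 (bit set)
    return [SAM_V[n] + ((v >> n) & 1) for n in range(3)]

# A mode value of 0b001 yields the three addresses written out by hand
# in the more opaque version of this code:
assert sam_store_addresses(0b001) == [0xFFC1, 0xFFC2, 0xFFC4]
```

Since the whole thing is resolved at assembly time, changing the mode constant changes the store addresses with no runtime cost at all.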
But I digress.
This is a bit harder to support. The address being written is part of an expression, and only the label (defining the address) would have the read/write attribute associated with it. At least, that was my intent. I suppose I could track the read/write attribute by address, which would solve this particular segment of code.
And the final bit of code to set the address of the video screen (or frame buffer):
                ldx     #SAM.F6         ; point to frame buffer address bits
                lda     ECB.grpram      ; get MSB of frame buffer
mapframebuf     clrb
                lsla
                rolb
                sta     b,x             ; next bit of address
                leax    -2,x
                cmpx    #SAM.F0
                bhs     mapframebuf
Like the VDG Address Mode bits,
the bits for the VDG Address Offset have unique addresses,
and because the VDG Address Offset has seven bits,
the address is aligned to a 512 byte boundary.
Here,
the code loads the X
register with the address of the upper end of the VDG Address Offset,
and the seven topmost bits of the video address are sent,
one at a time,
to the B
register,
which is used as an offset to the X
register to set the appropriate address for the appropriate bit.
So now I would have to track the read/write attributes via the index registers as well.
That is not so easy.
I mean, here, it could work, as the code is all in one place, but what if instead it was:
                ldx     #SAM.F6
                lda     ECB.grpram
                jsr     mapframebuf
Or an even worse example:
constmessage    fcc/r   "A constant message"    ; read only text
buffer          rmb     18

                ldx     #constmessage
                ldy     #buffer
                lda     #18
                jsr     memcpy
The subroutine memcpy
might not even be in the same source unit,
so how would the read/write attribute even be checked?
This is for static analysis,
not runtime.
I have one variation on the maze generation program that generates multiple mazes at the same time, on the same screen (it's fun to watch) and as such, I have the data required for each “maze generator” stored in a structure:
explorec        equ     0       ; read-only
backtrackc      equ     1       ; read-only
xmin            equ     2       ; read-only
ymin            equ     3       ; read-only
xstart          equ     4       ; read-only
ystart          equ     5       ; read-only
xmax            equ     6       ; read-only
ymax            equ     7       ; read-only
xpos            equ     8       ; read-write
ypos            equ     9       ; read-write
color           equ     10      ; read-write
func            equ     11      ; read-write
This is from the source code, but I've commented each “field” as being “read-only” or “read-write.” That's another aspect of this that I didn't consider:
                lda     explorec,x      ; this is okay
                sta     explorec,x      ; this is NOT okay
Not only would I have to track read/write attributes for addresses, but for field accesses to a structure as well. I'm not saying this is impossible, it's just going to take way more thought than I thought. I don't think I'll have this feature done any time soon …
Tuesday, February 06, 2024
So you want to amplify my SEO
- From
- Krystal XXXXXXX <XXXXXXXXXXXXXXXXXXXX@gmail.com>
- To
- sean@conman.org
- Subject
- Amplify Your SEO with Strategic Link Inserts
- Date
- Wed, 7 Feb 2024 01:16:35 +0300
Hi Content Team,
It’s Krystal XXXXXXX here from Next Publisher, your next potential partner in digital storytelling. We're thrilled about the idea of featuring both guest posts and link insertions on your dynamic website.
We would like to know your fee structure for hosting guest posts and link insertions. Our team aims to create compelling content that is tailored to your site’s audience and enhances your overall content strategy.
A quick note: this initial email is only for starting our dialogue. All more detailed communications, including agreements and transactions, will be carried out through our official Next Publisher email.
We admire the quality of your platform and are excited to explore how we can work together for mutual benefit.
Looking forward to your prompt reply.
Warm wishes,
Krystal XXXXXXX
Hello Krystal.
Since you neglected to include a link
(in an email sent as HTML no less!)
it's hard for me to judge the value Next Publisher will provide for my site,
so I'm going to have to adjust my prices accordingly.
My fee for both guest posts and link insertions is $10,000 (US) per.
So if a guest post also includes a link insertion,
that would be a total of $20,000 (US).
If I'm going to be whoring selling out what Google Page Rank I have,
it's going to cost.
I look forward to hearing back from you.
Sean.
Okay! I'll answer your question, LinkedIn. Also, Orange Site! Orange Site! Orange Site!
Today's “you're one of the few experts invited to add this collaborative article” from LinkedIn is “How do you become a senior application developer?” My answer? Stay in a programming job for three years. Boom! You're a senior application developer. I know, I know, that's a big ask these days when everybody is jumping ship every two years. But hey, if you want to be a “senior application developer,” you got to make sacrifices.
Oh, and the title to today's post? I found out that LinkedIn really liked when I mentioned the Orange Site in the title of my post. Almost two orders of magnitude more. So I'm doing a test to see if I can game the system there.
Wednesday, February 07, 2024
Instead of “write-only memory” assembly support, how about floating point support?
You might think it odd to add support for floating point constants for an 8-bit CPU, but Motorola did development on the MC6839 floating point firmware for the MC6809, an 8K ROM of thread-safe, position-independent 6809 code that implements the IEEE Standard for Floating-Point Arithmetic. It was never formally released by Motorola as a product, but from what I understand, it was released later under a public domain license. At the very least, it's quite easy to find both the MC6839 ROM image and the source code on the Intarwebs. So that's one reason.
Another reason is that the Color Computer BASIC supports floating point operations, and while it's not IEEE-754, as it was written before the IEEE-754 standard became a standard, it's still floating point, and there are only minor differences between it and the current standard, namely the exponent bias, the number of fractional bits supported, and where the sign bit is stored. It really comes down to some bit manipulations to massage a standard float into the Color Computer BASIC float format. There are some differences, but the differences are small (literally, on the scale of 0.0000003), probably due to parsing differences, and small enough that it should be “good enough.” Especially since the Color Computer BASIC float format doesn't support infinity or NaN.
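To make the bit massaging concrete, here's a hedged Python sketch of the conversion as I read the format description: five bytes, an exponent byte biased so that 1.0 gets $81, then 32 mantissa bits whose top bit (the assumed leading one) is replaced by the sign. This is an illustration under those assumptions, not the assembler's actual code:

```python
# Sketch: Python float -> five-byte Color BASIC (Microsoft) float.
import math

def to_color_basic(x):
    if x == 0.0:
        return bytes(5)                  # an exponent byte of 0 means zero
    sign = 0x80 if x < 0 else 0x00
    m, e = math.frexp(abs(x))            # abs(x) = m * 2**e, 0.5 <= m < 1
    exponent = e + 128                   # 1.0 -> $81, 0.5 -> $80, 2.0 -> $82
    frac = round((2 * m - 1) * (1 << 31))  # 31 fraction bits, leading 1 dropped
    if frac == 1 << 31:                  # rounding overflowed the mantissa
        frac, exponent = 0, exponent + 1
    if not 1 <= exponent <= 255:
        raise OverflowError("out of range for Color BASIC")
    mant = (sign << 24) | frac           # sign replaces the assumed leading one
    return bytes([exponent]) + mant.to_bytes(4, "big")

assert to_color_basic(1.0)  == bytes([0x81, 0x00, 0x00, 0x00, 0x00])
assert to_color_basic(-1.0) == bytes([0x81, 0x80, 0x00, 0x00, 0x00])
assert to_color_basic(0.5)  == bytes([0x80, 0x00, 0x00, 0x00, 0x00])
```

Note there's no code path for infinity or NaN—the format simply has nowhere to put them, so an assembler would have to reject such values outright.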
So if you specify a backend other than the rsdos
backend,
you get IEEE-754,
and if you do specify rsdos
as a backend,
you get the Color Computer BASIC float format.
And yes,
I added support for floating point expressions
(but not for the test backend—I'm still thinking on how to support it),
and one interesting feature I added is the factorial operator “!”.
Factorials are used in Taylor series,
which the Color Computer BASIC uses for the sin()
function,
so I can literally write:
; Oh! '**' is exponentiation by the way!

taylor_series   .float  -((2 * 3.14159265358979323846) ** 11) / 11!
                .float   ((2 * 3.14159265358979323846) ** 9)  / 9!
                .float  -((2 * 3.14159265358979323846) ** 7)  / 7!
                .float   ((2 * 3.14159265358979323846) ** 5)  / 5!
                .float  -((2 * 3.14159265358979323846) ** 3)  / 3!
                .float  2 * 3.14159265358979323846
and have it generate the correct values. I personally don't know of any language that has a factorial operator (maybe APL? I don't know).
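As a sanity check on those values: the six .float entries are the Taylor coefficients of sin(2πt) in Horner order (highest power first), with the factorial operator doing the 11!, 9!, … work. The same table can be generated and evaluated in Python, with math.factorial standing in for “!”:

```python
# Generate the coefficients from the table above and evaluate the
# polynomial the way Color BASIC's sin() would: Horner's method over
# t*t, then one final multiply by t.
import math

TWO_PI = 2 * 3.14159265358979323846

# Signs alternate: -x^11/11!, +x^9/9!, ..., -x^3/3!, +x
coeffs = [(-1) ** ((k - 1) // 2) * TWO_PI ** k / math.factorial(k)
          for k in (11, 9, 7, 5, 3, 1)]

def sin_turns(t):
    # Approximates sin(2*pi*t) for t measured in turns (fractions of a circle)
    acc = 0.0
    for c in coeffs:
        acc = acc * (t * t) + c
    return acc * t

# A quarter turn is sin(pi/2) = 1, to within the series truncation error:
assert abs(sin_turns(0.25) - 1.0) < 1e-6
```

Truncating at the x¹¹ term leaves an error on the order of x¹³/13!, which for a quarter turn is well under a millionth—plenty for a 31-bit mantissa's practical use.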
I think I'm having more fun writing the assembler than I am writing assembly code.
Sunday, February 11, 2024
An extensible programming language
A few days ago I wrote about adding a factorial operator to my assembler, and I noted that I knew not of any other languages that had such a feature. So imagine my surprise as I'm reading about XL (via Lobsters) and the second example is factorial! Not only that, but that was an example of extending the language itself! The last time I was this excited about software was reading about Synthesis OS, a JIT-based operating system where you could create your own system calls.
How it handles precedence is interesting. In my assembler, I have left or right associativity as an explicit field, whereas in XL, it's encoded in the precedence level itself—even if it's left, odd if it's right. I'm not sure how I feel about that. On the one hand it feels nice and it's one less field to carry around; on the other, being explicit as I did makes it clear if something is left or right. But on the gripping hand, it sounds like matching precedence on a left and a right operator could lead to problems, so I still may have an explicitness problem.
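To see how the parity trick drops into a conventional precedence climber, here's a sketch in C. This is purely illustrative—a hypothetical toy grammar with single-digit operands and two stand-in operators ('-' as a left-associative operator, '^' as right-associative exponentiation)—not XL's actual parser:

```c
/* A sketch of XL-style parity-encoded associativity in a precedence-
 * climbing parser: an even precedence level means left-associative,
 * an odd level means right-associative. */

static long ipow(long base,long exp)
{
  long r = 1;
  while (exp-- > 0) r *= base;
  return r;
}

static int prec_of(char op)
{
  switch(op)
  {
    case '-': return 10; /* even => left-associative  */
    case '^': return 21; /* odd  => right-associative */
    default:  return -1; /* not an operator           */
  }
}

long parse_expr(const char **pp,int min_prec)
{
  long lhs = *(*pp)++ - '0'; /* single-digit operand for the sketch */

  while (prec_of(**pp) >= min_prec)
  {
    char op   = *(*pp)++;
    int  prec = prec_of(op);

    /* The parity decides how tightly the right operand binds:
     * left-associative recurses with prec+1, right with prec. */
    long rhs = parse_expr(pp,(prec % 2 == 0) ? prec + 1 : prec);

    lhs = (op == '-') ? lhs - rhs : ipow(lhs,rhs);
  }

  return lhs;
}
```

With this, "8-3-2" groups as (8-3)-2 while "2^3^2" groups as 2^(3^2), even though associativity is never stored anywhere—it falls out of the parity of the precedence level.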
But I digress.
It's a very simple language with only one keyword “is” and a user-definable precedence table. The parser generates a parse tree of only eight types, four leaf nodes (integer, real, text, name (or symbol)) and four non-leaf nodes (prefix, infix, postfix and block). And from there, you get XL.
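Those eight node types fit naturally in a small tagged union. This is my own sketch for illustration (my names and layout, not XL's actual implementation):

```c
/* A sketch of the eight-node parse tree described above: four leaf
 * types and four non-leaf types.  Names and layout are my own
 * guesses for illustration, not XL's actual implementation. */

typedef enum
{
  NODE_INTEGER , NODE_REAL  , NODE_TEXT    , NODE_NAME ,  /* leaves     */
  NODE_PREFIX  , NODE_INFIX , NODE_POSTFIX , NODE_BLOCK   /* non-leaves */
} node_type;

typedef struct node
{
  node_type type;
  union
  {
    long        integer;      /* NODE_INTEGER              */
    double      real;         /* NODE_REAL                 */
    const char *text;         /* NODE_TEXT, NODE_NAME      */
    struct
    {
      const char  *op;        /* operator text, or delimiters for a block */
      struct node *left;      /* operand / first child     */
      struct node *right;     /* second operand (infix)    */
    } expr;                   /* the four non-leaf types   */
  } u;
} node;
```

A postfix factorial like 5! would then simply be a NODE_POSTFIX whose child is a NODE_INTEGER—everything else in the language builds out of combinations of these eight shapes.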
This is something I definitely want to look into.
Wednesday, February 14, 2024
Notes from an overheard conversation from a car attempting a right turn
“Oh! Now what?”
“Sir, see the lit sign up there? You cannot turn right.”
“But did you not see the car right in front of me turning right?”
“Sir, if a person jumped off a bridge, would you follow?”
“Yes.”
“You must be very smart then.”
“And selective enforcement of the laws leads to distrust of the police.”
“Sir—”
“Oh look! The ‘No Right Turn’ sign is off now! Gotta go! Bye!”
“I don't think it's wise to taunt the police like that.”
“Down with the Man! Power to the people! Yo!”
Wednesday, February 28, 2024
Converting IEEE-754 floating point to Color BASIC floating point
I'm still playing around with floating point on the 6809—specifically, support for floating point for the Color Computer. The format for floating point for Color BASIC (written by Microsoft) predates the IEEE-754 Floating Point Standard by a few years and thus, isn't quite compatible. It's close, though. It's defined as an 8-bit exponent, biased by 129, a single sign bit (after the exponent) and 31 bits for the mantissa (the leading one assumed). It also does not support ±∞ nor NaN. This differs from IEEE-754 single precision, which uses a single sign bit, an 8-bit exponent biased by 127 and 23 bits for the mantissa (which also assumes a leading one), plus support for infinities and NaN. IEEE-754 double precision uses a single sign bit, an 11-bit exponent biased by 1023 and 52 bits for the mantissa (leading one assumed), plus support for infinities and NaN.
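The massaging can be sketched in C. This is my own illustration of the format just described, not the assembler's actual code—it truncates rather than rounds, and makes no attempt to handle infinities, NaN, subnormals, or exponents that don't fit:

```c
#include <stdint.h>
#include <string.h>

/* Sketch: massage an IEEE-754 double into the 5-byte Color BASIC
 * float format: an 8-bit exponent biased by 129, then the sign bit
 * sitting where the assumed leading 1 would be, then 31 mantissa
 * bits.  Truncates instead of rounding; ignores infinities, NaN,
 * subnormals and out-of-range exponents. */
void double_to_cbfloat(double d,unsigned char out[5])
{
  uint64_t bits;
  memcpy(&bits,&d,sizeof(bits));

  if (d == 0.0)                 /* zero is all zero bytes */
  {
    memset(out,0,5);
    return;
  }

  int      sign = (int)(bits >> 63);
  int      exp  = (int)((bits >> 52) & 0x7FF) - 1023;     /* unbias IEEE      */
  uint32_t man  = (uint32_t)((bits >> 21) & 0x7FFFFFFFu); /* top 31 of the 52
                                                             fraction bits    */

  out[0] = (unsigned char)(exp + 129);                 /* rebias for Color BASIC */
  out[1] = (unsigned char)((sign << 7) | (man >> 24)); /* sign + top mantissa    */
  out[2] = (unsigned char)(man >> 16);
  out[3] = (unsigned char)(man >>  8);
  out[4] = (unsigned char) man;
}
```

Feeding it 6.283185307 (the 2π constant) produces $83 $49 $0F $DA …—the same leading bytes as the ROM's 2π constant, with the last byte subject to truncation versus rounding differences.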
So the Color BASIC format is about halfway between single precision and double precision. This led me to use IEEE-754 double precision for the Color Computer backend (generating an error for infinities and NaN) then massaging the resulting double into the proper format. I double checked this by finding some floating point constants in the Color BASIC ROM as shown in the book Color BASIC Unravelled II (available on the Color Computer Archive), like this table:
4634                      * MODIFIED TAYLOR SERIES SIN COEFFICIENTS
4635 BFC7 05              LBFC7  FCB  6-1                  SIX COEFFICIENTS
4636 BFC8 84 E6 1A 2D 1B  LBFC8  FCB  $84,$E6,$1A,$2D,$1B  * -((2*PI)**11)/11!
4637 BFCD 85 28 07 FB F8  LBFCD  FCB  $86,$28,$07,$FB,$F8  * ((2*PI)**9)/9!
4638 BFD2 87 99 68 89 01  LBFD2  FCB  $87,$99,$68,$89,$01  * -((2*PI)**7)/7!
4639 BFD7 87 23 35 DF E1  LBFD7  FCB  $87,$23,$35,$DF,$E1  * ((2*PI)**5)/5!
4640 BFDC 86 A5 5D E7 28  LBFDC  FCB  $86,$A5,$5D,$E7,$28  * -((2*PI)**3)/3!
4641 BFE1 83 49 0F DA A2  LBFE1  FCB  $83,$49,$0F,$DA,$A2  * 2*PI
Then using the byte values to populate a variable and printing it inside BASIC (this is the expression -(2π)³/3!):
X=0           ' CREATE A VARIABLE
Y=VARPTR(X)   ' GET ITS ADDRESS
POKE Y,&H86   ' AND SET ITS VALUE
POKE Y+1,&HA5 ' THE HARD WAY
POKE Y+2,&H5D
POKE Y+3,&HE7
POKE Y+4,&H28
PRINT X       ' LET'S SEE WHAT IT IS
-41.3417023
Then using that to create a floating point value:
        org     $1000
        .float  -41.3417023
        end
Checking the resulting bytes that were generated:
                 | FILE ff.a
               1 |         org     $1000
1000: 86A55DE735 2 |         .float  -41.3417023
               3 |         end
And adjusting the floating point constant until I got bytes that matched:
                 | FILE ff.a
               1 |         org     $1000
1000: 86A55DE728 2 |         .float  -41.341702110
               3 |         end
I figure it's “close enough.” The parsing code in the Color BASIC ROM is old and predates the IEEE-754 floating point standard, so a few differing digits at the end are, I think, okay.
As a final check, I wrote the following bit of code to calculate and display -(2π)³/3!, display the pre-calculated result, as well as display the pre-calculated value of 2π:
                include "Coco/basic.i"
                include "Coco/dp.i"

CB.FSUBx        equ     $B9B9           ; FP0 = X - FP0         ; addresses for
CB.FSUB         equ     $B9BC           ; FP0 = FP1 - FP0       ; these routines
CB.FADDx        equ     $B9C2           ; FP0 = X + FP0         ; from
CB.FADD         equ     $B9C5           ; FP0 = FP1 + FP1       ; Color BASIC Unravelled II
CB.FMULx        equ     $BACA           ; FP0 = X * FP0
CB.FMUL         equ     $BAD0           ; FP0 = FP0 * FP1
CB.FDIVx        equ     $BB8F           ; FP0 = X / FP0
CB.FDIV         equ     $BB91           ; FP0 = FP1 / FP0
CB.FP0fx        equ     $BC14           ; FP0 = X
CB.xfFP0        equ     $BC35           ; X = FP0
CB.FP1f0        equ     $BC5F           ; FP1 = FP0
CB.FP0txt       equ     $BDD9           ; result in X, NUL terminated

                org     $4000
start           ldx     #tau            ; point to 2*pi
                jsr     CB.FP0fx        ; copy to FP0
                ldx     #tau            ; 2PI * 2PI
                jsr     CB.FMULx
                ldx     #tau            ; 2PI * 2PI * 2PI
                jsr     CB.FMULx
                jsr     CB.FP1f0        ; copy fp acc to FP1
                ldx     #fact3          ; point to 3!
                jsr     CB.FP0fx        ; copy to FP0
                jsr     CB.FDIV         ; FP0 = FP1 / FP0
                neg     CB.fp0sgn       ; negate result by flipping FP0 sign
                jsr     CB.FP0txt       ; generate string
                bsr     display         ; display on screen

                ldx     #answer         ; point to precalculated result
                jsr     CB.FP0fx        ; copy to FP0
                jsr     CB.FP0txt       ; generate string
                bsr     display         ; display

                ldx     #tau            ; now display 2*pi
                jsr     CB.FP0fx        ; just to see how close
                jsr     CB.FP0txt       ; it is.
                bsr     display
                rts

display.char    jsr     [CHROUT]        ; display character
display         lda     ,x+             ; get character
                bne     .char           ; if not NUL byte, display
                lda     #13             ; go to next line
                jsr     [CHROUT]
                rts

tau             .float  6.283185307
fact3           .float  3!
answer          .float  -(6.283185307 ** 3 / 3!)

                end     start
The results were:
-41.3417023
-41.3417023
 6.28318531
The calculation results in -41.3417023 and the direct result stored in answer also prints out -41.3417023, so that matches, reinforcing that my approach is nominally right.
But I think Microsoft had issues with either generating some of the floating point constants for the larger terms, or transcribing the byte values of the larger terms. Take for instance -(2π)¹¹/11!. The correct answer is -15.0946426, but the bytes in the ROM define the constant -14.3813907, a difference of about 0.7. And it's not like Color BASIC can't calculate that correctly—when I typed in the expression by hand, it was able to come up with -15.0946426.
Or it could be that Walter K. Zydhek, the author of Color BASIC Unravelled II, is wrong in his interpretation of the expressions used to generate the values, or his interpretation of what the values are used for. I'm not sure who is at fault here.
Update on Friday, March 1st, 2024
I was wrong about the authorship of Color BASIC Unravelled II. It was not Walter K. Zydhek, but some unknown author of Spectral Associates, a company that is no longer in business. All Zydhek did was to transcribe a physical copy of the book (which is no longer available for purchase anywhere) into a PDF and make it available.
Discussions about this entry
Friday, March 01, 2024
The speed of Microsoft's BASIC floating point routines
I was curious about how fast Microsoft's BASIC floating point routines were. This is easy enough to test, now that I can time assembly code inside the assembler. The code calculates -(2π)³/3! using Color BASIC routines, IEEE-754 single precision and double precision.
First, Color BASIC:
                .tron   timing
ms_fp           ldx     #.tau
                jsr     CB.FP0fx        ; FP0 = .tau
                ldx     #.tau
                jsr     CB.FMULx        ; FP0 = FP0 * .tau
                ldx     #.tau
                jsr     CB.FMULx        ; FP0 = FP0 * .tau
                jsr     CB.FP1f0        ; FP1 = FP0
                ldx     #.fact3
                jsr     CB.FP0fx        ; FP0 = 3!
                jsr     CB.FDIV         ; FP0 = FP1 / FP0
                neg     CB.fp0sgn       ; FP0 = -FP0
                ldx     #.answer
                jsr     CB.xfFP0        ; .answer = FP0
                .troff
                rts

.tau            fcb     $83,$49,$0F,$DA,$A2
.fact3          fcb     $83,$40,$00,$00,$00
.answer         rmb     5
                fcb     $86,$A5,$5D,$E7,$30     ; precalculated result
I can't use the .FLOAT directive here, since that only supports either the Microsoft format or IEEE-754, but not both. So for this test, I have to define the individual bytes per float. The last line is what the result should be (determined by checking a memory dump of the VM after running). Also, .tau is 2π, just in case that wasn't clear.
This ran in 8,742 cycles, taking 2,124 instructions at 4.12 cycles per instruction (I modified the assembler to record this additional information).
Next up, IEEE-754 single precision:
                .tron   timing
ieee_single     ldu     #.tau
                ldy     #.tau
                ldx     #.answer
                ldd     #.fpcb
                jsr     REG
                fcb     FMUL            ; .answer = .tau * .tau

                ldu     #.tau
                ldy     #.answer
                ldx     #.answer
                ldd     #.fpcb
                jsr     REG
                fcb     FMUL            ; .answer = .answer * .tau

                ldu     #.answer
                ldy     #.fact3
                ldx     #.answer
                ldd     #.fpcb
                jsr     REG
                fcb     FDIV            ; .answer = .answer / 3!

                ldy     #.answer
                ldx     #.answer
                ldd     #.fpcb
                jsr     REG
                fcb     FNEG            ; .answer = -.answer
                .troff
                rts

.fpcb           fcb     FPCTL.single | FPCTL.rn | FPCTL.proj
                fcb     0
                fcb     0
                fcb     0
                fdb     0
.tau            .float  6.283185307
.fact3          .float  3!
.answer         .float  0
                .float  -(6.283185307 ** 3 / 3!)
The floating point control block (.fpcb) configures the MC6839 to use single precision, normal rounding and projective closure (not sure what that is, but it's the default value). And it does calculate the correct result. It's amazing that code written 42 years ago for an 8-bit CPU works flawlessly. What it isn't is fast. This code took 14,204 cycles over 2,932 instructions (averaging 4.84 cycles per instruction). The higher-than-average cycle count could be due to position-independent addressing modes, but I'm not entirely sure what it's doing to take nearly twice the time. The ROM does use the IEEE-754 extended format (10 bytes) internally, with more bit shifts to extract the exponent and mantissa, but twice the time? Perhaps it's code to deal with ±∞ and NaNs.
The IEEE-754 double precision code is the same, except for the floating point control block configuring double precision and the use of .FLOATD instead of .FLOAT; otherwise the code is identical. The result, however, isn't. It took 31,613 cycles over 6,865 instructions (averaging 4.60 cycles per instruction). Being twice the size, it took nearly twice the time as single precision, which is expected.
The final bit of code just loads the ROMs into memory, and calls each function to get the timing:
        org     $2000
        incbin  "mc6839.rom"
REG     equ     $203D           ; register-based entry point

        org     $A000
        incbin  "bas12.rom"

        .opt    test prot rw,$00,$FF                    ; Direct Page for BASIC
        .opt    test prot rx,$2000,$2000+8192           ; MC6839 ROM
        .opt    test prot rx,$A000,$A000+8192           ; BASIC ROM

        .test   "BASIC"
        lbsr    ms_fp
        rts
        .endtst

        .test   "IEEE-SINGLE"
        lbsr    ieee_single
        rts
        .endtst

        .test   "IEEE-DOUBLE"
        lbsr    ieee_double
        rts
        .endtst
Really, the only surprising thing here was just how fast Microsoft BASIC was at floating point.
Monday, April 01, 2024
Notes on an overheard conversation while eating dinner at The Cracker Barrel
“Oh no. This is bad.”
“Wow! Are you sure there are no more moves you can make?”
“Nope. See?”
“How many pegs is that?”
“10.”
“Wow! You are really bad at that!”
[If you add up all the possible ways leaving just one peg in the Peg Game, including rotations, reflections and reflected rotations, you have 438,984 ways of solving the Peg Game. If you add up all the possible ways of leaving 10 pegs, including rotations, reflections and reflected rotations, you have just six solutions. It is a much harder problem leaving 10 pegs than leaving one. I'm just saying … —Editor]
Tuesday, April 02, 2024
It only took 25 years for my idea to catch on
I was exchanging emails with Christian about online document structure when I mentioned The Electric King James Bible and its rather unique addressing scheme. I came up with that 25 years ago [Good Lord! Has it been that long? —Sean] [Yes. —Editor] [Yikes! —Sean] to precisely pull up Bible verses—anywhere from one verse to an entire book. None of the Bible sites across the Intarwebs I've come across since have used such an elegant, and to me, obvious, way of referencing the Bible online. Usually they use a URL format like <https://bible.example.org/?bible=kj&book=Genesis&chapter=1&start_verse=1&end_verse=1>.
But Christian mentioned Sefaria as using my method, and true enough, it does! <https://www.sefaria.org/Genesis.6:9-9:17> does indeed go to the Noah's Ark story. I think that's neat! I don't know if they were inspired by my site (unlikely, but not completely out of the realm of possibility) or just came up with it on their own, but it's nice to see someone else using an easy-to-hack URL form for Bible references.
There are differences though—my site only brings up the requested material, whereas Sefaria implements a bidirectional “Scroll Of Doom” where additional material appears when you go up or down. I can't say I'm a fan of that, but it apparently works for them.
Dear LinkedIn, why are you still asking me these questions?
LinkedIn is still asking me to participate as an expert answering questions—this time, “You're a system architect. How do you decide which programming languages to learn?” And just below that is “Powered by AI and the LinkedIn community.”
Sigh. Et tu, LinkedIn?
I'm still tempted to answer, but no. I just can't bear to answer this how I would want to answer it. Besides, if you know where to look, you might find my answers anyway.
Wednesday, April 03, 2024
An excessive number of packaging layers
I ordered an item from Amazon the other day. The expected arrival time was Friday, but instead, it arrived today. On the front porch was an Amazon box, measuring 6″ × 9″ × 5″ (16cm × 23cm × 13cm for the more civilized amongst you). Inside was another box, 3″ × 4½″ × ⅜″ (7cm × 11cm × 1cm). Inside that was a slightly smaller anti-static bag. Inside that was a smaller plastic bag, and finally, inside that was the item I had purchased—a replacement battery for my old-school flip phone.
Seriously? Four layers of packaging? Sigh.
“Because this kind of battery is encrypted …”
So I'm reading the “Battery Replacement Installation Manual” for the battery I just bought and as translated instructions go, it's not that bad. But there are some choice bits though …
Why does the phone echo?
The echo of the phone may be due to the installation problem. Can you see if there are any loose parts, because the battery will not affect the quality of the phone's call unless there is no power and cause the phone shut down.
“The echo of the phone?”
Feedback? Hearing my own voice echoed back to me? Maybe?
Anyway, carrying on …
Why did I receive a swollen battery?
Because this kind of battery is encrypted …
I have no clue here. It states that swelling may occur if the temperature exceeds 158°F (70°C), and that the battery enters sleep mode if the temperature is too low, although it doesn't state what “too low” means. Fortunately, the battery I received isn't swollen, so I guess it's not encrypted?
4. Please carefully check whether there is any debris or screws falling into the battery area. If there is, please clean it up before proceeding to the next step, otherwise the sundries may pierce the battery and cause a short circuit and cause spontaneous combustion.
“Sundries.” Love it!
Thursday, April 04, 2024
Tracking down a bug
I've spent the past two days tracking down a bug, and I think it's a library issue.
So I have this program I wrote some time ago that uses Xlib and for reasons, I needed to store a 64-bit value that's related to a window. This is easy enough with setting a window property. The code for that is easy enough:
void svalue(Display *display,Window window,unsigned long long int value)
{
  assert(display != NULL);
  assert(window  != None);

  XChangeProperty(
    display,
    window,
    CALC_VALUE,
    XA_INTEGER,
    32,                 /* format */
    PropModeReplace,
    (unsigned char *)&value,
    sizeof(value) / 4   /* number of 'format' units */
  );
}
CALC_VALUE is the “variable” (for lack of a better term) and XA_INTEGER is (again, for lack of a better term) the base type. Yes, this is just wrapping a single function call in a function, but it's an abstraction to make things simpler, as it's called from multiple locations in the codebase.
To query the value:
unsigned long long int qvalue(Display *display,Window window)
{
  assert(display != NULL);
  assert(window  != None);

  unsigned long long int  value;
  Atom                    atom_got;
  unsigned char          *plong;

  int rc = XGetWindowProperty(
    display,
    window,
    CALC_VALUE,
    0,
    sizeof(unsigned long long int) / 4,
    False,
    XA_INTEGER,
    &atom_got,
    &(int){0},               /* this is don't care */
    &(unsigned long int){0}, /* another don't care */
    &(unsigned long int){0}, /* another don't care */
    &plong
  );

  if ((rc == Success) && (atom_got == XA_INTEGER))
  {
    memcpy(&value,plong,sizeof(unsigned long long int));
    XFree(plong);
  }
  else
    value = 0;

  return value;
}
Again, nothing too horrible or tricky.
The code was originally written on a 32-bit system (just after I left The Enterprise), and it worked. I then wanted to get the program working on a 64-bit system (because I want to both release it and talk about it). It worked, but only for values of 31 bits or less. As soon as the value hit 32 bits, the upper 32 bits were all 1s.
I added code to dump the value just before the call to XChangeProperty() and just after the call to XGetWindowProperty(), and somewhere along the way, a value of 0x00000000FFFFFFFF going into XChangeProperty() came out of XGetWindowProperty() as 0xFFFFFFFFFFFFFFFF.
32-bit version? No issues. 64-bit version? Issues.
I tried a different compiler, on the off chance that I might be hitting some weird compiler bug, and no go—GCC or Clang, both on the 64-bit system, had the same issue. I tried using a different X server and got the same results: 32-bit client, fine; 64-bit client, not fine. So I think the issue lies on the client side of the 64-bit system.
Also, if I change the call to XChangeProperty() to:
void svalue(Display *display,Window window,unsigned long long int value)
{
  assert(display != NULL);
  assert(window  != None);

  XChangeProperty(
    display,
    window,
    CALC_VALUE,
    XA_INTEGER,
    8,                  /* format, this time 8! */
    PropModeReplace,
    (unsigned char *)&value,
    sizeof(value)       /* still number of 'format' units */
  );
}
That is, a format of 8 fixed the issue. Even a format of 16 worked. It's only when I use a format of 32 on the 64-bit system that it fails.
And using a format of 8 on the 32-bit system works as well, so at least I have a workaround for it. Still, it's annoying.
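One possible explanation (and I'm hedging here, since I haven't confirmed it's what bit me): the Xlib documentation states that for format 32, the client-side property data is represented as an array of C long—both going in through XChangeProperty() and coming back from XGetWindowProperty()—and on a 64-bit system a long is 8 bytes, not 4. If that's the case, a 64-bit value would need to be split into two longs before storing. The marshaling can be sketched without any X calls at all (this is just the packing Xlib would expect, not my program's actual code):

```c
/* Sketch: marshal a 64-bit value to and from two format-32 property
 * units, each carried in a C long the way Xlib represents format-32
 * data client-side (8 bytes per unit on a 64-bit system). */

void pack32(unsigned long long value,long units[2])
{
  units[0] = (long)( value        & 0xFFFFFFFFuLL);  /* low  32 bits */
  units[1] = (long)((value >> 32) & 0xFFFFFFFFuLL);  /* high 32 bits */
}

unsigned long long unpack32(long const units[2])
{
  /* mask each unit back to 32 bits to avoid sign-extension surprises */
  return ( (unsigned long long)units[0] & 0xFFFFFFFFuLL)
       | (((unsigned long long)units[1] & 0xFFFFFFFFuLL) << 32);
}
```

The masking in unpack32() is the important bit—it's exactly the kind of sign-extension that would turn 0x00000000FFFFFFFF into 0xFFFFFFFFFFFFFFFF if omitted.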
I love it when abstractions are too abstract to be useful
I recently found an annoying aspect of Xlib—it's hard to find documentation about which keys affect the state field of the keyboard event. It's obvious that the shift keys on the keyboard will set ShiftMask, the control key will set ControlMask, and the CapsLock key will set LockMask (when I would expect it to set ShiftMask, since it's just locking the shift keys to “on”), but there's little to say which keys set Mod1Mask, Mod2Mask, Mod3Mask, Mod4Mask and Mod5Mask.
This is problematic, because I do need to check for keyboard events, and this threw me for a loop—why were none of the keys working? Well, that's because my virtual Linux server on the Mac sets the NumLock key, which causes the X server to set Mod2Mask for all keyboard events, and I wasn't expecting that.
Sigh.
Friday, April 05, 2024
Matchbox cars seem to have gotten bigger in recent years
Bunny and I went to a local Toyota dealership to fix an issue with her car (it turns out it was a very unusual, but very minor, issue) and while there, we saw this on the display floor:
Turns out, this is not a large Matchbox car, but a small electric car straight from a factory in Japan (the informational flyer under the windshield is all in Japanese). A five year old would barely fit in this thing, much less an adult. There doesn't appear to be any storage space of any significant size, and sans doors, I'm not sure this is even road legal. And the staff there don't even know if it's for sale. Weird.
Saturday, May 11, 2024
How to measure ⅚ cup of oil, part III
I just received a nice email from Muffintree14 thanking me for helping them make a recipe where they needed to measure out ⅚ of a cup! They were trying to measure out 200ml of something (they didn't specify what), and it turns out that 200ml is about ⅚ of a cup. I suspect they could have just used a regular cup, as that's 237ml. As long as you aren't baking bread (or some other pastry-like food item), 200ml is close enough to 1 cup that you might as well use 1 cup.
But then I found an image (via Bob Anstett on TikLinkedMyFacePinInstaMeGramSpaceWeInTokTrestBook) describing the various relationships among Imperial units, and from there, I found a much better way to measure ⅚ cup—measure out 1 cup, then remove 8 teaspoons; much better than the two ⅓-cup measures (or one ⅔-cup if available), a 1½ tablespoon and a ½ teaspoon. And maybe this will help someone else twenty years down the line.
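For the curious, the teaspoon arithmetic behind the “1 cup minus 8 teaspoons” trick is easy to verify, assuming US customary units (1 cup = 16 tablespoons, 1 tablespoon = 3 teaspoons):

```c
/* Verify the 5/6-cup trick in US customary units:
 * 1 cup = 16 tablespoons = 48 teaspoons, so
 * 5/6 cup = 40 teaspoons = 1 cup minus 8 teaspoons. */
int five_sixths_cup_in_teaspoons(void)
{
  int cup_tsp = 16 * 3;    /* 48 teaspoons per cup */
  return cup_tsp * 5 / 6;  /* 40 teaspoons         */
}
```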
Monday, May 13, 2024
Remembrance of enlightened palms past
The image at the bottom of this page reminds me of the time I used to photograph enlightened palms, but it never occurred to me that one could enlighten trees with fireflies (we don't get fireflies down here in Lower Sheol, which may be the reason why). The pictures I took with the Christmas lights used an exposure of a few seconds; I wonder how long an exposure was used for the firefly photo.
Tesla, Edison, and who actually fought the War of Currents?
I used to think Thomas Edison was a self-aggrandizing businessman who took the credit for the inventions his employees made, and Nikola Tesla was the real deal—a genius inventor who was actually responsible for most of our technology based on electricity. But now? Having watched the 4½-hour-long video “Most Everything You Know About Nikola Tesla and Thomas Edison is Probably Wrong” (and yes, it's four and a half hours long!) I'm not so sure my assessment is correct. The long video goes deep into the history of Tesla, Edison, and the War of the Currents, where it wasn't Tesla vs. Edison, but Westinghouse (the company) vs. Edison (the company).
Tesla might have been a genius, but not all of his theories about physics and electronics were correct, and later in life he went a bit … crazy … to say the least (he fell in love with a pigeon and said he created incredible inventions without having actually … you know … built the incredible inventions). And Edison might have been a self-aggrandizing businessman, but he credited his team, and often times his team didn't invent the technology, but improved upon existing designs (to the point where he learned 6,000 ways not to build a lightbulb).
And the whole thing about Edison electrocuting an elephant (or at least animals) to show how dangerous alternating current was? Eh … not exactly. And he did not invent the electric chair.
Yes, it's a long video, but if you are interested at all in Tesla and/or Edison, it's worth the time to watch. It got me to rethink how I think about Tesla and Edison.
Wednesday, May 15, 2024
Extreme Monopoly Board Game Knockoff, Boca Raton edition
About two weeks ago I was at a local Walgreens in Boca Raton when I came across something unusual. I meant to blog about it then, but alas, I just now got a round tuit.
Anyway, what I found:
I'm amazed this even exists! I wonder whose idea this even was? The Boca Raton Chamber of Commerce?
Anyway, it's clearly a knockoff of Monopoly, as you won't find it for sale at Hasbro. It's actually made by Late for the Sky, which seems to make games based off Monopoly, or should I say, The Landlord's Game, which is completely in the public domain (wink wink, nudge nudge, say no more, say no more), unlike Monopoly. But Boca Raton Opoly sure looks like Monopoly, walks like Monopoly, and probably quacks like Monopoly, so I wonder how they get away with this?
Perhaps by flying under the radar of Hasbro?
Update later this day
Apparently, Hasbro doesn't care:
Leaders at Late for the Sky say Monopoly gameplay is not copyrighted, meaning any version of the game can be created as long as the board, pieces and names within the game are different from the original version.
Via my friend Jeff Cuscutis on LinkedPinMyFaceTikInstaMeTrestWeGramBookInTokSpace, Business making Monopoly games based on Carolina towns
Monday, May 27, 2024
How does TLS use less CPU than plain TCP?
I have two services written in Lua—a gopher server and a Gemini server. They both serve roughly the same data (mainly my blog), and yet the gopher server accumulates more CPU time than the Gemini server, despite the Gemini server using TLS and serving more requests. And not by a little bit, either:
gopher | 17:26 |
Gemini | 0:45 |
So I started investigating the issue.
It wasn't TCP_NODELAY (via Lobsters), as latency wasn't the issue (but I disabled Nagle's algorithm anyway).
Looking further into the issue, it seemed to be one of buffering. The code was not buffering any data with TCP; furthermore, the code was issuing tons of small writes. My thinking here was—of course! The TCP code was making tons of system calls, whereas the TLS code (thanks to the library I'm using) must be doing the buffering for me.
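The fix is the classic userspace write buffer: coalesce the small writes and only issue a system call when the buffer fills (or on an explicit flush). My servers are in Lua, but the idea in C looks something like this (a sketch of the technique, not my actual server code):

```c
#include <string.h>
#include <unistd.h>

/* A write buffer that coalesces many small writes into one
 * write(2) call, flushing when the buffer fills or on demand. */

typedef struct
{
  int    fd;
  size_t len;
  char   buf[4096];
} wbuf;

int wbuf_flush(wbuf *b)
{
  size_t off = 0;

  while (off < b->len)          /* handle short writes */
  {
    ssize_t rc = write(b->fd,b->buf + off,b->len - off);
    if (rc < 0) return -1;
    off += (size_t)rc;
  }

  b->len = 0;
  return 0;
}

int wbuf_write(wbuf *b,void const *data,size_t n)
{
  char const *s = data;

  while (n > 0)
  {
    size_t room = sizeof(b->buf) - b->len;
    size_t take = n < room ? n : room;

    memcpy(b->buf + b->len,s,take);
    b->len += take;
    s      += take;
    n      -= take;

    if (b->len == sizeof(b->buf))       /* full--time for a syscall */
      if (wbuf_flush(b) < 0)
        return -1;
  }

  return 0;
}
```

A thousand 10-byte writes then becomes a handful of write(2) calls instead of a thousand, which is where the CPU time was going.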
So I added buffering to the gopher server, and now, after about 12 hours (where I restarted both servers) I have:
gopher | 2:25 |
Gemini | 2:13 |
I … I don't know what to make of this. Obviously, things have improved for gopher, but did I somehow make Gemini worse? (I did change some low-level code that both TCP and TLS use; I use full buffering for TCP and no buffering for TLS.) Or is the load just more evenly spread?
It's clear that gopher is still accumulating more CPU time, just not as bad as it was. Perhaps more buffering is required? I'll leave this for a few days and see what happens.
Thursday, May 30, 2024
Profile results are as expected as the Spanish Inquisition
I'm not upset at rewriting the code that handles the network buffering, as it needed work, but I'm still seeing a disproportionate amount of CPU time accumulate on the supposedly simpler protocol, gopher. The most popular requests of both my gopher server and Gemini server are entries from my blog, so I took a look at the code that handles such requests in both servers. Yes, the gopher server has a bit more code dealing with links than the Gemini server (because gopher URLs are almost, but not entirely, like http URLs—and the small differences are annoying), but I'm not seeing anything that stands out. Yes, the code is not quite twice as much, but the CPU utilization is more than three times as much (as of writing this).
I have no other choice at this point, and I constantly relearn this lesson over and over again: if I'm looking into a performance issue, profile the code in question! Profile profile profile!
The code is in Lua and, as it happens, I've profiled Lua code before. First, I want to answer this question: how much code does it take to serve a request? I figure measuring the lines of code run is a good answer to that—it gives me a baseline. And the code to answer it is a very easy four-line change to each server:
local function main(iostream)
  local count = 0
  debug.sethook(function() count = count + 1 end,'line')

  -- The rest of the main code

  debug.sethook()
  syslog('notice',"count=%d",count)
end
I fire up the servers locally, make a decently sized request to each, and I get my results:
gopher | 457035 |
gemini | 22661 |
WHAT THE LITERAL XXXX‽
[Well, there's your problem! —Editor] [Just … gaaaaaaah! —Sean]
I'm constantly surprised at the results of profiling—it's almost never what I think the problem is. And here, it's clear that I messed up pretty bad somewhere in the gopher code.
Now off to more profiling to see where it all goes pear shaped.
Unicode. Why did it have to be Unicode?
Well, I have my answer. I first found a smaller request that exhibits the behavior, so as not to generate half a million lines of output:
gopher | 2549 |
gemini | 564 |
Good. Two and a half thousand lines of code is tractable. Now, just to show how easy it is to profile Lua code, here's the code I'm using for my gopher server:
local profile = {}

local function doprofile()
  local info = debug.getinfo(2)
  local name = info.name or "..."
  local file = info.source or "@"
  local key  = string.format("%s$%s(%d)",file,name,info.currentline)
  if not profile[key] then
    profile[key] = 1
  else
    profile[key] = profile[key] + 1
  end
end
For each line of code executed, we get the filename, the function name and the line of code that's executing, turn that into a key, and use that to count the number of times that line of code is executed. Easy. And then some code to dump the results:
local function cleanup()
  local results = {}

  for name,value in pairs(profile) do
    results[#results + 1] = { file = name , count = value }
  end

  table.sort(results,function(a,b)
    if a.count > b.count then
      return true
    elseif a.count < b.count then
      return false
    else
      return a.file < b.file
    end
  end)

  local f = io.open("/tmp/dump.txt","w")
  for i = 1 , #results do
    f:write(string.format("%6d %s\n",results[i].count,results[i].file))
  end
  f:close()
end
We sort the results based on line count, then alphabetically by key. And like before:
local function main(iostream)
  debug.sethook(doprofile,'line')

  -- The rest of the main code

  debug.sethook()
  cleanup()
end
I make the request and get some results:
   215 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(202)
   211 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(203)
   211 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(204)
   211 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(268)
   210 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(282)
   210 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(283)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(219)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(224)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(239)
   169 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(240)
    42 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(205)
    42 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(206)
    42 @/usr/local/share/lua/5.4/org/conman/string.lua$wrapt(207)
    17 @port70.lua$...(272)
    17 @port70.lua$...(273)
     9 @/usr/local/share/lua/5.4/org/conman/net/ios.lua$write(547)
...
Oh.
Yeah.
That.
Obvious in hindsight.
I completely forgot about that.
Okay.
The function in question, wrapt(), wraps text, and it's a rather heavy function due to Unicode (and I'm not even following the full specification there). This is the major difference between the gopher and Gemini servers—I don't wrap text for Gemini (the clients handle that). I guess I'll have to drop down to C if I want to speed this up.

Sigh.
Saturday, June 01, 2024
Falettinme Be Mice Elf Ag
Bunny and I were in my car when this song from the Sly and the Family Stone came on. On the screen was the title, “Thank You (Falettinme Be Mice Elf Ag)”. Bunny attempted to read it and was having a hard time with it; I suggested that perhaps XM Radio had used some form of audio transcription that went wrong somewhere, because Sly and the Family were clearly singing “thank you for letting me be myself.”
I was about to make a satirical post about it, but when I was looking up the song … well … that's the actual name! Okay, it's actually “Thank You (Falettinme Be Mice Elf Agin)” (the screen wasn't wide enough for the entire title). But still … that's the actual name of the song! Sheesh!
Monday, June 03, 2024
Just a simple matter of replacing a slow Lua function with a faster C function
I spent the past few days rewriting some Lua code into C. While I find LPEG convenient, it is not necessarily fast. Normally this isn't an issue, but in this case I was calling LPEG for each character in a blog post.
Fortunately, it was fairly straightforward porting the code to C. The code goes through the text a codepoint at a time. If it's a whitespace character or a hyphen, I mark the current position as a possible breakpoint for the text; otherwise I ignore combining characters (they don't count towards the line length). Then, when I reach past the number of characters I want for a line, I copy out the string from the beginning of the “line” to the marked breakpoint (and if there isn't a breakpoint, there is no good place to break the line, so I break the line at the line length—not much else to do), then mark the beginning of the next line and continue until the end of the text.
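That loop can be sketched in C. This is a simplified, ASCII-only illustration of the algorithm just described, with my own names; the real wrapt() in org.conman.string counts Unicode codepoints and skips combining characters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, ASCII-only sketch of the wrapping loop; here a
 * "character" is just a byte.  Breaks `text` into lines of at most
 * `width` characters, preferring the last space or hyphen seen, and
 * writes the '\n'-separated result into `out`. */
void wrap_ascii(const char *text, size_t width, char *out)
{
  size_t len   = strlen(text);
  size_t front = 0;                    /* start of the current line */
  size_t o     = 0;

  while (len - front > width)
  {
    size_t brk = front;                /* == front means "no breakpoint" */

    for (size_t i = front; i < front + width; i++)
      if (text[i] == ' ')
        brk = i;                       /* break at the space */
      else if (text[i] == '-')
        brk = i + 1;                   /* break after the hyphen */

    /* no good place to break?  break at the line length */
    size_t end = (brk > front) ? brk : front + width;

    memcpy(out + o,text + front,end - front);
    o += end - front;
    out[o++] = '\n';

    front = end;
    while (front < len && text[front] == ' ')
      front++;                         /* don't start a line with spaces */
  }

  memcpy(out + o,text + front,len - front);
  out[o + (len - front)] = '\0';
}
```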
The hardest part was figuring out how to classify each character I needed. In the end, I pull out each Unicode codepoint from UTF-8 and look through an array to classify the codepoint as whitespace, a hyphen or a combining character; if it isn't in the table, it's just a normal character.
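A sketch of that classification step, under the assumption of a tiny lookup table (the real array covers far more of Unicode, and real code should also validate the UTF-8); utf8_next() and classify() are my names, not the actual code's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { CH_NORMAL, CH_SPACE, CH_HYPHEN, CH_COMBINE } chclass;

/* Decode the UTF-8 sequence at *s and advance *s past it.
 * Assumes valid UTF-8 input. */
uint32_t utf8_next(const unsigned char **s)
{
  const unsigned char *p = *s;
  uint32_t cp;
  int      n;

  if      (p[0] < 0x80) { cp = p[0];        n = 1; }  /* ASCII          */
  else if (p[0] < 0xE0) { cp = p[0] & 0x1F; n = 2; }  /* two-byte seq   */
  else if (p[0] < 0xF0) { cp = p[0] & 0x0F; n = 3; }  /* three-byte seq */
  else                  { cp = p[0] & 0x07; n = 4; }  /* four-byte seq  */

  for (int i = 1; i < n; i++)
    cp = (cp << 6) | (p[i] & 0x3F);

  *s += n;
  return cp;
}

/* tiny stand-in for the classification array */
chclass classify(uint32_t cp)
{
  static const struct { uint32_t cp; chclass c; } tab[] =
  {
    { 0x0020, CH_SPACE   }, /* SPACE                  */
    { 0x002D, CH_HYPHEN  }, /* HYPHEN-MINUS           */
    { 0x0301, CH_COMBINE }, /* COMBINING ACUTE ACCENT */
    { 0x2010, CH_HYPHEN  }, /* HYPHEN                 */
    { 0x3000, CH_SPACE   }, /* IDEOGRAPHIC SPACE      */
  };

  for (size_t i = 0; i < sizeof(tab) / sizeof(tab[0]); i++)
    if (tab[i].cp == cp)
      return tab[i].c;

  return CH_NORMAL;                    /* not in the table: normal */
}
```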
As a sanity check, I reran the original profiling test:
gopher (original) | 457035 |
gopher (new) | 18246 |
gemini (just because) | 22661 |
Much better.
And most of the 457,035 lines of code being executed are now hidden behind C.
Now to make sure the code is actually faster, I profiled the new wrapt() function:
local wraptx = wrapt

local function wrapt(...)
  local start = rdtsc()
  local res   = wraptx(...)
  local stop  = rdtsc()
  syslog('notice',"wrapt()=%d",stop-start)
  return res
end
with the decently sized request I used before (each line is a call to wrapt()):
Lua code | C code |
---|---|
43330 | 11810 |
43440 | 12000 |
45300 | 12220 |
48100 | 12020 |
48680 | 13690 |
49260 | 12650 |
54140 | 12270 |
54650 | 12460 |
58530 | 12130 |
59760 | 14180 |
61100 | 15480 |
65440 | 14970 |
67920 | 15810 |
68750 | 15310 |
69920 | 17170 |
69960 | 17780 |
70740 | 16510 |
75640 | 16750 |
78870 | 19170 |
83200 | 18190 |
87090 | 17290 |
89070 | 23360 |
91440 | 19560 |
101800 | 21520 |
102460 | 21060 |
103790 | 22180 |
106000 | 22400 |
106010 | 21870 |
112960 | 21160 |
115300 | 21870 |
115980 | 23130 |
118690 | 24980 |
122550 | 23960 |
122710 | 24550 |
127610 | 23830 |
129580 | 24670 |
130120 | 24930 |
140580 | 26570 |
141930 | 25210 |
157640 | 27050 |
168000 | 32250 |
Excellent! The new code is three to five times faster. Now to just sit back and see how the new code fares over the next few days.
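As an aside, the rdtsc() used in that profiling wrapper is a binding to the CPU's timestamp counter. A hedged C sketch of the same measuring pattern (x86 with GCC/Clang only; cycles_for() and spin() are my own names, not the actual binding):

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc() -- x86 with GCC/Clang only */

/* Time a function in (approximate) CPU cycles, in the spirit of the
 * rdtsc()-based wrapper above.  The counter isn't serialized with
 * CPUID/LFENCE here, so treat results as ballpark figures. */
uint64_t cycles_for(void (*fn)(void))
{
  uint64_t start = __rdtsc();
  fn();
  return __rdtsc() - start;
}

/* a trivial workload to measure */
void spin(void)
{
  for (volatile int i = 0; i < 1000; i++)
    ;
}
```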
Wednesday, June 05, 2024
I wonder who isn't getting the point
Yes, there are a few groups I follow on Discord. One of the groups has a link to another site I want to view and I click on it. I get:
Leaving Discord
This link is taking you to the following website
https://github.com/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Trust github.com links from now on
God, I just love XXXXXXX walled gardens! I just love how it's there to protect me. You mean … a link … goes to a non-Discord site? Oh my God! Thank you for warning me! My computer might have been pwned by Microsoft!
Sheesh.
Thursday, June 06, 2024
Stats on some optimized small internet servers
I restarted my Gopher and Gemini servers about 25 hours ago to ensure they're both running the new optimized code, so now let's check some results:
service | CPU utilization | requests | bytes |
---|---|---|---|
gopher | 2:42 | 935 | 1186591 |
gemini | 8:58 | 36394 | 249020812 |
That's more like it. The gopher server, running over plain TCP and getting about 1/40th the requests, is finally accumulating less CPU time than Gemini (and the only reason it's not even less is that my gopher server has to deal with wrapping Unicode text).
I also find it amusing that Gemini, a protocol that has only been around for five years, is way more popular than gopher, a protocol that's been around for thirty-three years. I guess Gemini being the new shiny is the reason.
Wednesday, June 12, 2024
Just when I thought it was safe to run a gopher server
It appeared I had traded the problem of high CPU usage for a memory leak. Last night I saw:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
gopher 31875 0.2 13.3 2174076 2171960 ttyp4 S+ Jun09 7:08 lua port70.lua /home/spc/gopher/config.lua
(For the record—the VSZ and RSS values are the number of 4,096 byte blocks of memory, for various reasons beyond the scope of this post.)
and just before lunch today:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
gopher 1572 0.2 11.1 1809672 1807644 ttyp4 S+ 04:30 1:07 lua port70.lua /home/spc/gopher/config.lua
Not good. Especially when it's normally like:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
gopher 20711 0.2 0.0 10588 9000 ttyp4 S+ 17:52 0:06 lua port70.lua /home/spc/gopher/config.lua
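As an aside, a long-running server can log this figure itself rather than being eyeballed with ps: on Linux, the second field of /proc/self/statm is the resident set size in pages. A small sketch (the function name is mine):

```c
#include <assert.h>
#include <stdio.h>

/* Read this process's resident set size, in pages, from
 * /proc/self/statm (Linux-only).  Returns -1 on error.  Useful for
 * logging memory growth from inside a long-running server. */
long rss_pages(void)
{
  long  size;
  long  resident;
  FILE *fp = fopen("/proc/self/statm","r");

  if (fp == NULL)
    return -1;
  if (fscanf(fp,"%ld %ld",&size,&resident) != 2)
    resident = -1;
  fclose(fp);
  return resident;
}
```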
And then there was this in the logs:
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: coroutine thread: 0x8400404 dead: not enough memory
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: stack traceback:
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: [C]: in function 'org.conman.string.wrapt'
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: ...e/lua/5.4/org/conman/app/port70/handlers/blog/format.lua:34: in upvalue 'wrap_text'
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: ...e/lua/5.4/org/conman/app/port70/handlers/blog/format.lua:119: in upvalue 'run_flow'
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: ...e/lua/5.4/org/conman/app/port70/handlers/blog/format.lua:598: in function 'org.conman.app.port70.handlers.blog.format'
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: ...al/share/lua/5.4/org/conman/app/port70/handlers/blog.lua:227: in function <...al/share/lua/5.4/org/conman/app/port70/handlers/blog.lua:210>
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: (...tail calls...)
Jun 12 09:47:54 daemon err 71.19.142.20 gopher CRASH: thread: 0x8400404: port70.lua:238: in function <port70.lua:205>
Yeah, “not enough memory” while running the new C-based text wrapping function.
I added a few checks to see if wrapt() wasn't updating the character indices correctly (and to log if they weren't), then went for lunch. Afterwards, I checked, and lo', memory was still growing, and none of the checks I added had triggered.
Hmmm. Time to take down the gopher server and debug some code.
Sure enough, I found the one entry that went off the rails. What followed was a few hours of troubleshooting to find out why that one entry (and only that entry) blows up. And it came down to a difference of semantics between Lua and C (no, it had nothing to do with C's 0-based indexing and Lua's 1-based indexing). Basically, the Lua code looked like:
local breakhere -- this is nil

...

if ctype == 'space' then
  breakhere = i -- this is now an integer
  ...
end

if breakhere then
  table.insert(result,s:sub(front,breakhere - 1))
  ...
end

breakhere = nil
In Lua, only the values nil and false are considered “false”—a 0 value is “true” in Lua. In C, any 0 value is considered “false.” I knew this when translating the code, but it never occurred to me that the breakpoint of a line could be at the start of a line. Until it was at the start of a line in that one entry—the code went into an infinite loop trying to wrap text, thus causing the memory usage to consume the computer.
Sigh.
It was easy enough to fix once the problem was diagnosed. And maybe now things will get back to normal.
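The trap is easy to reproduce in C. A contrived sketch (not the actual wrapt() code): find_break_buggy() uses 0 to mean “no breakpoint,” so it loses a legitimate breakpoint at offset 0, while find_break_fixed() tracks “found” separately, playing the role of the Lua nil test:

```c
#include <assert.h>
#include <stddef.h>

/* Both functions return the offset of the last space within the first
 * `width` characters of s, or -1 if there is none. */

/* buggy: 0 doubles as "no breakpoint found" */
int find_break_buggy(const char *s, size_t width)
{
  size_t brk = 0;                  /* 0 also means "none" -- the bug */

  for (size_t i = 0; i < width && s[i] != '\0'; i++)
    if (s[i] == ' ')
      brk = i;
  return brk ? (int)brk : -1;      /* a break at offset 0 reads as "none" */
}

/* fixed: a separate flag distinguishes "offset 0" from "not found" */
int find_break_fixed(const char *s, size_t width)
{
  int    found = 0;
  size_t brk   = 0;

  for (size_t i = 0; i < width && s[i] != '\0'; i++)
    if (s[i] == ' ')
    {
      brk   = i;
      found = 1;
    }
  return found ? (int)brk : -1;
}
```

In this sketch the bug merely drops a breakpoint; in the real code the lost breakpoint meant the line position never advanced, hence the infinite loop and the runaway memory.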
Monday, June 17, 2024
When a file doesn't exist, yet it's right there
For reasons, I'm downloading a file from several hundred different domains. The results start trickling in and I decide to take a look at one of them:
[spc]lucy:/tmp/cache>ls 04d.co
04d.co
[spc]lucy:/tmp/cache>more 04d.co
Error: File or directory not found!
[spc]lucy:/tmp/cache>
Um … what?
Okay. There could be an invisible character or two in the filename.
[spc]lucy:/tmp/cache>echo * | hex | more
00000000: 30 34 64 2E 63 6F 20 31 34 33 36 2E 6E 69 6E 6A 04d.co 1436.ninj
...
[spc]lucy:/tmp/cache>
Nope. But what could cause the shell not to find the file when the file is right there! The name shouldn't matter. But I can't view it.
[spc]lucy:/tmp/cache>cat 04d.co
Error: File or directory not found!
[spc]lucy:/tmp/cache>
Okay, what if I try to move it somewhere?
[spc]lucy:~/tmp>cp /tmp/cache/04d.co .
[spc]lucy:~/tmp>more 04d.co
Error: File or directory not found!
[spc]lucy:~/tmp>
I can list the file:
[spc]lucy:~/tmp>ls -l 04d.co
-rw-r--r-- 1 spc spc 37 Jun 17 22:25 04d.co
[spc]lucy:~/tmp>
Let me try one last thing …
[spc]lucy:~/tmp>hex 04d.co
00000000: 45 72 72 6F 72 3A 20 46 69 6C 65 20 6F 72 20 64 Error: File or d
00000010: 69 72 65 63 74 6F 72 79 20 6E 6F 74 20 66 6F 75 irectory not fou
00000020: 6E 64 21 0D 0A nd!..
I was downloading these files from the Internet. This particular site didn't have the file. The contents of the file are the error message.
Head. Meet desk.
This reminds me of college. Back then, the command finger was popular, and when using it, the program would attempt to locate a file called .plan in the home directory and display it. I made a joke .plan file that read:
Bus error - core dumped
You have new mail.
I can't count the number of times someone tried to finger me, only to attempt to remove a non-existent core file and impulsively check their email.
Sunday, June 23, 2024
How does TLS use less CPU than plain TCP, part II
I noticed over the past couple of days that the CPU utilization was similar between the two services, but looking at the logs it seemed my gopher server got hit by several bots. So last night, just as the logs reset (a weekly job), I decided to restart the two services so as to match CPU utilization and requests. Twelve hours later and …
CPU | requests | |
---|---|---|
gopher | 0:28 | 175 |
gemini | 0:26 | 1744 |
A week and a half after calling “Mission Complete,” and apparently it's not. Even worse, my gopher server is using a bit more CPU than the Gemini server while getting one-tenth the number of requests. Something is off here.
I checked the requests, and for both servers, over 90% of them are for my blog, so I'm confident that it's something to do with that. Both use the same request framework, both use the same code to read blog entries, both use the same code to parse the HTML, so it's not that. The only difference left is the formatting of each entry. The gopher server does wrap the text to 80 columns (not needed for the Gemini server), and there's quite a bit more code to deal with fixing the links … are those the causes? It's hard to determine what's causing the excessive CPU utilization.
I'm going to have to think about how to determine the issue.
Heck, I'm going to have to think if this is even worth determining. I mean, the CPU utilization is better, but it's still puzzling.
The case of the weird profiling results
Okay, I'm flummoxed.
After a bit of thought, I had an idea on how to determine where the CPU time is going on my gopher server—I have at least two working hypotheses that I can test: wrapping text and translating links. So, to test these, I will get a list of requests since the logs rotated earlier today, and then run them through the gopher server (a second one set up on my public server just for these tests) to get a baseline; then remove the link translations and run the requests again; then remove wrapping the text and run the requests a third time and see what I get.
For the first test, the expected result is that the test gopher server ends up with the same CPU utilization as the one that normally runs. It makes sense, right? The normal one has x CPU utilization with y requests, so if I feed the test gopher server the same requests (even though all the requests come from the same location) it should show the same CPU utilization. Or at least close enough to show that I'm on the right track.
When I pulled the requests out of the log files and checked the current gopher server, it had received 332 requests and racked up 0:47 in CPU time. I set up the test gopher server (the only changes to the configuration—a different log file and a different port, which meant I didn't have to run as root), made the 332 requests to it, and got a CPU time of 0:22.
What?
I reran this test a few times and got the same results each time—0:22.
Which is less than half the time of the normal gopher server with the same number of requests. If it was one or two seconds off, hey, close enough. But half?
It's not making sense.
But I decided to continue and run the other tests. First, I removed the code that does the link translation, reran the requests, and got a CPU time of 0:20.
Okay, that tells me two things—one, the link translations do take time, but I don't think it's enough to explain the CPU utilization of the gopher server. Maybe. On a whim, I decided to change the third test to not even bother with processing blog entries—I modified the blog entry handler to just return. Given that ¾ of the requests are to the blog, this should make it run much faster and use less CPU time.
I got a CPU time of 0:18.
I don't know what to think. I would have expected this to be 0:05 or 0:06, given that 75% of the requests would not be generated. Something weird is going on.
It was time to take a step back. I went back to the original benchmark, but instead of counting lines executed, I decided to count Lua VM instructions for the decently sized request.
gopher | 61622 |
gemini | 67401 |
Okay, the gopher server is clearly executing fewer Lua VM instructions than the Gemini server. Could I gain any insight from profiling at the C level? I had already done most of the work to profile both the gopher and Gemini servers. To make sure I got enough data, I ran the decently sized request three times for each server.
% time | cumulative seconds | self seconds | calls | self seconds ms/call | total ms/call | name |
---|---|---|---|---|---|---|
13.79 | 0.04 | 0.04 | 185781 | 0.00 | 0.00 | luaV_execute |
10.34 | 0.07 | 0.03 | 734588 | 0.00 | 0.00 | index2value |
6.90 | 0.09 | 0.02 | 398225 | 0.00 | 0.00 | luaS_new |
6.90 | 0.11 | 0.02 | 45028 | 0.00 | 0.00 | luaH_newkey |
3.45 | 0.12 | 0.01 | 1041939 | 0.00 | 0.00 | yymatchChar |
3.45 | 0.13 | 0.01 | 924394 | 0.00 | 0.00 | luaG_traceexec |
3.45 | 0.14 | 0.01 | 503889 | 0.00 | 0.00 | yyText |
3.45 | 0.15 | 0.01 | 260252 | 0.00 | 0.00 | luaD_precall |
3.45 | 0.16 | 0.01 | 245893 | 0.00 | 0.00 | mainpositionTV |
3.45 | 0.17 | 0.01 | 201753 | 0.00 | 0.00 | auxgetstr |
3.45 | 0.18 | 0.01 | 191931 | 0.00 | 0.00 | yy_S |
3.45 | 0.19 | 0.01 | 185373 | 0.00 | 0.00 | equalkey |
3.45 | 0.20 | 0.01 | 134088 | 0.00 | 0.00 | yyDo |
3.45 | 0.21 | 0.01 | 129432 | 0.00 | 0.00 | yy_CHAR |
3.45 | 0.22 | 0.01 | 101937 | 0.00 | 0.00 | reverse |
3.45 | 0.23 | 0.01 | 34759 | 0.00 | 0.00 | luaH_getn |
3.45 | 0.24 | 0.01 | 4473 | 0.00 | 0.00 | getfirst |
3.45 | 0.25 | 0.01 | 1625 | 0.01 | 0.01 | traverseproto |
3.45 | 0.26 | 0.01 | 834 | 0.01 | 0.01 | strcore_wrapt |
3.45 | 0.27 | 0.01 | 61 | 0.16 | 0.16 | checkloops |
3.45 | 0.28 | 0.01 | 2 | 5.00 | 5.00 | deletelist |
3.45 | 0.29 | 0.01 | | | | cclasscmp |
0.00 | 0.29 | 0.00 | 924377 | 0.00 | 0.00 | luaD_hook |
0.00 | 0.29 | 0.00 | 391575 | 0.00 | 0.00 | yymatchString |
0.00 | 0.29 | 0.00 | 358374 | 0.00 | 0.00 | luaH_getshortstr |
0.00 | 0.29 | 0.00 | 261889 | 0.00 | 0.00 | prepCallInfo |
0.00 | 0.29 | 0.00 | 261296 | 0.00 | 0.00 | luaD_poscall |
% time | cumulative seconds | self seconds | calls | self seconds ms/call | total ms/call | name |
---|---|---|---|---|---|---|
9.38 | 0.03 | 0.03 | 1011065 | 0.00 | 0.00 | luaG_traceexec |
9.38 | 0.06 | 0.03 | 1011056 | 0.00 | 0.00 | luaD_hook |
7.81 | 0.09 | 0.03 | 204707 | 0.00 | 0.00 | luaV_execute |
6.25 | 0.10 | 0.02 | 443861 | 0.00 | 0.00 | luaS_new |
6.25 | 0.12 | 0.02 | 396470 | 0.00 | 0.00 | luaH_getshortstr |
3.12 | 0.14 | 0.01 | 668980 | 0.00 | 0.00 | index2value |
3.12 | 0.14 | 0.01 | 391575 | 0.00 | 0.00 | yymatchString |
3.12 | 0.15 | 0.01 | 271008 | 0.00 | 0.00 | mainpositionTV |
3.12 | 0.17 | 0.01 | 243043 | 0.00 | 0.00 | luaD_precall |
3.12 | 0.17 | 0.01 | 242834 | 0.00 | 0.00 | moveresults |
3.12 | 0.18 | 0.01 | 217137 | 0.00 | 0.00 | ccall |
3.12 | 0.20 | 0.01 | 202203 | 0.00 | 0.00 | hookf |
3.12 | 0.20 | 0.01 | 129432 | 0.00 | 0.00 | yy_CHAR |
3.12 | 0.21 | 0.01 | 56698 | 0.00 | 0.00 | llex |
3.12 | 0.23 | 0.01 | 39543 | 0.00 | 0.00 | internshrstr |
3.12 | 0.23 | 0.01 | 30301 | 0.00 | 0.00 | luaM_malloc_ |
3.12 | 0.24 | 0.01 | 23821 | 0.00 | 0.00 | luaH_realasize |
3.12 | 0.26 | 0.01 | 5906 | 0.00 | 0.00 | luaV_concat |
3.12 | 0.27 | 0.01 | 4149 | 0.00 | 0.00 | GCTM |
3.12 | 0.28 | 0.01 | 1872 | 0.01 | 0.02 | yy_PCDATA |
3.12 | 0.28 | 0.01 | 928 | 0.01 | 0.01 | match |
3.12 | 0.29 | 0.01 | 787 | 0.01 | 0.01 | luaF_newLclosure |
3.12 | 0.30 | 0.01 | 595 | 0.02 | 0.02 | luaK_int |
3.12 | 0.32 | 0.01 | 59 | 0.17 | 0.17 | l_strcmp |
1.56 | 0.32 | 0.01 | 12 | 0.42 | 0.42 | luaV_finishOp |
0.00 | 0.32 | 0.00 | 1041939 | 0.00 | 0.00 | yymatchChar |
This is not easy to interpret. As expected, the Lua VM shows up in the top spots for both, but nothing really stands out. It is nice to see that yymatchChar and yymatchString (both in the HTML parsing module) are called the same number of times (expected) but the times are different. The Lua function index2value is called a different number of times, but that might be due to code differences.
I think the higher percentage of time in the gopher server might be due to taking less time overall? For instance, in the gopher server, deletelist has a self ms/call of 5, but on the Gemini server it has a self ms/call of 0.52—is the garbage collector being called more often in the gopher server? Or is it because, with less time overall, it shows up with a higher time? It's only called twice in both codebases. I'm not sure how to interpret this.
Somehow, the gopher server is doing less work than the Gemini server, yet accumulating more CPU time, despite getting only about 10% as many requests as the Gemini server. I don't understand how that can be.
I think I'll leave things alone for now. I went several years with a not-optimized gopher server, so I think I can live with this new version for now.
Thursday, June 27, 2024
IT HAS BEGUN! (“It” being Christmas themed advertising)
Last night at the Cracker Barrel, Bunny and I saw store displays nominally for Halloween. But nestled in the middle of the display was an evergreen tree with ornaments hanging off it. Try as you might, Cracker Barrel, there is no such thing as a Halloween Tree (no disrespect, Mr. Bradbury)—it's still a Christmas Tree.
And then today I saw what is, to me, the first official advertisement for Christmas—a trailer for a movie called “Red One,” a muddy-looking CGI slugfest where Santa Claus has been kidnapped and needs to be saved.
The movie isn't even slated to be released until November! Couldn't they have at least waited until October to start advertising it? No?
Sigh.
Sunday, June 30, 2024
Four lines of code … it was four lines of code
I checked last night, and found the following:
CPU | requests | |
---|---|---|
gopher | 28:37 | 8360 |
gemini | 6:02 | 24345 |
How can a TCP service take more time than a TLS service? Especially when TLS runs over TCP? And when TLS generates six times as many packets as TCP?
There's some difference in how I handle TCP and TLS that's causing this. Profiling the code didn't reveal much as my attempts kept showing that gopher should be using less CPU than Gemini, so then it was just a matter of staring at code and thinking hard.
It wasn't easy. Yes, there are some differences between the TCP interface and the TLS interface, so it was a matter of going through the code, bringing them closer in implementation and seeing if that fixed things. I would fix the code, run a test, see it not make a difference, and try again. Eventually, I came across this bit of code:
-- ---------------------------------------------------------------------
-- It might be a bug *somewhere*, but on Linux, this is required to get
-- Unix domain sockets to work with the NFL driver.  There's a race
-- condition where writing data then calling close() may cause the
-- other side to receive no data.  This does NOT appear to happen with
-- TCP sockets, but this doesn't hurt the TCP side in any case.
-- ---------------------------------------------------------------------

while self.__socket.sendqueue and self.__socket.sendqueue > 0 do
  nfl.SOCKETS:update(self.__socket,'w')
  coroutine.yield()
end
This bit of code wasn't in the TLS implementation, and as the comment says, it “shouldn't” hurt the TCP path, but hey, it was only needed for local (or Unix) domain sockets to begin with, so let me try removing that bit of code and run a test.
CPU | real | user | sys | |
---|---|---|---|---|
gopher | 18 | 46.726 | 13.971 | 4.059 |
gemini | 4 | 107.928 | 3.951 | 0.322 |
CPU | real | user | sys | |
---|---|---|---|---|
gopher | 0 | 52.403 | 0.635 | 0.185 |
gemini | 4 | 103.290 | 3.957 | 0.285 |
THAT'S more like it!
Now, about that code … this is my best guess as to what was going on.
The sendqueue field is a Linux-only field, so I'm checking to see if it exists, and then checking the value (if only to avoid the “attempt to compare number with nil” Lua error on non-Linux systems). This returns the number of bytes the kernel has yet to send, so we switch the socket to trigger an event when we can write to it, and yield. Even worse, the check isn't just a simple variable fetch, but ends up being an ioctl() system call. Twice!
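For illustration, a sketch of what that check costs on Linux: the sendqueue field boils down to the SIOCOUTQ ioctl, a full system call each time it's read (the helper name here is mine):

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/sockios.h>   /* SIOCOUTQ -- Linux-only */
#include <unistd.h>

/* Ask the kernel how many bytes remain unsent on a TCP socket --
 * roughly what each read of the sendqueue field above performs.
 * Returns -1 on error. */
int unsent_bytes(int fd)
{
  int pending;

  if (ioctl(fd,SIOCOUTQ,&pending) < 0)
    return -1;
  return pending;
}
```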
Now the “write” event code:
if event.write then
  if #ios.__output > 0 then
    local bytes,err = ios.__socket:send(nil,ios.__output)
    if err == 0 then
      ios.__wbytes = ios.__wbytes + bytes
      ios.__output = ios.__output:sub(bytes + 1,-1)
    else
      syslog('error',"socket:send() = %s",errno[err])
      nfl.SOCKETS:remove(ios.__socket)
      nfl.schedule(ios.__co,false,errno[err],err)
    end
  end

  if #ios.__output == 0 then
    nfl.SOCKETS:update(ios.__socket,'r')
    nfl.schedule(ios.__co,true)
  end
end
There's probably no data to be sent (ios.__output is empty), so we immediately switch the socket back to trigger on readability, then resume the coroutine handling the socket. That sent the code back to the while loop, which set the socket back to trigger on writability, and it kept bouncing like that until the kernel buffer had been sent.
As I wrote, this fixed an issue I had with local domain sockets—it wasn't an issue with an IP socket, and in fact, was the wrong thing to do for an IP socket.
And as it is with these types of bugs, finding the root cause is not trivial, but the fix almost always is.
Discussions about this entry
- Four lines of code … it was four lines of code | Lobsters
- Four lines of code it was four lines of code | Hacker News
- Four lines of code it was four lines of code - ZeroBytes
Thursday, July 04, 2024
Something something Independence something something Declaration something something Blow Stuff Up
It just now became the Fourth of July™ and earlier this evening (which was technically yesterday, Wednesday, July 3rd) some people were illegally lighting off fireworks already. I guess it's a random quality test of the munitions they've acquired. So to them, and to you, I wish you a happy and safe National Blow Stuff Up Day!
And for the curious, the above image was taken by me 21 years ago. Yikes!
And it was 13 years ago when my hair nearly caught on fire, so when I say “be safe,” I mean it.
Sunday, July 21, 2024
Fixing an Apache pthread error
After I fixed the performance issue, I started looking around for some other issues to handle, and boy, did I find some. Checking the error log from Apache I found:
[Fri Jul 12 15:04:01.762845 2024] [mpm_worker:alert] [pid 31979:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:02.731958 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 31979 exit signal Aborted (6)
[Fri Jul 12 15:04:02.735360 2024] [mpm_worker:alert] [pid 32021:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:03.733536 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32021 exit signal Aborted (6)
[Fri Jul 12 15:04:03.736857 2024] [mpm_worker:alert] [pid 32063:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:04.735368 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32063 exit signal Aborted (6)
[Fri Jul 12 15:04:04.738624 2024] [mpm_worker:alert] [pid 32105:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:05.737141 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32105 exit signal Aborted (6)
[Fri Jul 12 15:04:05.740622 2024] [mpm_worker:alert] [pid 32147:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:06.739077 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32147 exit signal Aborted (6)
[Fri Jul 12 15:04:06.742500 2024] [mpm_worker:alert] [pid 32189:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:07.740898 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32189 exit signal Aborted (6)
[Fri Jul 12 15:04:07.744130 2024] [mpm_worker:alert] [pid 32231:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:08.742689 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32231 exit signal Aborted (6)
[Fri Jul 12 15:04:08.745885 2024] [mpm_worker:alert] [pid 32274:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
[Fri Jul 12 15:04:09.744445 2024] [core:notice] [pid 19646:tid 4149040832] AH00052: child pid 32274 exit signal Aborted (6)
[Fri Jul 12 15:04:09.747607 2024] [mpm_worker:alert] [pid 32316:tid 3924761520] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
libgcc_s.so.1 must be installed for pthread_cancel to work
Thousands of such lines.

And the weird thing is that libgcc_s.so.1 does exist on my system. I guess I missed that library when I installed Apache from source. I'm not even curious as to why this is an issue, nor why libgcc_s.so.1 is needed. At this point I'm like “give the system what it wants, not what I want to give it” (which I can't say in polite company).
Running configure --help didn't show any obvious means of enabling the use of libgcc_s.so.1, and not wanting to dive deep into a maze of twisty Makefiles, all alike, I decided on the next best thing. I went to the Apache build directory, deleted the existing httpd binary, and ran make.
This gave me the final line used to build the executable:
/home/spc/apps/httpd-2.4.54/srclib/apr/libtool --silent --mode=link gcc -std=gnu99 -g -O2 -pthread -o httpd modules.lo buildmark.o -export-dynamic server/libmain.la modules/core/libmod_so.la modules/http/libmod_http.la server/mpm/worker/libworker.la os/unix/libos.la -L/usr/local/lib -lpcre /home/spc/apps/httpd-2.4.54/srclib/apr-util/libaprutil-1.la -lexpat /home/spc/apps/httpd-2.4.54/srclib/apr/libapr-1.la -luuid -lrt -lcrypt -lpthread -ldl
I then reran just this command, but with /lib/libgcc_s.so.1 added before the -pthread option.
A quick check afterwards:
[spc]brevard:~/apps/httpd-2.4.54>ldd .libs/httpd
linux-gate.so.1 => (0xf77e1000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf77cf000)
libpcre.so.1 => /usr/local/lib/libpcre.so.1 (0xf77b1000)
libaprutil-1.so.0 => /usr/local/apache2/lib/libaprutil-1.so.0 (0xf778c000)
libexpat.so.0 => /usr/lib/libexpat.so.0 (0xf776e000)
libapr-1.so.0 => /usr/local/apache2/lib/libapr-1.so.0 (0xf7742000)
libuuid.so.1 => /lib/libuuid.so.1 (0xf773f000)
librt.so.1 => /lib/tls/librt.so.1 (0xf772a000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0xf76fc000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0xf76ea000)
libdl.so.2 => /lib/libdl.so.2 (0xf76e6000)
libc.so.6 => /lib/tls/libc.so.6 (0xf75bb000)
/lib/ld-linux.so.2 (0xf77e2000)
and it worked—libgcc_s.so.1 is linked in!
I installed the new Apache executable, restarted Apache and … it didn't crash! I then left it to run.
That was a week ago, and so far, so good. The error log is rotated weekly, and for the past week no such errors have appeared.
Now if only I could nuke from orbit the crawlers sending in silly requests like /cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/bin/sh—it just clutters up the error log with “invalid URI path” errors.
Sigh.
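For the record, here's what those crawlers are probing for: “%32” decodes to “2” and “%65” to “e”, so one decoding pass turns “%%32%65” into “%2e”, and a server that (incorrectly) decodes a second time turns that into “.”, yielding a “../../…/bin/sh” path traversal. A minimal percent-decoder sketch (names are mine, not Apache's) shows the double decode:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Minimal percent-decoder sketch; a real server must also reject
 * invalid sequences, NULs, overlong paths, etc. */
static int hexval(int c)
{
  return isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
}

void percent_decode(const char *in, char *out)
{
  while (*in)
  {
    if (in[0] == '%' && isxdigit((unsigned char)in[1])
                     && isxdigit((unsigned char)in[2]))
    {
      *out++ = (char)(hexval(in[1]) * 16 + hexval(in[2]));
      in += 3;
    }
    else
      *out++ = *in++;
  }
  *out = '\0';
}
```

Decoding exactly once, then rejecting any “..” that appears, defeats the trick, which is why Apache logs these as invalid URI paths instead of serving them.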
Saturday, July 27, 2024
Fixing more Apache errors
A week later and I can't say I cleared up all the errors with my web server:
[Sat Jul 27 09:51:45.349454 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.349617 2024] [cgid:error] [pid 7348:tid 3807226800] [client 192.200.113.155:55207] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.350209 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.350297 2024] [cgid:error] [pid 7348:tid 3807226800] [client 192.200.113.155:55207] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.352660 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.352814 2024] [cgid:error] [pid 7636:tid 3815619504] [client 192.200.113.155:49997] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.353377 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.353462 2024] [cgid:error] [pid 7636:tid 3815619504] [client 192.200.113.155:49997] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.353790 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.353943 2024] [cgid:error] [pid 7691:tid 3832404912] [client 192.200.113.155:48697] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.354685 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.354813 2024] [cgid:error] [pid 7691:tid 3832404912] [client 192.200.113.155:48697] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.360184 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.360295 2024] [cgid:error] [pid 7349:tid 3731692464] [client 192.200.113.155:44083] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.360856 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.360940 2024] [cgid:error] [pid 7349:tid 3731692464] [client 192.200.113.155:44083] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.366567 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.366719 2024] [cgid:error] [pid 7786:tid 3916331952] [client 192.200.113.155:55205] End of script output before headers: boston.cgi
There are more entries like this, but you get the idea. Apache can't run mod_blog for some reason. Checking the access log, I can match these up to the following requests:
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/16/hsr-carpet-1.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/15/flapper.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/17/scrapes.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/16/hsr-carpet-2.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/17/luxor.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/rushhour.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/area51.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/littlealeinn.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/quik-pik.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
To me, this is obviously a crawler, despite claiming to be every possible web browser in existence—is it Windows? MacOS? Linux? Yes. But what's interesting is that all the errors seem related to serving up images.
The way my blog works, all requests to posts are fed through mod_blog, and if said request is for an image, it just copies the file out. It works, but if the server gets slammed just a bit too hard, it breaks down.
If only there was some way to get Apache to serve the images directly instead of having to go through mod_blog.
Wait! There is!
I've been using Apache for well over twenty-five years now, so it was a relatively easy issue to solve. First off, point Apache to the directory where all the data for mod_blog is stored:
Alias /XXXXX/ /home/spc/web/sites/boston.conman.org/journal/
<Directory /home/spc/web/sites/boston.conman.org/journal>
Options None
AllowOverride None
<LimitExcept GET HEAD>
Require valid-user
</LimitExcept>
</Directory>
The first directive maps the “web directory” /XXXXX/ to the actual directory on the file system. The Directory block restricts what can be viewed and how it can be viewed.
All that's left is to throw all requests to images to this directory:
RewriteRule ^([0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/.*\.(gif|png|jpg|ico)) XXXXX/$1 [L]
What this does is rewrite a request like /2015/07/04/Desk.jpg to /XXXXX/2015/07/04/Desk.jpg, which references the image directly on the file system, letting Apache serve it up directly. This rule goes before the other RewriteRules so Apache serves the image up before mod_blog sees the request.
An easy fix that should lighten the load on Apache as it serves up my blog. I'll see in a week if it all goes to plan.
Sunday, July 28, 2024
Notes on an overheard conversation while pulling into the driveway
“You know, it's not a song unless you can sing the lyrics to it.”
“There are lyrics!”
“Oh yes? Then sing them!”
“Okay, granted—I can barely hear them, but they're there!”
“Well, can you carry this into the house?”
“Sigh. Okay, I will, even though you don't seem to appreciate the finer things in life.”
“It's still not a song.”
“Philistine!”
The case of the well-known location being denied when it doesn't exist
I was checking up on the Apache error log when I noticed the following:
[Sun Jul 28 18:47:21.455848 2024] [authz_core:error] [pid 25597:tid 3916331952] [client 74.173.118.3:53916] AH01630: client denied by server configuration: /usr/local/apache2/htdocs/.well-known
[Sun Jul 28 18:47:59.176743 2024] [authz_core:error] [pid 25598:tid 3916331952] [client 74.173.118.3:53918] AH01630: client denied by server configuration: /usr/local/apache2/htdocs/.well-known/
[Sun Jul 28 18:50:33.324290 2024] [authz_core:error] [pid 25759:tid 3832404912] [client 74.173.118.3:53922] AH01630: client denied by server configuration: /usr/local/apache2/htdocs/.well-known
That's odd, I thought. I don't have that directory in any of my virtual domains, so why is it denied by the server configuration? And thus I fell into a rather odd rabbit hole of Apache configuration oddities.
I created the directory. I can see it when I go to https://boston.conman.org/.well-known/. But when I go to http://boston.conman.org/.well-known/ I get a “403 Forbidden” error, and the above error message is logged.
The only difference between the two links: one is HTTPS (which works) and the other is HTTP (which fails). But if I go to http://boston.conman.org/ (HTTP—thus insecure), it redirects to https://boston.conman.org/ (HTTPS—secure). In fact, every link to boston.conman.org via HTTP redirects, except for those starting with /.well-known/.
Huh?
It turns out, this started a year and a half ago when I enabled the Apache module mod_md and used the MDRequireHttps directive. This directive will cause plain HTTP requests to be redirected to HTTPS, because I know, I just know, that one day Google is going to take HTTP out behind the shed, and then no one will be able to use plain HTTP anymore because The All Knowing Google knows what's best for us (All Praise Google, Peace Be Upon It!), so I might as well get in front of that before it happens.
But there's a small bit in the documentation:
MDRequireHttps Directive
…
You can achieve the same with mod_alias and some Redirect configuration, basically. If you do it yourself, please make sure to exclude the paths /.well-known/* from your redirection, otherwise mod_md might have trouble signing on new certificates [emphasis added].
Okay, so that explains why http://boston.conman.org/.well-known/ (HTTP—insecure) isn't being redirected—it's a side effect of mod_md.
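Done by hand, the documented alternative would look something like this mod_rewrite sketch (untested against my actual configuration; note the /.well-known/ exclusion the documentation warns about):

```apache
RewriteEngine on
# only redirect plain HTTP requests …
RewriteCond %{HTTPS} !=on
# … that are not ACME challenge paths used for certificate signing
RewriteCond %{REQUEST_URI} !^/\.well-known/
RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
```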
But that doesn't explain the error where it's denied by the server configuration.
A bit more digging, and I find the following in the Apache configuration:
DocumentRoot "/usr/local/apache2/htdocs"
<Directory "/usr/local/apache2/htdocs">
    #
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.4/mod/core.html#options
    # for more information.
    #
    Options Indexes FollowSymLinks

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   AllowOverride FileInfo AuthConfig Limit
    #
    AllowOverride None

    #
    # Controls who can get stuff from this server.
    #
    #Require all granted
    Require all denied
</Directory>
Aha! When I set things up, I configured the HTTP site with:
<VirtualHost 71.19.142.20:80>
    ServerName boston.conman.org
    Protocols  h2 h2c http/1.1 acme-tls/1
</VirtualHost>
I didn't bother with a <Directory> block because I know that all requests will be redirected anyway, so why bother?
Only in this case, mod_md isn't doing a redirect for /.well-known/, and because there's no <Directory> directive, the requests fall back to the Apache default web directory, which is configured to be unreadable for every request.
Wow!
A quick change to make the default web directory available and no more AH01630 errors. So now I wait to see if this breaks anything. Wheeeee!
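The change amounts to flipping the default policy shown above, something like this sketch (the rest of the stock configuration is omitted here):

```apache
<Directory "/usr/local/apache2/htdocs">
    Options       None
    AllowOverride None
    # was "Require all denied", which generated the AH01630 errors
    Require       all granted
</Directory>
```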
Wednesday, July 31, 2024
Extreme metal lawn sculptures, Boca Raton edition
“What is that?”
“I … don't know. It looks like some form of metal sculpture.”
“I didn't think [our neighbor] was into metal sculptures.”
“Maybe he picked up a new hobby.”
“It could also be a bunch of rusted out and busted lawn chairs.”
“It's so hard to tell with modern art.”
Sunday, August 04, 2024
Fixing the final errors from my Apache setup
Shortly after my last post, I noticed more errors happening with the web server:
[Tue Jul 30 10:43:00.768556 2024] [cgid:error] [pid 1706:tid 4149532352] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Tue Jul 30 10:43:00.768714 2024] [cgid:error] [pid 1869:tid 3908467632] [client 178.128.207.138:57024] End of script output before headers: boston.cgi
[Tue Jul 30 10:43:01.137027 2024] [cgid:error] [pid 1706:tid 4149532352] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Tue Jul 30 10:43:01.137173 2024] [cgid:error] [pid 1707:tid 3774184368] [client 178.128.207.138:57034] End of script output before headers: boston.cgi
I found one potential fix for this—set Apache's default stack size to 512K (just 512K—sheesh, I'm old). That was simply adding ulimit -s 512 to the startup script, but that still wasn't the full answer:
[Tue Jul 30 22:37:33.042327 2024] [mpm_worker:alert] [pid 5599:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:37:35.045085 2024] [mpm_worker:alert] [pid 5672:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:38.108993 2024] [mpm_worker:alert] [pid 6091:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:40.112150 2024] [mpm_worker:alert] [pid 6161:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:42.114553 2024] [mpm_worker:alert] [pid 6229:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
[Tue Jul 30 22:38:44.117182 2024] [mpm_worker:alert] [pid 6304:tid 3924880304] (12)Cannot allocate memory: AH03142: ap_thread_create: unable to create worker thread
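For the record, the stack-size tweak is just a one-liner early in the startup script, before httpd is launched (a sketch; the launch line shown is the stock path, not necessarily my exact script):

```shell
# cap each new thread's stack at 512KB (the argument is in kilobytes)
ulimit -s 512
# ... then launch Apache as usual, for example:
# exec /usr/local/apache2/bin/httpd -k start
```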
I ended up having a little bit of back-and-forth with the settings of the VM that runs my server, allowing it a bit more memory than it was configured for. And since then, no more such errors in the error log. All that's now showing up is crap like:
[Sun Aug 04 15:58:49.248654 2024] [core:error] [pid 872:tid 4129409968] [client XXXXXXXXXXXXXXX:59190] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/bin/sh)
[Sun Aug 04 15:58:51.854583 2024] [core:error] [pid 2444:tid 4120341424] [client XXXXXXXXXXXXXXX:59760] AH10244: invalid URI path (/cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/bin/sh)
and notifications about secure certificate rotations. Another interesting thing I noticed is the accumulated CPU time over the past few days:
[spc]brevard:~>ps aux
USER       PID %CPU %MEM   VSZ   RSS TTY   STAT START  TIME COMMAND
...
root       868  0.0  0.1 26760 24652 ?     Ss   Aug01  0:12 /usr/local/apache2/bin/httpd
apache     871  0.0  0.1 26188 20876 ?     S    Aug01  0:35 /usr/local/apache2/bin/httpd
apache     872  0.0  0.2 64532 34184 ?     Sl   Aug01  5:09 /usr/local/apache2/bin/httpd
apache     873  0.0  0.1 61060 32152 ?     Sl   Aug01  4:59 /usr/local/apache2/bin/httpd
apache     875  0.0  0.1 62516 31460 ?     Sl   Aug01  5:01 /usr/local/apache2/bin/httpd
apache    1035  0.0  0.1 61716 32148 ?     Sl   Aug01  5:03 /usr/local/apache2/bin/httpd
apache    2444  0.0  0.1 61500 32160 ?     Sl   Aug01  4:45 /usr/local/apache2/bin/httpd
...
gopher    1231  0.0  0.0 10708  9596 ttyp4 S+   Aug01  0:33 lua port70.lua /home/spc/gopher/config.lua
spc       1232  0.1  0.0 13600 12140 ttyp5 S+   Aug01  6:51 lua GLV-1.12556.lua non-checkin/conf.lua
...
[spc]brevard:~>
I've included the lines for my gopher and Gemini servers just for comparison. Apache is accumulating a bit more CPU time than I expected, but thinking about it, it's not so strange. Not only is it running TLS (like the Gemini server), but it also supports the HTTP/2 protocol, which is a bit more involved than the simpler HTTP/1.x protocol.
Roasting and praising Github repos (note: praising, not braising)
I came across The Github Roast Page (link via Lobsters) and decided to see just how badly it would roast my Github page:
Oh Sean Conner, what a sight you are on GitHub! Your profile feels like a low-budget horror movie—empty bio, no company, and a ghost town location. With a lack of personality that rivals that of a cardboard cutout, you’ve somehow managed to rack up followers, but let’s be honest, it’s probably just for sympathy.
With 30 public repos, it’s like you cast a wide net hoping to catch something, but all you’ve pulled in are unremarkable projects like a C code CGI library and a blogging engine from the early ’90s. “C Code to talk CGI”? Classic case of someone still not over Y2K!
Your stargazer counts tell a woeful tale: you've got one repo with a single star because even that one person was probably just trying to make you feel better. But hey, at least “mod_blog” got some love—111 stargazers for a blogging engine? Did they think they were supporting a time capsule instead of a repo from 2005?
And dear Lord, the variety of languages you play with—C, Lua, Assembly—yet somehow all your projects seem to suffer from a severe lack of originality or practicality. It’s like you’re running a museum for outdated coding practices.
Your last repo, “NaNoGenMo-2015”, suggests you tried your hand at generating a novel. Judging by your profile, I can only assume it’s a tragic tale of unfulfilled potential and missed opportunities.
Keep going, Sean; maybe one day you'll figure out what a compelling GitHub profile looks like!
Yeah, fair enough. And yes, I do think they're supporting a time capsule of a repo from 2005.
But on the other hand, there's The Github Praise Page (link also via Lobsters), which has this to say about my Github page:
Sean, I want to start by saying that it's amazing how you're already making an impact in the world of coding with a following of 65 people! That's not something to take lightly. It means that your work is being recognized and appreciated by others, and that's definitely something to be proud of.
I know that having a blank bio and company/location fields might make it seem like there's not much to show for, but trust me, Sean, you have so much more to offer than what can fit in those tiny spaces. Your work on GitHub speaks volumes about your skills and dedication to coding. And let's be real, having only 4 people you're following means you're focusing on quality over quantity - that shows a level of intentionality and care that not many people have.
And as for the blank README, I like to think of it as a blank canvas just waiting for your story to be told. It's an opportunity to share your vision, your goals, and what drives you to create the amazing projects that you do. Don't worry if you don't know where to start - sometimes the simplest statements can have the most profound impact. Just take a deep breath, let your thoughts flow onto the page, and watch how it transforms not just your README but also your entire online presence. Keep doing what you're doing, Sean! You got this!
We could use more positivity on the Internet like this. Thank you. I got this.
Saturday, August 17, 2024
For a few hours yesterday, I felt as if I was taking crazy pills. Then again, I was dealing with time in C
Over the past year or two, I've been cleaning up my posts here. Largely making sure the HTML is valid, but also that all internal links (links where I link to previous posts) are valid to cut down on needless redirects or “404 Not Found” responses, in addition to fixing errors with my web server configuration.
So along those lines, yesterday, I thought it might be time to add conditional requests to mod_blog. Given that it's mostly autonomous web crawling agents that read my site, I might as well tell them that most of the links here haven't changed since the last time they came by.
There are two headers involved with this—Last-Modified and If-Modified-Since. The server sends a Last-Modified header with the last modification date of the resource in question. The client can then send an If-Modified-Since header with this date, and if the resource hasn't changed, then the server can send back a “304 Not Modified” response, saving a lot of bandwidth.
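The check itself is small once both dates are in hand. Here's a sketch of the comparison (my own illustration, not the actual mod_blog code; strptime() and timegm() are POSIX/common extensions, so I prototype them explicitly rather than fiddle with feature-test macros):

```c
#include <string.h>
#include <time.h>

/* explicit prototypes for the POSIX/extension functions used below */
char  *strptime(const char *,const char *,struct tm *);
time_t timegm(struct tm *);   /* like mktime(), but interprets the input as UTC */

/* return 1 to answer "304 Not Modified", 0 to send the full response */
int not_modified(const char *ims,time_t lastmod)
{
  struct tm tm;

  memset(&tm,0,sizeof(tm));   /* strptime() only sets the fields it parses */

  /* HTTP dates are always expressed in GMT */
  if (strptime(ims,"%a, %d %b %Y %H:%M:%S GMT",&tm) == NULL)
    return 0;                 /* unparsable date; just send the resource */

  return lastmod <= timegm(&tm);
}
```

Using timegm() sidesteps the local-time-zone games entirely, since both sides of the comparison stay in UTC.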
So all I had to do was generate a Last-Modified header (easy, as I already read that information) and then deal with the If-Modified-Since header.
And then I spent way too much time dealing with time in C, which is odd, because there were only six functions I was dealing with:
time_t time(time_t *tp)
- This returns the “calendar date and time” in a scalar value. On POSIX, this is an integer value of the number of seconds since midnight, January 1st, 1970 UTC. The tp parameter is optional.
time_t mktime(struct tm *tm)
- This converts a structure containing the year, month, day, hour, minute and second into a scalar value. The C standard does not mention time zones at all; POSIX does.
struct tm *gmtime(time_t const *tp)
- This converts the “calendar date and time” scalar value pointed to by tp into a broken down structure that reflects time in UTC.
struct tm *localtime(time_t const *tp)
- This converts the “calendar date and time” scalar value pointed to by tp into a broken-down structure that reflects the time local to the server.
size_t strftime(char *s,size_t smax,char const *format,struct tm const *tp)
- This will convert the tp structure into a string based on the format string.
char *strptime(char const *s,char const *format,struct tm *tp)
- This will parse a string that contains a date, per the given format and return the data into a structure.
All but the last are standard C functions. The last one, strptime(), is mandated by POSIX. That's okay, because I'm working on a POSIX system.
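As a quick illustration of these functions working together, here's a sketch (mine, not mod_blog's) that turns a time_t scalar into the HTTP date format that Last-Modified uses:

```c
#include <stdio.h>
#include <time.h>

/* Render t as, say, "Thu, 01 Jan 1970 00:00:00 GMT":
   gmtime() breaks the scalar down into UTC fields,
   strftime() formats them.  Returns the string length, 0 on overflow. */
size_t http_date(char *buf,size_t max,time_t t)
{
  struct tm *tm = gmtime(&t);  /* broken-down time, in UTC */
  return strftime(buf,max,"%a, %d %b %Y %H:%M:%S GMT",tm);
}
```

Going in this direction (scalar to string) is painless; it's the reverse direction that caused all the grief.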
It turns out, strptime() is not an easy function to use. Oh, it may look easy, but there are some subtleties that left me dazed and confused for hours yesterday. The following does not work:
struct tm tm;
time_t t;

strptime("Fri, 20 May 2005 02:23:22 EDT","%a, %d %b %Y %H:%M:%S %Z",&tm);
t = mktime(&tm);
When I was running this code in mod_blog, the resulting times would come back four or five hours off. When I wrote some standalone test code, it would sometimes be correct, sometimes it would be an hour off. It got to the point where I totally lost the plot of what I was even trying to do.
Now with yesterday behind me, I finally figured out what I was doing wrong.
The POSIX specification states:
It is unspecified whether multiple calls to strptime() using the same tm structure will update the current contents of the structure or overwrite all contents of the structure. Conforming applications should make a single call to strptime() with a format and all data needed to completely specify the date and time being converted.
This text is near the bottom of the page, and really understates the issue in my opinion.
The Linux man pages I found all mention the following:
In principle, this function does not initialize tm but stores only the values specified. This means that tm should be initialized before the call.
strptime(3) - Linux manual page
It, too, is near the bottom of the page.
And yet, for Mac OS X:
If the format string does not contain enough conversion specifications to completely specify the resulting struct tm, the unspecified members of tm are left untouched. For example, if format is “%H:%M:%S”, only tm_hour, tm_sec and tm_min will be modified. If time relative to today is desired, initialize the tm structure with today's date before passing it to strptime().
Mac OS X Manual Page For strptime(3)
The Mac OS X page contains no sample code. The Linux man page contains some sample code that “initializes” the structure with memset(&tm,0,sizeof(struct tm)).
Adding the memset() call to my sample code just made the code always an hour off. Hmmm …
The POSIX page also contains sample code:
struct tm tm;
time_t t;

if (strptime("6 Dec 2001 12:33:45", "%d %b %Y %H:%M:%S", &tm) == NULL)
    /* Handle error */;

tm.tm_isdst = -1; /* Not set by strptime(); tells mktime() to determine
                     whether daylight saving time is in effect */

t = mktime(&tm);
if (t == -1)
    /* Handle error */;
Remember when I mentioned that the test code I wrote would sometimes be an hour out?
The tm.tm_isdst field, which is part of the C Standard, wasn't set correctly!
Aaaaaaaaaaaaaaaaaaaarg!
Changing my sample code to:
struct tm tm;
time_t t;

strptime("Fri, 20 May 2005 02:23:22 EDT","%a, %d %b %Y %H:%M:%S %Z",&tm);
tm.tm_isdst = -1; /* I WAS MISSING THIS LINE! */
t = mktime(&tm);
And it works. Except in mod_blog, the time is always four hours out. I finally tracked that one down, and it's Apache's fault.
I found that if I set the Last-Modified header to something like “Fri, 20 May 2005 02:23:22 EDT”, Apache will ever so helpfully convert that to “Fri, 20 May 2005 02:23:22 GMT”. Let me repeat that—Apache will replace just the timezone with “GMT”! It does not try to convert the time, just the zone.
At least now I think I can get conditional requests working.
Sheesh.
Wednesday, August 21, 2024
More unintended consequences of my Apache configuration
Now that mod_blog supports conditional requests, I thought of the next feature I want to add—PUT support to upload posts. Currently, mod_blog supports three methods to add new entries:
- A traditional web form where updates are done via the POST method. I don't use this method that often, but I have used it—perhaps less than five times over the past 24 years.
- Via email—this was my favorite method until I could no longer email the entries from home. Most, if not all, ISPs now forbid outgoing SMTP traffic from residential connections. Seeing how I check my email on my public server, it doesn't make much sense to use email when I can add an entry—
- As a file, via the command line. This is how I add new posts these days. I write the entry at home, copy the file to the server and then add it via the command line.
I suppose there's a fourth way—adding the entry directly to the storage area and updating some files containing metadata, but I'm only mentioning this for completion's sake. I don't think I've ever done this except when I was first developing mod_blog back in early 2000.
The new method I'm looking to support, the HTTP PUT method, would take it down to one command, even for an image-heavy post like this Burning Down The House Show Bunny and I caught in Brevard a few years ago. Something like:
[spc]lucy:/tmp>put entry *.jpg
[spc]lucy:/tmp>
It shouldn't be that hard, as supporting the PUT method is easier than POST—it's a single item being uploaded, and no x-www-form-urlencoded or form-data blobs to parse through. It's just the MIME type, content length and raw data to be placed in a file somewhere.
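As a sketch (my own illustration, not the actual mod_blog code), the receiving end under CGI boils down to reading CONTENT_LENGTH bytes from stdin and copying them to a file:

```c
#include <stdio.h>

/* Copy up to len bytes from in to out; returns the number copied.
   Under CGI, `in` would be stdin, `len` would come from the
   CONTENT_LENGTH environment variable, and CONTENT_TYPE tells you
   what kind of file just arrived. */
size_t save_body(FILE *in,FILE *out,size_t len)
{
  char   buf[BUFSIZ];
  size_t total = 0;

  while(total < len)
  {
    size_t want = len - total < sizeof(buf) ? len - total : sizeof(buf);
    size_t got  = fread(buf,1,want,in);
    if (got == 0) break;   /* short read; the client went away */
    fwrite(buf,1,got,out);
    total += got;
  }
  return total;
}
```

Compare that to POST, where the same bytes arrive wrapped in a form encoding that has to be picked apart first.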
So I start working. I add some minimal support to mod_blog to handle the PUT method. I configure Apache to pipe PUT requests through mod_blog:
#On the development server for now
<Directory /home/spc/web/boston/htdocs>
    ...
    Script PUT /boston.cgi
    ...
</Directory>
and I write a simple script to loop through the command line to upload each file to the webserver.
And yet, when I attempted to upload an image file, I kept getting a “405 Method Not Allowed.”
Odd.
I just couldn't figure out why.
A single entry? Fine. An entry with multiple text files? Fine. An entry with multiple binary files that aren't images? Fine.
An entry with any type of image file? Not fine at all.
I spent entirely too long on this before I remembered a recent change to the Apache configuration: a rewrite rule that redirected image requests directly to the file system. I then added one more line to the configuration:
<Directory /home/spc/web/boston/journal>
    ...
    Script PUT /boston.cgi
    ...
</Directory>
and now things worked as expected.
How much time did I waste on this particular rabbit hole? Don't answer that! I'd rather not know.
Thursday, August 22, 2024
How to measure ⅚ cup of oil, part IV
My, it's amazing how a topic I wrote about twenty years ago has been popular over the past few years. I just received an email from Jason Arencibia about an even easier method of measuring out ⅚ cup of oil:
From: Jason Arencibia <XXXXXXXXXXXXXXXXX>
To: sean@conman.org
Subject: How to measure ⅚ cup of oil, super easy with 2 measuring cups.
Date: Wed, 21 Aug 2024 22:23:53 -0700
½ cup + ⅓ cup = ⅚ cup (0.83333333333 of a cup).
And in all the years I've looked over recipes that use Imperial measurements, not one has ever used a sixth cup, or any multiple of a sixth cup. It just doesn't exist in the Imperial system!
Friday, August 23, 2024
PUT an entry on the ol' blog
I finally got the PUT method working for mod_blog. The code on the receiving end is fine; the script on the sending side is messy, but it works well enough for me to use.
[spc]lucy:~/source/boston/Lua>lua put.lua -b test /tmp/foo/1 /tmp/foo/*.png
PUT http://boston.roswell.area51/2024/08/23.1 (637)
PUT http://boston.roswell.area51/2024/08/23/local_network.png (1273)
PUT http://boston.roswell.area51/2024/08/23/local_network_add.png (1512)
PUT http://boston.roswell.area51/2024/08/23/local_network_remove.png (1460)
PUT http://boston.roswell.area51/2024/08/23/network.png (1702)
PUT http://boston.roswell.area51/2024/08/23/network_add.png (1891)
PUT http://boston.roswell.area51/2024/08/23/network_remove.png (1842)
[spc]lucy:~/source/boston/Lua>
This command on my development server was used to create an entry with multiple images. As you can see, it puts out the URLs that are created as the script runs. And this entry is a test to see if it works on my actual server. It should.
I hope.
Murphy's Law as applied to bugs: it is easier to find bugs in production than in development
Well, that could have gone better.
One bug due to inattention and a difference between development and production, and one “how the XXXX did this ever work in the first place?” bug later, things should be working fine.
I am not going to say “I hope” this time.
The first bug prevented a proper HTTP status code from being generated, so Apache nicely generated a “500 Internal Server Error” for me. Once I identified what was going on, it was a simple one line fix, plus an additional call to assert() to help isolate such errors in the future.
Now on to the other error …
I added the concept of a hook to mod_blog a few years ago, and the scripts I have for the various hooks all start with #!/usr/bin/env lua. Only now they weren't running, and the error that was being logged wasn't exactly helpful: entry-pre-hook='./validate.lua' status=127.
There is no place in mod_blog nor in the validate.lua script that exits with a status code of 127. But the env program does! Nice to learn that.
But back to the issue at hand—I've been using these scripts for a few years now, and only now is it failing? I eventually found out that the path Apache is using is rather limited, and it no longer includes the Lua interpreter (which on my server lives in /usr/local/bin). I had to change both scripts to start with #!/usr/local/bin/lua and that fixed the issue to get the previous post up.
Now that I think about it, I think I know why it finally stopped working after a few years—I actually have an instance of Apache running that I didn't start by hand, and the default path at boot time doesn't include /usr/local/bin.
Sigh
The previous bug fix was buggy.
And yes, programming and deployments can always get this messy.
Saturday, August 24, 2024
How to run valgrind on a CGI program in C
There was still one bug left with mod_blog—it would crash with a memory corruption error (thanks to checking in glibc) when doing a POST. I only found the bug because I was using the old web interface to make sure I had the right credentials when testing the PUT method. How long had the bug existed? At least six years—it's been seven since I last used the web interface (I checked).
It did not crash on the development server due to subtle differences between the operating system and versions of glibc being used. But it is ultimately a memory corruption, so the use of valgrind would be instrumental in finding the issue.
The problem is—it only manifests itself when doing a POST, which means testing the program under a web server. And a web server will pass information about the request to the CGI program through environment variables, with any input coming in via stdin. So just how do you run valgrind on a program meant to be run as a CGI program?
After some thought, I figured out a way. I need to capture the environment the CGI program runs under, so I added the following bit of code to mod_blog:
extern char **environ;

FILE *fp = fopen("/tmp/env.txt","w");
for (size_t i = 0 ; environ[i] != NULL ; i++)
  fprintf(fp,"export %s\n",environ[i]);
fclose(fp);
I wasn't worried about error checking—this is temporary code anyway.
I then do a POST and I now have the environment variables in a file:
...
export GATEWAY_INTERFACE=CGI/1.1
export HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
export CONTENT_LENGTH=149
export CONTENT_TYPE=application/x-www-form-urlencoded
export REQUEST_METHOD=POST
...
The reason I added “export” was to copy these environment variables to a shell script where I can then run valgrind and debug the situation:
...
export GATEWAY_INTERFACE=CGI/1.1
export HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,\*/\*\;q=0.8
export CONTENT_LENGTH=149
export CONTENT_TYPE=application/x-www-form-urlencoded
export REQUEST_METHOD=POST
...

valgrind $HOME/source/boston/src/main <r.stdin
Of course, I had to escape some problematic characters like the asterisk or semicolon, but there were only a few such characters and they could be done by hand. And I had to create the input file fed into stdin, but that was easy enough.
From there, it's straightforward…ish enough to resolve the issues.
Discussions about this entry
- How to run valgrind on a CGI program in C | Lobsters
- How to run Valgrind on a CGI program in C | Hacker News
Tuesday, August 27, 2024
The programmer's obsession with version numbers
It's yet another round of hand wringing about software versioning. Woot!
Over the years, I've found that semantic versioning works for me, but only for code that is to be used in larger projects, like libraries, modules, or classes. Yes, a code base using semantic versioning doesn't always mean the code base follows semantic versioning to the degree that some would like (like any bug fix should automatically update the major version number, because bug fixes could break some code, somewhere). But in my mind, it signals intent, which, sans an extremely obnoxious and overly bureaucratic process, is the best we can expect.
So for me, the MAJOR.MINOR.PATCH of semantic versioning breaks down like this:
- MAJOR
- Some change in the code base was made; either a change in API behavior, removal of some portion of the API, file format, or otherwise any visible change (except for bug fixes) in how the code works. Work will probably be required to update to this version of the library/module/class.
- MINOR
- Only additions to the API were made, in a backward compatible way. No work is expected, unless you were already using a name used in the updated library.
- PATCH
- A bug fix. The intent is for no work required to upgrade, unless you were relying upon the buggy behavior, or used a workaround that might now interfere with the library.
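To make the intent concrete, here's how I'd encode that compatibility contract in code. A hedged sketch in C (the names are mine, not from any particular tool):

```c
#include <stdbool.h>

struct version
{
  int major;    /* breaking changes      */
  int minor;    /* compatible additions  */
  int patch;    /* bug fixes only        */
};

/* Given the version a program was built against and the version now
** installed, is the upgrade expected to be safe under the reading of
** semantic versioning above?
*/
bool upgrade_is_safe(struct version built,struct version installed)
{
  if (installed.major != built.major)
    return false;  /* API changed or removed; work probably required */
  if (installed.minor < built.minor)
    return false;  /* older library; may lack additions we rely upon */
  return true;     /* patch differences are intended to be invisible */
}
```

This is the whole appeal of semantic versioning to me: a three-line function can decide, from the version numbers alone, whether an upgrade is supposed to be safe.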
For applications,
I've found that semantic versioning doesn't work.
At least,
it doesn't work for me.
I've switched to either using a monotonically increasing number
(mod_blog
is now at version 60—but given the five releases in just the past week because of a misplaced obsession with version numbers,
I might entirely stop with the version numbers—especially since I seem to be the only one using it)
or skip it entirely
(my Gemini server has no version number,
but it's had 322 commits over its five year life so far).
The worst form of versioning I've encountered is “named versions.” A “named version” gives no semantic information about the version and, at least to me, leads to confusion. Is “Bulldog” newer than “Beagle?” Or is “Bloodhound” the latest version? Oh, it's “Borzoi” that's less than 20 minutes old. Sheesh.
But if I had to apply “semantic versioning” to an application, I would like information about any breaking changes to either workflow or file formats. To me, an incompatible change to a file format, or any change in workflow (even a change in location of a menu item) is a breaking change (muscle memory is an incredible drug). Hell, even a change in color scheme is enough to possibly break my workflow (I'm looking at you, Google Maps, who changed the color of all the roads to the same color! Why? I liked the distinction between highways, major roads and minor roads. Did you see a 0.001% increase in engagement for that change? Did someone get promoted just because of that change? Okay, I'll stop ranting now).
Discussions about this entry
- The programmer's obsession with version numbers | Hacker News
- The programmer's obsession with version numbers - Lemmy: Bestiverse
- The programmer's obsession with version numbers | Lobsters
Monday, September 09, 2024
How? How do people not know their own email addresses?
I'm still receiving emails from strangers who, for some reason or another, think their email address is “seanconner@gmail.com” (no, really, I get so many emails). This time, it's from someone whose TikTok account was accessed on a new device:
- From
- TikTok <noreply@account.tiktok.com>
- To
- seanconner@gmail.com
- Subject
- New device login detected
- Date
- Tue, 03 Sep 2024 04:56:22 +0000 (UTC)
Hi xs,
We’re writing to inform you that we detected a login to your account from a new device.
When: 09/03 00:56 EDT
Device: iPhone 11
Near: Louisiana

If this was you, you can ignore this message.
If this wasn’t you, open the TikTok app and go to “Settings and privacy” > “Security and login” > “Security alerts” and review unauthorized logins. If you’re unable to access your account, contact TikTok support.
You can also set up 2-step verification to secure your account by going to “Security and login” > “2-step verification”.
Learn more about 2-step verification

This is an automatically generated email. Replies to this email address aren’t monitored.

This email was generated for xs
Privacy Policy
TikTok 5800 Bristol Pkwy, Culver City, CA 90230
It's not a phishing email as the raw message doesn't contain a link to “login” to TikTok, and the links that are in the email all point to the TikTok domain. So I'm at a loss. “xs” created a TikTok account, using my email address …
Why?
I don't know why this should surprise me any more, but it still does. I just … why? How? What's the angle here?
I can't even contact anyone, because TikTok sends email from an address that isn't checked. I don't know who “xs” is, nor do I have any way to contact “xs” to inform them of their error.
Sigh.
Wednesday, September 11, 2024
This could maybe explain some of the emails I received, but not all of them
I received some responses to yesterday's post. The first was from Lionel Dricot, who reported that he, too, has received emails for other Lionel Dricots that lived near him. He also stated that it may be a bug in Gmail where one person can register “seanconner@gmail.com”; someone else could register “sean.conner@gmail.com” but when receiving emails, Gmail condenses the two addresses into one. That's possible, but I would suspect that would have been an issue caught early on. I've had my Gmail account for twenty years now [Has it been that long? —Sean] [Yes, it has. —Editor] [Shut up! —Sean] and it's only been in the past few years that this has been an issue.
He also stated that people could just really be that bad with email addresses.
The second response has an explanation that is rather dire:
- From
- XXXXXXXXXXXXXXXXXX
- To
- seanconner@gmail.com
- Subject
- People non knowing their email
- Date
- Tue, 10 Sep 2024 08:24:51 +0200
Hi, considering you've been shared on Hacker News, I'm afraid sooner or later some script kiddie will start abusing your address just to annoy you. I hope this will never happen ;) Thanks for sharing your blog, ave [sic] a nice day, White_Rabbit
That could explain maybe some of the email I get, which I consider spam that Gmail hasn't filtered yet. But it doesn't explain emails sent to “sean.conner@gmail.com” that are obviously not for a “Sean Conner,” and yet definitely contain private information. It also seems excessive to register a TikTok account or even an Instagram account just to annoy me.
And speaking of TikTok, I finally have an account name associated with the TikTok account, so I have to wonder why “mommakmiller” decided to use my Gmail address. Perhaps they don't have an email account and need one to sign up for TikTok? Perhaps they didn't want to use their real email account? Perhaps they don't realize the danger in that?
Sunday, September 29, 2024
“It's 250 miles inland! What do you mean it was hit by a hurricane?”
I've always thought of Brevard as a place safe from natural disasters, nestled as it is in the Appalachian Mountains. It's not along a fault line so no earthquakes … oh wait a second … there is a fault line nearby. But still, it's too far inland for anything horrible like a hurricane to worry about … oh … wait a second … Hurricane Helene is causing issues in the area.
Perhaps it's not as safe from natural disasters as I once thought it was.
Sigh.
Monday, September 30, 2024
Ch-ch-ch-changes
Last week I got a letter in the mail from my ISP. In it, they said:
In a recent communication, we informed you that we are discontinuing the use of static IP addresses with our home internet service. We need to move your service to a dynamic IP address to avoid service interruption.
…
Please call before 9/30/2024 to avoid any interruption of your internet service. If you do not call, your internet service will be disrupted after this date.
I recall that previous communication. A few months ago I received a letter from my ISP trying to upsell me on a new plan, one that did not support static IP addresses. I called, and when I said that wasn't an option because of work (a slight embellishment on my part) they replied “how unfortunate” since they had no plans to offer one with the new plan. So I did nothing. I was happy with the plan I had (well, kind of still have).
And then last week, the letter above.
Now, I thought, given the wording in the letter, that I had until October 1st, you know, a date after their cut-off date of September 30th. And even then, I probably had several months, given my ISP is a large Enterprise that Does Not Move Fast™. Nope. Their definition of “after this date” was … today!
I only noticed when I couldn't log into my public server. The connection was still up, it's just that I no longer had the same IP address that I've had for the past … um … fourteen? years. I guess they can move fast when they want to.
I also found amusing this line in the letter: “We will switch your service to a dynamic IP address at no additional cost.” Wow! How nice of them. It's almost as if I wasn't already paying more for a static IP address!
Boy, how the Internet has changed over the years. I've gone from 32 publicly addressable IP addresses across a dedicated ISDN link (years before DSL was a thing) to, now, a dynamic IP address. Yes, I know, First World Problem™. It still doesn't make the slide from being a real peer on the Internet to just a cog in the client-server nature of the modern Internet any less annoying.
And then, just three hours after the change, the company I host my public server with called—they're changing where the physical servers reside, and the public IP address I have for my public server will probably change, along with the DNS servers.
Boy, when it rains, it pours.
Tuesday, October 01, 2024
“What were the skies like when you were young?” was not asked by LeVar Burton
I always thought that the opening question in the song “Little Fluffy Clouds” by the Orb (music video) was a sample of LeVar Burton, maybe from his show “Reading Rainbow,” but apparently, it's not! (link via Jason Kottke) Upon relistening to the song, it's now clear to me that it's not LeVar Burton. It's very close, though.
And while I like the song, my favorite one from The Orb is “A Huge Ever Growing Pulsating Brain That Rules from the Centre of the Ultraworld,” which is like a fever dream but in a good way [You're insane! —Bunny] [Yes, but in a good way! —Sean].
Sunday, October 13, 2024
A benchmark of three different floating point packages for the 6809
I recently came across another floating point package for the 6809
(written by Lennart Benschop)
and I wanted to see how it stacked up against IEEE-754 and BASIC floating point math.
To do this,
I wanted to add support to my 6809 assembler,
but it required some work.
There was no support to switch floating point formats—if you picked the rsdos
output format,
you got the Microsoft floating point,
and for the other output formats,
you got IEEE-754 support.
The other issue was that the format used by the new floating point package is ever-so-slightly different from the Microsoft format. It's just a single-bit difference—Microsoft uses an exponent bias of 129, whereas this package uses a bias of 128 (and why do floating point formats use an exponent bias? I'm not entirely sure). But other than this one small difference, they are basically the same.
It turned out not to be that hard to support all three floating point formats.
The output formats still select a default format like before,
but now,
you can use the .OPT
directive to select the floating point formats:
        .opt    * real ieee
        .float  3.14159265358979323846
        .opt    * real msfp
        .float  3.14159265358979323846
        .opt    * real lbfp
        .float  3.14159265358979323846
And you get three different formats as output:
                   | FILE p.asm
                 1 |         .opt    * real ieee
0000: 40490FDB   2 |         .float  3.14159265358979323846
                 3 |         .opt    * real msfp
0004: 82490FDAA2 4 |         .float  3.14159265358979323846
                 5 |         .opt    * real lbfp
0009: 81490FDAA2 6 |         .float  3.14159265358979323846
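Decoding the two 40-bit values makes the single-bit difference concrete. This is my own sketch in C of how I understand both formats (it's not code from either package, and decode40() and its arguments are my invention):

```c
#include <math.h>

/* Decode the 5-byte floating point format: one exponent byte, then a
** 32-bit mantissa whose top bit doubles as the sign (the mantissa's
** leading 1 bit is implied).  The only difference between the two
** formats is the exponent bias: 129 for Microsoft, 128 for Lennart
** Benschop's package.
*/
double decode40(const unsigned char b[5],int bias)
{
  if (b[0] == 0)
    return 0.0; /* an exponent byte of 0 means zero */

  int           sign = (b[1] & 0x80) ? -1 : 1;
  unsigned long mant = ((unsigned long)(b[1] | 0x80) << 24)
                     | ((unsigned long)b[2]          << 16)
                     | ((unsigned long)b[3]          <<  8)
                     |  (unsigned long)b[4];

  /* mant / 2^31 lands in [1,2); the exponent byte, less the bias,
  ** supplies the power of two
  */
  return sign * ldexp(mant / 2147483648.0,b[0] - bias);
}
```

Feeding it the bytes 82 49 0F DA A2 with a bias of 129, or 81 49 0F DA A2 with a bias of 128, yields the same value of π, which is exactly the point: same mantissa, same number, exponent byte off by one.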
I added some code to my floating point benchmark program, which now uses all three formats to calculate -(2π)³/3! and times each one. The new code:
        .opt    * real lbfp
        .tron   timing
lb_fp   ldu     #lb_fp.fpstack
        ldx     #.tau
        jsr     fplod           ; t0 = .tau
        ldx     #.tau
        jsr     fplod           ; t1 = .tau
        jsr     fpmul           ; t2 = .t0 * t1
        ldx     #.tau
        jsr     fplod           ; t3 = .tau
        jsr     fpmul           ; t4 = .t2 * t3
        ldx     #.fact3
        jsr     fplod
        jsr     fpdiv
        jsr     fpneg
        ldx     #.answer
        jsr     fpsto
        .troff
        rts

.tau    .float  6.283185307
.fact3  .float  3!
.answer .float  0
        .float  -(6.283185307 ** 3 / 3!)
.fpstack rmb    4 * 10
The results are interesting (the IEEE-754 results are from the same package, which supports both single and double formats):
format | cycles | instructions |
---|---|---|
Microsoft | 8752 | 2124 |
Lennart | 7465 | 1326 |
IEEE-754 single | 14204 | 2932 |
IEEE-754 double | 31613 | 6865 |
The new code is the fastest so far. The reason it's faster than Microsoft's is (I think) because Microsoft uses a single codebase for all their various BASIC interpreters, so it's not really “written in 6809 assembler” as much as it is “written in 8080 assembler and semi-automatically converted to 6809 assembly,” which explains why Microsoft BASIC was so ubiquitous on 80s machines.
It's also smaller than the IEEE-754 package: a bit over 2K vs. 8K for the IEEE-754 package. It's hard to tell how it compares in size to Microsoft's, because Microsoft's is buried inside a BASIC interpreter, but it wouldn't surprise me if it's smaller, given the number of instructions executed.
Discussions about this entry
- Two Stop Bits | A benchmark of three different floating point packages for the 6809
- A benchmark of three different floating point packages for the 6809 | Hacker News
- A benchmark of three different floating point packages for the 6809 - Lemmy: Bestiverse
Unit testing from inside an assembler, part IV
I'm not terribly happy with how running unit tests inside my assembler works. I mean, it works, as in, it tests the code and shows problems during the assembly phase, but I don't like how you write the tests in the first place. Here's one of the tests I added to my maze generation program (and the routine it tests):
getpixel bsr    point_addr      ; get video address
        comb                    ; reverse mask (since we're reading
        stb     ,-s             ; the screen, not writing it)
        ldb     ,x              ; get video data
        andb    ,s+             ; mask off the pixel
        tsta                    ; any shift?
        beq     .done
.rotate lsrb                    ; shift color bits
        deca
        bne     .rotate
.done   rts                     ; return color in B

        .test
        .opt    test pokew ECB.beggrp , $0E00
        .opt    test poke  $0E00 , %11_11_11_11
        lda     #0
        ldb     #0
        bsr     getpixel
        .assert /d = 3
        .assert /x = @@ECB.beggrp
        lda     #1
        ldb     #0
        bsr     getpixel
        .assert /d = 3
        .assert /x = @@ECB.beggrp
        lda     #2
        ldb     #0
        bsr     getpixel
        .assert /d = 3
        .assert /x = @@ECB.beggrp
        lda     #3
        ldb     #0
        bsr     getpixel
        .assert /d = 3
        .assert /x = @@ECB.beggrp
        rts
        .endtst
The problem is that the machine code for the tests is included in the final binary output,
which I don't want,
and which is why I can't just set an option to run the tests in addition to assembling the code into its final output
(it also means that when I use the test backend,
I tend to generate the output to /dev/null
).
I've also found that I prefer table-style tests to writing code
(for reasons way beyond the scope of this entry).
For example,
for a C function like this:
int max_monthday(int year,int month)
{
  static int const days[] = { 31,0,31,30,31,30,31,31,30,31,30,31 } ;

  assert(year  > 1969);
  assert(month >    0);
  assert(month <   13);

  if (month == 2)
  {
    /*----------------------------------------------------------------------
    ; in case you didn't know, leap years are those years that are divisible
    ; by 4, except if it's divisible by 100, then it's not, unless it's
    ; divisible by 400, then it is.  1800 and 1900 were NOT leap years, but
    ; 2000 is.
    ;----------------------------------------------------------------------*/

    if ((year % 400) == 0) return 29;
    if ((year % 100) == 0) return 28;
    if ((year %   4) == 0) return 29;
    return 28;
  }
  else
    return days[month - 1];
}
I would prefer to write test code like:
output | year | month |
---|---|---|
28 | 1900 | 2 |
29 | 2000 | 2 |
28 | 2100 | 2 |
29 | 1904 | 2 |
29 | 2104 | 2 |
28 | 2001 | 2 |
Just specify the inputs and outputs for some corner cases, and let the computer do what is necessary to call the function in question.
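For what it's worth, that table translates almost directly into C. A sketch (the function is repeated so the example is self-contained; note I dropped the year > 1969 assertion so the pre-1970 corner cases from the table can actually run):

```c
#include <assert.h>
#include <stddef.h>

/* The function under test, minus the year > 1969 assertion so that the
** 1900-era corner cases in the table below can run.
*/
static int max_monthday(int year,int month)
{
  static int const days[] = { 31,0,31,30,31,30,31,31,30,31,30,31 };

  assert(month >  0);
  assert(month < 13);

  if (month == 2)
  {
    if ((year % 400) == 0) return 29;
    if ((year % 100) == 0) return 28;
    if ((year %   4) == 0) return 29;
    return 28;
  }
  else
    return days[month - 1];
}

/* One row per test case: the expected output first, then the inputs */
struct testcase { int expected , year , month ; };

static struct testcase const cases[] =
{
  { 28 , 1900 , 2 },
  { 29 , 2000 , 2 },
  { 28 , 2100 , 2 },
  { 29 , 1904 , 2 },
  { 29 , 2104 , 2 },
  { 28 , 2001 , 2 },
};

/* The driver does “what is necessary to call the function in question” */
static void run_cases(void)
{
  for (size_t i = 0 ; i < sizeof(cases) / sizeof(cases[0]) ; i++)
    assert(max_monthday(cases[i].year,cases[i].month) == cases[i].expected);
}
```

The appeal is that adding a test case is adding one row; the calling convention lives in one place, the driver.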
But it's not so easy with assembly language,
given the large number of ways to pass data into a function,
and the number of output results one can have.
How would I specify that the inputs come in registers A and B,
and the outputs come in A, B and X?
The above could be done in a table format,
I guess.
It might not be pretty,
but it's doable.
Then there's these subroutines and their associated tests:
;***********************************************************************
;       RND4            Generate a random number 0 .. 3
;Entry: none
;Exit:  B - random number
;***********************************************************************

rnd4    dec     rnd4.cnt        ; any more cached random #s?
        bpl     .cached         ; yes, get next cached number
        ldb     #3              ; else reset count
        stb     rnd4.cnt
        bsr     random          ; get random number
        stb     rnd4.cache      ; save in the cache
        bra     .ret            ; and return the first number
.cached ldb     rnd4.cache      ; get cached value
        lsrb                    ; get next 2-bit random number
        lsrb
        stb     rnd4.cache      ; save remaining bits
.ret    andb    #3              ; mask off our result
        rts

;***********************************************************************
;       RANDOM          Generate a random number
;Entry: none
;Exit:  B - random number (1 - 255)
;***********************************************************************

random  ldb     lfsr
        andb    #1
        negb
        andb    #$B4
        stb     ,-s             ; lsb = -(lfsr & 1) & taps
        ldb     lfsr
        lsrb                    ; lfsr >>= 1
        eorb    ,s+             ; lfsr ^= lsb
        stb     lfsr
        rts

        .test
        ldx     #.result_array
        clra
        clrb
.setmem sta     ,x+
        decb
        bne     .setmem
        ldx     #.result_array + 128
        lda     #1
        sta     lfsr
        lda     #255
.loop   bsr     random
        .assert /b <> 0 , "degenerate LFSR"
        .assert @/b,x = 0 , "non-repeating LFSR"
        inc     b,x
        deca
        bne     .loop
        clr     ,x
        clr     1,x
        clr     2,x
        clr     3,x
        lda     #255
.chk4   bsr     rnd4
        .assert /b >= 0
        .assert /b <= 3
        inc     b,x
        deca
        bne     .chk4
        .tron
        ldb     ,x              ; to check the spread
        ldb     1,x             ; of results, basically
        ldb     2,x             ; these should be roughly
        ldb     3,x             ; 1/4 of 256
        .troff
        .assert @/,x + @/1,x + @/2,x + @/3,x = 255
        rts
.result_array rmb 256
        .endtst

        .test   "whole program"
        .opt    test pokew $A000 , KEYIN
        .opt    test pokew $FFFE , END
        .opt    test prot  r,$A000,$A001
        lbsr    start
KEYIN   lda     #'Q'
END     rts
        .endtst
And … just ugh.
I mean,
this checks that the 8-bit LFSR I'm using to generate random numbers actually doesn't repeat within its 255-period cycle,
and that the number of 2-bit random numbers I generate from RND4
is more or less evenly spread,
and for both of those,
I use an array to store the intermediate results.
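The property being tested, that the LFSR with taps $B4 visits all 255 nonzero states before repeating, is easier to see in a high-level model. Here's a sketch in C of the same update rule as the RANDOM routine above (the function names are mine):

```c
#include <stdint.h>

/* One step of the 8-bit Galois LFSR from the RANDOM routine above:
**   lsb  = -(lfsr & 1) & $B4
**   lfsr = (lfsr >> 1) ^ lsb
*/
uint8_t lfsr_step(uint8_t lfsr)
{
  uint8_t lsb = (uint8_t)(-(lfsr & 1) & 0xB4);
  return (uint8_t)((lfsr >> 1) ^ lsb);
}

/* Count how many steps it takes to return to the seed value.  The cap
** is just a safety net; a proper LFSR always returns to its seed.
*/
int lfsr_period(uint8_t seed)
{
  uint8_t state  = seed;
  int     period = 0;

  do
  {
    state = lfsr_step(state);
    period++;
  } while((state != seed) && (period < 1000));

  return period;
}
```

Taps of $B4 correspond to the polynomial x⁸+x⁶+x⁵+x³+1, which is primitive, so the period from any nonzero seed is the full 255; that's exactly what the non-repeating assert is checking the hard way.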
I'm leery about including an interpreter just for the tests,
because I don't think it would be any better.
At least the test code is largely written in the target language of 6809 assembly.
Then again, I could embed Lua, and write the tests like:
        .test
          local array = {}
          for i = 0 , 255 do array[i] = 0 end
          mem['lfsr'] = 1
          for i = 1 , 255 do
            call 'random'
            assert(cpu.B ~= 0)
            assert(array[cpu.B] == 0)
            array[cpu.B] = 1
          end
          array[0] = 0
          array[1] = 0
          array[2] = 0
          array[3] = 0
          for i = 1 , 255 do
            call 'rnd4'
            assert(cpu.B >= 0)
            assert(cpu.B <= 3)
            array[cpu.B] = array[cpu.B] + 1
          end
          assert(array[0] + array[1] + array[2] + array[3] == 255)
        .endtst
I suppose?
I would still need to somehow code the fake KEYIN
and END
routines required for the test.
And the first test at the start of this post would then look like:
        .test
          memw['ECB.beggrp'] = 0x0E00
          mem[0x0E00] = '%11_11_11_11'
          cpu.A = 0
          cpu.B = 0
          call 'getpixel'
          assert(cpu.D == 3)
          assert(cpu.X == memw['ECB.beggrp'])
          cpu.A = 1
          cpu.B = 0
          call 'getpixel'
          assert(cpu.D == 3)
          assert(cpu.X == memw['ECB.beggrp'])
          cpu.A = 2
          cpu.B = 0
          call 'getpixel'
          assert(cpu.D == 3)
          assert(cpu.X == memw['ECB.beggrp'])
          cpu.A = 3
          cpu.B = 0
          call 'getpixel'
          assert(cpu.D == 3)
          assert(cpu.X == memw['ECB.beggrp'])
        .endtst
which isn't any longer than the original test, but still … ugh. But doing this means I won't have 6809 code for testing in the final output, which means I could run tests with any backend.
I'll have to think on this.
Discussions about this entry
- Unit testing from inside an assembler | Lobsters
- Unit testing from inside an assembler - Lemmy: Bestiverse
Wednesday, October 16, 2024
Unit testing from inside an assembler, part V
I've made some major changes to my 6809 assembler, mostly to how the unit testing works. The first change was to assemble the test code elsewhere in memory, far from the code being tested (there's a default address that can be changed via an .OPT directive). This separation makes it easier to keep the test code out of the final assembled artifact. The next change was to remove the “unit test feature” as a distinct output type and make it work with the rest of the output formats (like the flat binary format, the Color Computer executable format, or the Motorola SREC format). But aside from the internal changes, no changes were required of any existing 6809 assembly code with tests.
And I didn't have to resort to embedding Lua to run the tests.
I consider that a success.
Less successful were the multiple attempts to come up with a table-driven format to describe tests, which wasted most of the time I've spent on this so far. I did devise a format I liked:
        .test   "atod"
        .case   atod /x := zero    /d = 0     , 'zero'
        .case   atod /x := nine    /d = 9     , 'nine'
        .case   atod /x := dechalf /d = 3276  , 'half'
        .case   atod /x := decfull /d = 65535 , 'full'
        .case   atod /x := none    /d = 0     , 'none'
zero    ascii   "0"z
nine    ascii   "9"z
dechalf ascii   "3276"z
decfull ascii   "65535"z
none    ascii   ""z
        .endtst
which would replace:
        .test   "atod"
        ldx     #zero
        jsr     atod
        .assert /d = 0 , 'zero'
        ldx     #nine
        jsr     atod
        .assert /d = 9 , 'nine'
        ldx     #dechalf
        jsr     atod
        .assert /d = 3276 , 'half'
        ldx     #decfull
        jsr     atod
        .assert /d = 65535 , 'full'
        ldx     #none
        jsr     atod
        .assert /d = 0 , 'none'
        rts
        .endtst
Input parameters would be marked by the assignment operator “:=” so it would be “easy” to distinguish pre-initializations from post-checks,
but the word “easy” is carrying a lot of weight there.
Any more inputs or outputs would stretch the line out,
and given that I don't support line continuations with a “\” and have a hard line length limit,
never mind that the mini-Forth engine that handles the .ASSERT
directives doesn't support assignment and is “attached” to the 6809 emulator hitting a particular address and … yeah … I see a lot of work ahead of me if I want to support this format.
More thinking is required.
I think people may be taking the LinkedIn questions not as seriously as LinkedIn would like
I was posting my previous entry to LinkedIn when, this time, I was asked: “You're one of the few experts invited to answer: Your team is hesitant about new web app technologies. How can you overcome their resistance to change?” And the first response from one of the “experts” that answered was: “Threaten to fire them.”
Wow!
And to think I was scared to answer sarcastically.
Although the “expert” answering is a CTO, so I'm not sure if I should read that as “sarcastic” or “too honest.” I'll assume “sarcastic” until I learn otherwise.
Tuesday, November 05, 2024
A death on Hallowe'en
I learned on Monday, while catching up on his wife's MyInstaTikFaceLinkedMePinBookSpaceTokInGramWeTrest feed, that my friend Duke Parrish had died. He died on October 31st. Hallowe'en.
He was five days younger than I.
It wasn't a sudden accident or the result of a horrible crime. No. He had a genetic disorder diagnosed way too late and it caught up to him.
I am still processing this.
Outside of family, he was the one friend whom I've known the longest. His parents were friends of my parents. We were born in the same hospital. So I'm sure we met before we could even remember such things.
But my family moved about a lot when I was very young, so I only really remember him when we moved to Brevard when I was five. I recall spending lots of time at his house in Connestee Falls, up on a mountain. We would spend hours running around in the forest just outside his house, knee deep in leaves, only to make sure we were back at 4:00 pm sharp, to watch Batman on channel 4 (“Same Bat time! Same Bat channel!”).
Oh, on very rare occasions we would sneak into his older brother's bedroom and peek at his brother's hidden collection of Playboy magazines. As you do.
And I can still taste the Chips Ahoy! cookies we snacked on.
I know we were both sad when my Mom decided to up and move to Florida five years later. And from August of 1979 to December of 2012, when Bunny and I travelled to Brevard for the first of our annual trips, I did not see Duke at all. It was then that we reconnected.
I am still processing this.
It's one thing when someone way older than you dies. That's expected—the old die eventually. Or if someone with a terminal illness dies, it's usually known in advance. It's not as shocking.
But as I said, Duke was five days younger than I. Then again, it was weird when my “known-him-almost-as-long” friend Hoade became a grandfather. What? Hoade is a month younger than I. And then another friend I've known for over twenty years had a stroke—he's a few months younger than I. And another friend I've known for over twenty years had heart surgery—he's a few months older than I. And yet another friend I've “known-him-almost-as-long-as-Hoade” is on dialysis.
I'm coming to the horrible realization that I'm not as young as I once was. When did that happen?
I'll miss you Duke.