The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, Debtember 08, 2023

The Gopher Situation, part II Unicode Booglaloo

The lead I thought I had was a red herring. I thought it may have had something to do with reporting errors back to the client as seen from the logs:

Dec 04 21:44:38	daemon	info	gopher	maxco=1 runco=0 toq=0 sc=1 mem=3012577
Dec 04 21:46:23	daemon	err	gopher	stat("/home/spc/gopher/share/MGLNDD_71.19.142.20_70") = No such file or directory
Dec 04 21:46:23	daemon	info	gopher	remote=XXXXX­XXXXX­XXXXX status=false request="MGLNDD_71.19.142.20_70" bytes=68
Dec 04 21:49:38	daemon	info	gopher	maxco=2 runco=0 toq=0 sc=1 mem=2903218
Dec 04 21:50:52	daemon	info	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="CONNECT HTTP/1.1" bytes=562
Dec 04 21:54:38	daemon	info	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3035838

Notice how maxco (total number of coroutines) increments and stays that way after a failed request. And the evidence was pretty convincing too:

Dec 04 22:19:38	daemon	info	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3411531
Dec 04 22:23:44	daemon	info	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="Phlog:2010/03/08" bytes=189
Dec 04 22:24:38	daemon	info	gopher	maxco=2 runco=0 toq=0 sc=1 mem=3185119
Dec 04 22:24:39	daemon	info	gopher	remote=XXXXX­XXXXX­XXXXX status=false request="\3\0\0/*?\0\0\0\0\0Cookie: mstshash=Administr" bytes=82
Dec 04 22:25:57	daemon	info	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="Phlog:2006/11/19.1" bytes=1028
Dec 04 22:29:38	daemon	info	gopher	maxco=3 runco=0 toq=0 sc=1 mem=3242133
Dec 04 22:33:31	daemon	info	gopher	remote=XXXXX­XXXXX­XXXXX status=true request="Phlog:" bytes=978
Dec 04 22:34:38	daemon	info	gopher	maxco=3 runco=0 toq=0 sc=1 mem=3207881

Error reporting with the gopher protocol is clearly an afterthought. The official RFC has two occurances of the word “error” in it—and one of them is redundant. I did read somewhere (that's difficult to find now) that perhaps gopher should simple close the connection upon an error instead of sending an “error” to the client, so I thought I would try that. Instead of sending:

3Selector not foundHTfooHTgopher.conman.orgHT70CRLF

I would just close the connection.

That didn't work. The gopher server was still getting stuck. Attaching gdb to the stuck process didn't show anything, as the Lua executable I was using didn't have debugging symbols. So then I recompiled Lua and the modules used that were written in C to include debugging information and restarted the server, yet again.

So now I think I think I found the root issue. Attaching gdb this time showed the server was stuck in LPEG. Even better, I could see the text it was trying to parse and well … previously I said, “[t]he code hasn't changed since April.” That's not quite true. The server code hadn't changed since April, but an extension had! Back in late October I modified the code that renders my blog on gopher to use Unicode combining characters to do some typographical tricks, and it seems that the code used to wrap the text just … wasn't up to par (Unicode is hard! Let's go to Mars!).

I also noticed that my Gemini server had finally crashed—hard. And I changed that too, to use Unicode typographical tricks. So out it comes!

Let's see if this was the problem.

Wednesday, Debtember 06, 2023

Unit testing from inside an assembler, part III

I'm done with the “unit testing” backend for my 6809 assembler. The mini-Forth engine is working out fine, although the number of words increased from 41 to 47 to support some conveniences (like indexing and string comparison). It took some work to support, but the number of assertions one can make in the code is extensive. For example, a test case for this bit of code (which I do need to discuss, but that's a post for another time) looks like this:

test		sts	[$3333,x]
.next		pshs	pc,u,y,x,dp,b,a,cc

	.test	"STS"

		ldx	#.results
		ldy	#test
		jsr	init

	.assert	/x        = .results , "X=results"
	.assert	/y        = .next    , "Y=next"
	.assert	@@/0,x    = .address
	.assert	@@/2,x    = .opcode
	.assert	@@/4,x    = .operand
	.assert	@@/6,x    = .topcode
	.assert	@@/8,x    = .toperand
	.assert @.nowrite = $12         , "overwrite"
	.assert	@/-47,s   = $01		, "stack mod?"
	.assert	.address  = "0800"z     , "hex address"
	.assert	.opcode   = "10EF"z     , "hex opcode"
	.assert	.operand  = "993333"z   , "hex operand"
	.assert .topcode  = "STS"z      , "decoded opcode"
	.assert	.toperand = "[3333,X]"z , "decoded operand"


.results	fdb	.address
		fdb	.opcode
		fdb	.operand
		fdb	.topcode
		fdb	.toperand
.address	rmb	5
.opcode		rmb	5
.operand	rmb	7
.topcode	rmb	9
.toperand	rmb	19
.nowrite	nop


The code being tested is a 6809 disassembler written in 6809 assembly code (I wrote that a few years back—any testing now is academic at this point). The .TEST directive takes an optional string as the name of the test. If one isn't given, it will use the last non-local label seen in the source code as the name of the test. The first two lines:

	.assert	/x        = .results , "X=results"
	.assert	/y        = .next    , "Y=next"

assert that the X register points to .results and the Y register points to .next. I use the leading slash to denote a register instead of a label. One can use register names for labels and it's mostly unambiguous as the register is typically part of the mnemonic itself. The only exception is for the A, B and D registers, and then, only in the index addressing mode, as you can use the A, B or D register for an offset. But in the context of the .ASSERT directive it makes it easier to parse the intent if I use '/' to designate a register. Each register, and each bit in the condition code register (like /cc.z for the zero-flag) can be used. The bit after the comma, “X=results”, will be printed if the check fails:

test-disasm.asm:7: warning: W0015: STS:13 X=results: test failed:

(there can be text after the “test failed” bit, thus the colon).

The next few lines:

	.assert	@@/0,x    = .address
	.assert	@@/2,x    = .opcode
	.assert	@@/4,x    = .operand
	.assert	@@/6,x    = .topcode
	.assert	@@/8,x    = .toperand

assert the contents of memory pointed to by X. The double “@” fetches 16 bits from the address following, and in the first line, this is the address in the X register. The second line retrieves the 16 bits from the address two bytes past where the X register points to. You could write these lines as:

	.assert	@@(/x + 2) = .opcode

but a little syntactic sugar never hurts, and it mimics the native method of using the index registers. This was possibly the hardest bit of code to write, as the index addressing mode of the 6809, while great from an assembly programmer's perspective, is a nightmare from an assembler-implementer's perspective. Even here, where it's simplified, was a pain to get right, but I think it was worth it.

The next two lines:

	.assert @.nowrite = $12         , "overwrite"
	.assert	@/-47,s   = $01		, "stack mod?"

check that the given addresses, nowrite and a byte down in the system stack, contain certain 8-bit values. Each byte of the memory in the virtual 6809 system is filled with the value 1 (it can be changed on the command line), so here, each untouched byte will contain a 1. I picked that value since it's an illegal opcode, which the emulator will trap.

The final few lines:

	.assert	.address  = "0800"z     , "hex address"
	.assert	.opcode   = "10EF"z     , "hex opcode"
	.assert	.operand  = "993333"z   , "hex operand"
	.assert .topcode  = "STS"z      , "decoded opcode"
	.assert	.toperand = "[3333,X]"z , "decoded operand"

does indeed, do a string compare. And therein lies a tale. Again, this is a form of syntactic sugar:

	.assert	@.address=$30 && @(.address+1)=$38 && @(.address+2)=$30 && @(.address+3)=$30 && @(.address+4)=0

This was the second hardest bit to to support, is a bit fragile, and, if I'm honest, a hack. The string literal has to be on the right hand side of the conditional, and worse, there's no easy way to enforce this in the assembler (so I currently don't). Third, the second string has to be a literal string—you can't compare two different memory regions from the 6809 VM. There's also a limit of only one string literal per .ASSERT directive, again, because supporting more than one would vastly complicate the already somewhat complicated code (this “unit test“ backend is already 30% of the entire assembler).

To keep from having to add a ton of code for the conditional checks to support two different primitive types, or to keep from having to create a duplicate set of string conditionals, I cheated (or came up with a brilliant hack—take your pick). The code generated is:

VM_LIT .address

That VM_SCMP is hiding things—it knows which string literal to use (as it's part of the VM program and there's only space for one string literal per .ASSERT directive) but it also leaves two values on the stack: -1,0 if the result is less than, 1,0 if the result is greater than, and 0,0 if the result is equal. This way, the conditional operators can work as is.

Oh, those “z”s on the end of each string literal? Well, the assembler supports several methods of storing string data in memory. There's the standard C NUL terminated strings; the OS-9 method of setting bit 7 of the last character of the string, and the sometimes used method where the first character of the string is actually the length. I originally had separate non-standard directives to support these methods, so when I wanted to support string-comparisons, I needed a way to support these methods. Then it hit me—the use of a suffix on the string—“Z” for the NUL terminated one (“Z” stands for “zero”), “H” for the bit 7 set (“H” for “high-bit”) and “C” for counted strings. And if I'm using the suffixes for the “unit test” backend, why not in general? So I replaced the .ASCIIZ and .ASCIIH directives (I was contemplating adding counted strings but I never got around to adding .ASCIIC) with just .ASCII and the use of a suffix (no suffix, string is left as-is).

So, back on track. The expressions can get quite involved. Some examples:

 .assert /b             = -(@lfsr & 1) & $B4
 .assert @tvalue        = $10*3+(1<<3)+2*2+(7-5)+1
 .assert @@(tvalue + 1) = $10+3+1<<3+2*2+7-5+1

You are also not limited to using the .ASSERT, .TRON and .TROFF directives inside a .TEST directive. You can put them anywhere in the codebase, and if that code is executed as part of a “unit test”, they'll trigger (and if you aren't using the “unit test” backend, they're ignored outright).

There are other changes too—each backend will parse its own command line options, I added some new warnings (such as a waring for self-modifying code), and the memory of the virtual 6809 can have various protections (read-only, write-only, execute-only, trace) set from the command line for further testing.

Now I just need to update the README.txt file and release the code.

Tuesday, Debtember 05, 2023

Notes on an overheard conversation at 4:15 am


“Hey! I heard the screen door slam! Could you check the front porch?”

“Who would be here at … 4:00 am‽”

“Probably an Amazon delivery. I ordered a white noise machine last night.”


“Just go look!”

“Hmm … whoa! Lovely! They left the package right on the door step. Sorry about stepping on it. And yes, it's from Amazon.”

“It arrived! My white noise machine!”

“When did you say you ordered it?”

“9:00 pm. They said it would arrive shortly.”

“That's insane!”

“That's Amazon.”

Monday, Debtember 04, 2023

The Gopher Situation

Over the past few days, I've been battling a pernicious bug in my gopher server wherein it becomes CPU bound and cause other issues on the server. I then have to go in and kill the gopher process (and the one time I couldn't even do that—I had to have the virtual server restarted). I initially attributed this to an over-aggressive bot crawling my site and blocked it with iptables but even that didn't solve the issue.

The problem is—nothing to my knowledge has changed on my virtual server, nor the server that it is running under, nor the network it's on. My Gemini server gets way more traffic than my gopher site and it's fine, and the only difference between the two is—the Gemini server uses TLS but otherwise, is nearly identical to the gopher server.

It's very odd.

The other day I added some code (in a branch, not in the main line version) to log memory usage, number of threads (technically, Lua coroutines), number of running threads, number of waiting threads, and number of active sockets. And since adding that, the gopher server has been running fine, but just now I do see a potential problem—the number of threads is two higher than the number of actual connections, which “shouldn't” happen.

Woot! I now have a lead on the problem!

But I do wonder what recently caused the issue? The code hasn't changed since April, and now I'm wondering if my Gemini server has a similar issue, since the code bases are similar in nature.

I've been blogging for 757,382,400 seconds

Wow! It's been a full 365 days since my blog went under the lock! And it's been a full 8,766 days since I started blogging. Here's to another 8,766 days!

Friday, Debtember 01, 2023

Unit testing from inside an assembler, part II

I started working on unit tests from inside the assembler. I'm not sure how MOS does it (as I don't read Rust) so I'm making this up as I go along. I'm using the following file as a test case for the work:

lfsr		equ	$F6

		org	$4000
start		bsr	random

the.byte	fcb	$55
the.word	fdb	$AAAA

;	RANDOM		Generate a random number
;Entry:	none
;Exit:	B - random number (1 - 255)

random		ldb	lfsr
		andb	#1
		andb	#$B4
		stb	,-s
		ldb	lfsr
		eorb	,s+
		stb	lfsr

	.test	"random"
		ldx	#.result_array
.setmem		sta	,x+
		bne	.setmem
		ldx	#.result_array + 128
		lda	#1
		sta	lfsr
		lda	#255
.loop		bsr	random
	.assert	/B <> 0 , "degenerate LFSR"
		tst	b,x
	.assert	/CC.z <> 1 , "non-repeating"
		inc	b,x
		bne	.loop
	.assert @the.byte == $55 && @@the.word == $AAAA , "tis a silly test"
.result_array	rmb	256




		end	start

I've made the “unit test” … thing, a backend (like I have for binary and Color Computer-specific output as backends) because it's less intrusive on the code and I wasn't sure where to assemble the test code (within the memory space of the 6809). By making this a specific backend, it should be apparent that this is not for the final version of the code.

So far, I have it such that all the non-test backends don't see the code at all:

                         | FILE test.asm
                       1 | 
                       2 | lfsr            equ     $F6
                       3 | 
                       4 |                 org     $4000
4000: 8D    04         5 | start           bsr     random
4002: 39               6 |                 rts
                       7 | 
4003: 55               8 | the.byte        fcb     $55
4004: AAAA             9 | the.word        fdb     $AAAA
                      10 | 
                      11 | ;***********************************************
                      12 | ;       RANDOM          Generate a random number
                      13 | ;Entry: none
                      14 | ;Exit:  B - random number (1 - 255)
                      15 | ;***********************************************
                      16 | 
4006: D6    F6        17 | random          ldb     lfsr
4008: C4    01        18 |                 andb    #1
400A: 50              19 |                 negb
400B: C4    B4        20 |                 andb    #$B4
400D: E7    E2        21 |                 stb     ,-s
400F: D6    F6        22 |                 ldb     lfsr
4011: 54              23 |                 lsrb
4012: E8    E0        24 |                 eorb    ,s+
4014: D7    F6        25 |                 stb     lfsr
4016: 39              26 |                 rts
                      27 | 
                      28 |         .test   "random"
                      29 |                 ldx     #.result_array
                      30 |                 clra
                      31 |                 clrb
                      32 | .setmem         sta     ,x+
                      33 |                 decb
                      34 |                 bne     .setmem
                      35 |                 ldx     #.result_array + 128
                      36 |                 lda     #1
                      37 |                 sta     lfsr
                      38 |                 lda     #255
                      39 |         .tron
                      40 | .loop           bsr     random
                      41 |         .assert /B <> 0 , "degenerate LFSR"
                      42 |                 tst     b,x
                      43 |         .assert /CC.z <> 1 , "non-repeating"
                      44 |         .troff
                      45 |                 inc     b,x
                      46 |                 deca
                      47 |                 bne     .loop
                      48 |         .assert @the.byte == $55 && @@the.word == $AAAA , "tis a silly test"
                      49 |                 rts
                      50 | .result_array   rmb     256
                      51 | 
                      52 |         .endtst
                      52 |         .endtst
                      53 | 
4017: 12              54 |                 nop
                      55 | 
                      56 | ;***********************************************
                      57 | 
                      58 |                 end     start

    2 | equate      00F6     3 lfsr
   17 | address     4006     1 random
    5 | address     4000     1 start

Ignore that line 52 shows up twice here—that's a bug that I'll work on (my initial fix removed the duplicate line, but line 51 didn't show up—it's not a show-stopping bug which I why it's going on the “fix it later” list). Also, the labels the.byte and the.word don't show up on the symbol list at the end due to a “feature” where labels that aren't referenced aren't printed (that was to remove unused equates from the symbol list). So for the non-test backends, the actual testcase isn't part of the build.

The other added directives, like .tron, .troff and .assert are also ignored by the other backends if the directives appear outside a “unit test.”

With the .test backend though, all the directives are recognized and most of them work, although I'm still working on .assert (see below).

One issue—when to run the actual tests. Right now, the code is run when then .endtst directive is hit, as running the code as it's assembled won't work well I think, especially with branches and calls to other routines, and it would be a nightmare to get correct. It's easier if all the code exists in “memory,” but one issue I've noticed is that any code further down in the file can't be used. I'll have to move the execution of tests to after the assembly pass is done.

The .tron and .troff directives work, dumping out the instructions between them as the code is run:

... lots of lines cut
PC=402A X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=09 B=D2 CC=-f-i---c | 402A 8D   DA     - BSR   4006               ; ----- backwards 
PC=402C X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=09 B=69 CC=-f-i---- | 402C 6D   85     - TST   B,X                ; -aa0- 411D = 00
PC=402A X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=08 B=69 CC=-f-i---- | 402A 8D   DA     - BSR   4006               ; ----- backwards 
PC=402C X=40B4 Y=0000 U=0000 S=7FFE DP=00 A=08 B=80 CC=-f-in--c | 402C 6D   85     - TST   B,X                ; -aa0- 4034 = 00
... more lines cut

Another issue is dealing with the .assert directive. I have to save the test somehow since the assembler can't do the check when it parses the .assert because not all the code for the test has been assembled yet. I could store the text to the test expression and then evaluate it at run time, but as this code shows, that would mean re-interpreting the text many, many times. No, the solution I came up with is a mini-Forth-like language for evaluating the test expression.

Yup, I'm embedding a mini-Forth interpreter in a 6809 assembler written in C.

A classic blunder I'm sure, like getting involved in a land war in Asia, or going against a Sicilian when death is on the line, but I'm not sure of any other way. The mini-Forth is very small though, only 41 words are defined, but it's enough for my needs. The first .assert expression translates to:

VM_CPUB		( push contents of the B register onto the stack )
VM_LIT 0	( push a literal 0 onto the stack )
VM_NE		( compare the two, leaving a flag on the stack )
VM_EXIT		( exit the VM )

The second one to:

VM_CPUCCz	( push the CC zero flag )
VM_LIT 0	( push a literal 0 )
VM_NE		( compare the two, leave flag on stack )
VM_EXIT		( exit the VM )

And the last one to:

VM_LIT 0x4003	( push the literal 0x4003 )
VM_AT8		( fetch the byte from the 6809 memory buffer )
VM_LIT 0x55	( push the literal 0x55 )
VM_EQ		( compare the two, leave flag on stack )
VM_LIT 0x4004	( push the literal 0x4004 )
VM_AT16		( fetch two bytes from the 6809 memory buffer )
VM_LIT 0xAAAA	( push the literal 0xAAAA )
VM_EQ		( compare the two, leave flag on stack )
VM_LAND		( AND the two results, leaving flag on stack )
VM_EXIT		( exit the VM )

This works, and it was easy to implement the VM. Now all I have to do is parse the expression to assemble the VM code (right now the addresses and VM functions are hard coded into the assembler just to prove it works).

This feature is proving to be an interesting problem.

Wednesday, November 29, 2023

Unit testing from inside an assembler

Plug plug: I've written an assembler[0] for the 6502 (with full LSP and debugging support). It also supports the concept of unit tests whereby your program gets assembled and every test individually gets assembled and run, whereby you can add certain asserts to check for CPU register states and things like that.

[0] See

Plug plug: I've written an assembler[0] for the 6502 (with full LSP and debuggin... | Hacker News

This comment (from the Orange Site about a previous post) grabbed my attention. I'm fascinated by the feature, and I think that's because the test is run in the assembler! (As a side note—I think they missed an opportunity by not using TRON to enable tracing) I'm thinking I might try to add a feature to my my assembler, as I've already written a 6809 emulator as a library.

If I already had this feature (and riffing off the sample), how might this look? What are some of the issues that might come up? I marked up the random function as I might have done during testing:

;	RANDOM		Generate a random number
;Entry:	none
;Exit:	B - random number (1 - 255)

random		ldb	lfsr
		andb	#1
		andb	#$B4
		stb	,-s		; lsb = -(lfsr & 1) & taps
		ldb	lfsr
		lsrb			; lfsr >>= 1
		eorb	,s+		; lfsr ^=  lsb
		stb	lfsr

	; --------------------

	.test	"random"
		ldx	#.result_array + 128
		lda	#1
		sta	lfsr
		lda	#255
.loop		bsr	random
	.assert	cpu.B <> 0 , "degenerate LFSR"
		tst	b,x
	.asert	cpu.CC.z <> 1
		inc	b,x
		bne	.loop
.result_array	rmb	256


First off, I would have the tracing always print results—that way I can follow the flow to help see the issue. One open question—would that be a command line option? Or as I have it here—a pseudo operation? Second, how would I return from the code? The sample I'm going off uses BRK (the 6502 software interrrupt instruction). I suppose I could use SWI but I would also want to fill unused memory with that instruction in case the code goes off into the weeds, so I would need a way to detect the difference. I don't want to juse use .endtest to end the code sequence, as I might also want to include variables, like I did here.

Another example, this time the function that had the bug in it:

;	GETPIXEL	Get the color of a given pixel
;Entry:	A - x pos
;	B - y pos
;Exit:	X - video address
;	A - 0
;	B - color

getpixel	bsr	point_addr	; get video address
		comb			; reverse mask (since we're reading
		stb	,-s		; the screen, not writing it)
		ldb	,x		; get video data
		andb	,s+		; mask off the pixel
		tsta			; any shift?
		beq	.done
.rotate		lsrb			; shift color bits
		bne	.rotate
.done		rts			; return color in B

	.test	"getpixel"
		ldd	#.screen
		std	ECB.beggrp
		lda	#0		; X
		lda	#0		; Y
		bsr	getpixel
	.assert cpu.X = #.screen
	.assert	cpu.B = 3
		lda	#1
		ldb	#0
		bsr	getpixel
	.assert cpu.X = #.screen
	.assert	cpu.B = 3
		lda	#2
		ldb	#0
		bsr	getpixel
	.assert	cpu.X = #.screen
	.assert	cpu.B = 3
		lda	#3
		ldb	#0
		bsr	getpixel
	.assert cpu.X = #.screen
	.assert	cpu.B = 3
.screen		fcb	%11_11_11_11	; our four pixels

More questions: should I be able to trace non-test code? Probably, as that could help with debugging issues. Also, the function being tested is calling another function which just happens to be a forward reference, which tells me that calling the tests should happen on pass two of the assembler. And that brings up further questions—what about code like this?

GIVABF		equ	$B4F4

		org	$7000
checksum	jsr	INTCNV		; get parameter from BASIC
		tfr	d,y		; it should point to a string variable
		ldx	2,y		; get address
		lda	,y		; get length
		clrb			; clear checksum and Carry bit		
.sum		adcb	,x+		; add
		bne	.sum
		comb			; 1s compliment
		clra			; return 0-255 result
		jmp	GIVABF		; return result to BASIC

	.test	"checksum"
		ldd	#.tmpstr	; our "string"
		jsr	GIVABF		; give address to BASIC
		bsr	checksum
		jsr	INTCNV		; get our result from BASIC
	.assert	cpu.D = 139		; if I did my math right

.tmpstr		fcb	5
		fcb	0
		fdb	.text
		fcb	0
.text		fcc	/HELLO/

The two routines INTCNV and GIVABF are ROM routines (from the Color Computer BASIC system) so we don't have the code for the emulator, and therefore, this code can't be tested as is. I suppose it could be rewritten such that it can be tested (and use more memory, which could be an issue) but this does show the limitation of this technique.

I suppose one fix would be conditional assembly:

.value		fdb	0
INTCNV		ldd	.value
GIVABF		std	INTCNV.value
GIVABF		equ	$B4F4

but personally, I'm not a fan of conditional code, but I shouldn't discount this as a solution.

Another issue is labels. I've been using local labels for the testing code, thinking that there would be a unique non-local label for each test (generated by the assembler) to avoid naming conflicts (naming is hard). I need to think on how I want to handle this.

It's an interesting idea though …

Yes, that is a hard DNS problem

… Instead of starting with a new zone and copying over some necessary entries from the old zone (what I would have done), someone had simply(?) aliased the new zone over to the old one. Then, when it eventually became necessary to change the new zone (these things take time, and memories can become lost, like rings at the bottom of a river) the records would not take as the whole zone was still aliased to the old one.

I cannot reproduce the (reported) issue with nsd, as nsd fails the zone with a "DNAME at has data below it" error. However, they were not using nsd; probably their name server allowed a mix of DNAME and thus shadowed-by-the-alias records …

The Long Tail of DNS Record Types

As I last wrote, “I can see it either being something very trivial and I'll kick myself for not seeing, or it's something that I've not had experience dealing with at all,” and it does appear to be something I've not had experience dealing with at all.

The DNAME RR is to delegate name resolution to another server, mainly for address-to-name mappings, but also for aliases. I recall doing a form of name delegation using a non-kosher method back in the late 1990s and early 2000s (back when I was wearing a “sysadmin” hat) involving NS RRs but not with DNAME. DNAME didn't exist when I started with delegations, thus, no experience with it.

And yes, that would be a hard DNS problem if you never encountered it before.

Tuesday, November 28, 2023

Still a hard DNS problem

The zone file is entirely correct as far as syntax goes and was updated with the new record without error. The new record does not appear in queries about it, but does appear in the new zone file even on the secondary servers.

Re: A hard DNS problem

Ah, there's more information about the problem. I did mention that “[i]f the record does show up, then it's a propagation issue, maybe related to caching or TTL issues.” But to be fair, there could be a few other issues. I don't think it's an issue of the zone file was updated but the DNS servers weren't restarted—I don't get that from the wording, and there's a quick test for that anyway—check the serial number by requesting the SOA RR.

Another issue to check is what the root DNS servers think the authoritative DNS servers for the zone are. A quick check of whois could provide that information, or even a query outside the network for the NS RR for the domain. If they don't match the expected list of DNS servers, then either the domain expired, was transfered, or someone else in the organization updated the NS records for the domain.

But if the NS RRs are correct, and I can see the proper serial number from an SOA query from outside the network, but not the new record … I don't know. I might try to use a few different locations outside the network to do queries from, just to make sure it's not the DNS server I'm using for queries, but if they all exhibit the behavior … I doubt it'll be an unsupported RR type, perhaps something to do with DNSSEC? Which is beyond my paygrade …

I would like to know the actual issue is—I can see it either being something very trivial and I'll kick myself not not seeing, or it's something that I've not had experience dealing with at all.

Obligatory Picture

[The future's so bright, I gotta wear shades]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site:, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2023 by Sean Conner. All Rights Reserved.