The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, November 27, 2023

Unit testing on an 8-bit CPU

I've been using my assembler to write silly little programs for the Color Computer (simulated—easier than setting up my Color Computer, the first computer I owned as a kid). It took me entirely too long to locate a bug in a maze-drawing program I've been writing, and I want to do a deep dive on this.

The program is simple, it just draws a maze, following a few simple rules:

  1. pick a random direction (up, right, left or down);
  2. attempt to draw a maze segment (using color 1) in the given direction;
  3. if we're boxed in (no direction to go in)—
    1. if we're at the starting location, we're done;
    2. backtrack along a segment of the path we've drawn (using color 2);
    3. if we are no longer boxed in, go back to step 1;
    4. otherwise, go back to step 3.

Nothing hard about it, yet the program kept getting stuck.

[Image of a partially drawn maze]

It starts in the upper left, meanders a bit (in blue), backtracks a bit (in red) until it gets stuck. I would then stare at the code until blood formed on my brow, simplify the code if I could, and try again only to watch it get stuck, again.

The issue is that I have no debugger on this system. I don't have two screens upon which to show debugging output. I have no way of single-stepping though the code. I don't even have a log to go through. Debugging consists of running the code, then thinking real hard. And yes, it was a stupid mistake that took all of two lines of code to fix.

[Image of a fully drawn maze]

Now, the question I want to ask is—would I have saved time if I did “unit testing?” Not just testing, which I was doing all along, but the typical style of “unit testing” where you test a “unit” of code (getting back to that question again). Looking over the code (and that version has the bug—see if you can find it; you'll also need these two files), the most isolated “unit” is random (the last subroutine in the code).

;***********************************************************************
;	RANDOM		Generate a random number
;Entry:	none
;Exit:	B - random number (1 - 255)
;***********************************************************************

random		ldb	lfsr
		andb	#1
		negb
		andb	#$B4
		stb	,-s		; lsb = -(lfsr & 1) & taps
		ldb	lfsr
		lsrb			; lfsr >>= 1
		eorb	,s+		; lfsr ^=  lsb
		stb	lfsr
		rts

It implements a linear-feedback shift register with a period of 255 (in that it will return each value in the range of 1‥255 once in some randomesque order before repeating, which is Good Enough™ for this code). So what if we want to test that the code is correct?

And it's here I came to a realization—“unit testing” really depends upon the language and tooling around it. Modern orthodoxy holds “unit testing über alles” and therefore, modern languages and tooling is created to support “unit testing über alles” (and woe betide those who question it). I think my struggle with “unit testing” is that the environments I find myself in don't lend themselves very well to “unit testing.” Even when I worked at The Enterprise, we were using C (C99 at best) and C++ (C++98, maybe C++03?) which take a lot of work upfront to support “unit testing” well, and there wasn't a lot of work upfront to support “unit testing” well, and there was a decidely lack of a “unit testing” framework. And here, there's definitely not a “unit testing” framework. Any “unit testing” I have to do involves writing yet more code. So let's write yet more code and test this sucker.

CHROUT		equ	$A002		; BASIC character output routine
lfsr		equ	$F6		; unused location in direct page

		org	$4000

start		ldx	#result_array	; point to memory
		clra			; storing 0 to memory
		clrb			; 256 bytes to clear

.clrarray	sta	,x+		; clear memory
		decb			; decrement count
		bne	.clrarray	; keep going until count = 0

		ldx	#result_array	; point to array
		lda	#255		; cycle length

checkrnd	bsr	random		; get random number in B
		tst	b,x		; have we seen this number?
		bne	.failed		; if so, we have failed
		inc	b,x		; set flag for this number
		deca			; are we done with the cycle
		bne	checkrnd	; if not, keep going

		ldx	#msg.success	; SUCCESS!
		bsr	puts
		rts

	;---------------------------------------------------
	; Store count (register A) and random # (register B) 
	; so we can use PEEK in BASIC to see where we failed
	;---------------------------------------------------

.failed		std	lfsr+1
		ldx	#msg.failed	; failed message
		bsr	puts		; display it
		rts			; return to BASIC

puts.10		jsr	[CHROUT]	; print character
puts		lda	,x+		; get character
		bne	puts.10		; if not NUL, print it
		rts			; return to BASIC

;******************************************************************

random		ldb	lfsr
		andb	#1
		negb
		andb	#$B4
		stb	,-s		; lsb = -(lfsr & 1) & taps
		ldb	lfsr
		lsrb			; lfsr >>= 1
		eorb	,s+		; lfsr ^=  lsb
		stb	lfsr
		rts

;*******************************************************************

msg.success	asciiz	'SUCCESS!\r'
msg.failed	asciiz	'FAILED!\r'
result_array	equ	*
		end	start

The tooling here doesn't support linking 6809 code, and I'd rather not have to keep each routine in its own file since the program is so simple and makes editing it easier if everything is in one file (no IDE here—and yes, I have thoughts about testing and IDEs but that's beyond the scope for this post). So I have to copy the routine to the test program.

This was something I kept trying to tell my manager at The Enterprise—the test program itself might be buggy (he personally treated the output as gospel—sigh). And the “unit testing” proponents seem to hem and haw about testing the testing code, implying that one does not simply test the testing code. But if programmers aren't trusted to write code and must test, then why are they trusted to write testing code without testing?

It might seem I digress, but I'm not. There are four bugs in the above test. The code I'm testing, random? It was fine. And I wasn't intending to write a buggy test harness, it just happened as I was writing it. Bug one—I forgot to test that random didn't return 0 (that's a degenerate case with LFSRs). Second bug—I forgot to ininitialize the LFSR state with a non-zero value, so random would return nothing but zero. The third bug was thinking I had used the wrong condition when branching to the failure case, but no, I had it right the first time (the fact that I changed it, and then changed it back, is the bug). The actual bug that caused this was the fourth bug, but I have to digress a bit to explain it.

The 6809 has an extensive indexing addressing mode for an 8-bit CPU. One of the modes allow one to use an accumulator register (A, B or D) as an offset to the index register. I used the B register, which contains the random number, as an offset into a 256-element array to track the return values, thus the use of b,x in the code above. What I forgot about in the moment of writing the code is that the “accumulator,index-register” indexing mode sign extends the accumulator. And the first value from random is, due to the LFSR I'm using, if treated as signed, a negative value—it would fail on the very first attempt.

Sigh.

This is why I panicked and thought I botched the conditional branch.

Now, all of that just to test the most isolated of subroutines in the program.

But had I continued, would any form of “unit testing” been beneficial? There's the subroutine point_addr—which converts an X,Y position into a byte address in the frame buffer, and the pixel in said byte. I could have done an exhaustive test of all 4,096 points, again, that's code I would have write (in 6809 Assembly code) and unfortunately, test, to have any confidence in it. And working up the chain, there's getpixel and setpixel. Testing those would require a bit of thought—let's see … getpixel returns the color of the pixel at the given X,Y location on the screen. and assuming point_addr is working, it would only take four tests (one per pixel in the byte) but at this point, would I even trust myself to write the test code?

In fact, would “unit testing” have saved me any time?

Given that I would have to write the testing framework, no, I don't think I would have saved time. Perhaps if I thought the issue through before diving into changing the code, I would have solved this earlier.

And the clues were there. I did discover pretty early on that the bug was in the backtracking code. The top level code is pretty easy to visually inspect:

backtrack	lda	#BACKTRACK
		sta	color

.loop		ldd	xpos		; check to see if we're back
		cmpd	xstart		; at the starting point,
		beq	done		; and if so, we're done

		ldd	xpos		; can we backtrack NORTH?
		decb
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.check_east
		lbsr	move_north.now	; if so, move NORTH and see if
		bra	.probe		; we have to keep backtracking

.check_east	ldd	xpos		; east ...
		inca
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.check_west
		lbsr	move_east.now
		bra	.probe

.check_west	ldd	xpos		; yada yada ...
		deca
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.check_south
		lbsr	move_west.now
		bra	.probe

.check_south	ldd	xpos
		incb
		lbsr	getpixel
		cmpb	#EXPLORE
		bne	.probe
		lbsr	move_south.now

.probe		bsr	boxed_in	; can we stop backtracking?
		bne	explore		; if so, go back to exploring
		bra	.loop		; else backtrack some more

The thing to keep in mind here is that the D register is a 16-bit register where the upper 8-bits is the A register, and the lower 8-bits are the B register, and that the 6809 is big-endian. So when we do ldd xpos we are loading the A register with the X coordinate, and B with the Y coordinate. And the move_*.now subroutines work, or else we wouldn't get the starting maze segments at all. So it's clear that setpixel works fine.

The code is getting stuck trying to backtrack along already drawn segments, and it does that by calling getpixel, therefore, it seems prudent to check getpixel. And sure enough, that is where the bug resides.

;*************************************************************************
;	GETPIXEL	Get the color of a given pixel
;Entry:	A - x pos
;	B - y pos
;Exit:	X - video address
;	A - 0
;	B - color
;*************************************************************************

getpixel	bsr	point_addr	; get video address
		comb			; reverse mask (since we're reading
		stb	,-s		; the screen, not writing it)
		ldb	,x		; get video data
		andb	,s+		; mask off the pixel
.rotate		lsrb			; shift color bits
		deca
		bne	.rotate
.done		rts			; return color in B

;*************************************************************************
;	POINT_ADDR		calculate the address of a pixel
;Entry:	A - xpos
;	B - ypos
;Exit:	X - video address
;	A - shift value
;	B - mask
;*************************************************************************


point_addr.bits	fcb	%00111111,%11001111,%11110011,%11111100 ; masks
		fcb	6,4,2,0	; bit shift counts

I've included a bit of point_addr to give some context. point_addr returns the number of shifts required to move the color value into place, and one of those shift values is 0. But getpixel doesn't check to see it's 0 before decrementing it. And thus, getpixel will return the wrong value for the last pixel in any byte. The fix is simple:

		tsta
		beq	.done

just before the .rotate label fixes the bug (two instructions, three bytes and here's a link to the fixed code).

Yes, I freely admit that a “unit test“ of this subroutine would have shown the bug. But how much time would I have spent writing the test code to begin with? The only reason it took me as long as it did to find was because the reference code I was using was quite convoluted, and I spent time simplifying the code as I went along (which is worthy of doing anyway).

What I wish the “unit testing” proponents would realize is that easy testing depends upon the language and tooling involved in the project, and what a “unit” is truly depends upon the language. I suspect that “unit test” proponents also find “unit testing” easier to deal with than “integration testing” or even “end-to-end testing,” thus why we get “unit tests über alles” shouted from the roof tops.


Discussions about this entry


A hard DNS problem

While I'm in a testing mood, I came across this post:

The job was for a position involving the day-to-day operation of DNS servers and firewalls involving DNS and client systems involving DHCP and DNS systems, and was a senior role so the ideal candidate would probably be able to say a little more than "magic" as to how DNS works.

A hard DNS problem: the owner of the zonefile added a new record that you requested. The serial number of the zone did increment (and there is no funny invalid wrap-around of the serial number going on (bonus points for knowing that)). The new record is not visible. What went wrong? This is probably too hard a question for a job interview, though you might explain all this and then ask how they would go about debugging the problem.

Magic

The first paragraph sets up the context, and at the end, presents a DNS problem. I worked with DNS before, and this doesn't seem that hard a question.

So, I've noticed an issue with a record I wanted added to a zone file, say a TXT RR for foo.example.com that reads “I have a red pencil.” I'm also assuming I've done a check from an outside network and didn't see the record, and that looking up the SOA RR (also from an outside network) showed the new serial number. My first step would be to query the authoritative name servers (typically two to four, could be more) and see if the record is there. If the record does show up, then it's a propagation issue, maybe related to caching or TTL issues. If not, then my guess is that the owner of the zonefile messed up adding the record, possibly by adding a spurious “.” to the end of the domain name—the owner effectively added

foo.	IN	TXT	"I have a red pencil."

This might not even show up as an error in any log files, but this is as-if the record was never added. But if the record was indeed added correctly, as in:

foo	IN	TXT	"I have a red pencil."

Then the next place to look is at how the change was made—did the owner update the record in some editor that then failed to properly update DNS? That would be the next place to look, and the results from that should indicate where to look next.

Why yes, I have done my fair share of DNS troubleshooting … why do you ask?

Update on Tuesday, November 28th, 2023

Some more information came back.

Obligatory Picture

[The future's so bright, I gotta wear shades]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.