Friday, November 24, 2023
A Motorola 6809 assembler—there are many like it, but this is mine
I think it's time I start talking about some of the software I write, and I might as well start with my latest project that I've been having way too much fun writing, a 6809 assembler written in C.
Yes, I could use an existing 6809 assembler, but most of the ones available as source seem to be based off one written in 1993 by L. C. Benschop. And the code quality there is … of its time … which I think is the most charitable thing I can say about it. Here's the code to convert text to a decimal number:
    short scandecimal()
    {
      char c;
      short t=0;
      c=*srcptr++;
      while(isdigit(c)) {
        t=t*10+c-'0';
        c=*srcptr++;
      }
      srcptr--;
      return t;
    }
Lots of globals, lots of “magic” numbers (at least they're described in comments), and vwl mprd variable names. It's not a pleasant code base to work in.
Besides, it's something I've been wanting to do since college. So why not?
So I have a standard two-pass assembler with a few features I haven't seen in other 6809 assemblers.
And that's what I'll be describing here.
The first feature is small but decidedly nice: the ability to have underscores (“_”) in numeric literals. It's more useful for binary literals, such as %10_00_01_11 or %000_01001_0_100_0010, but it can be used for decimal, octal or hexadecimal numbers as well.
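To make the idea concrete, here's a minimal sketch in C of parsing such a literal. It is not the assembler's actual code: parse_literal and digit_value are made-up names, the caller is assumed to have already consumed any base prefix (such as % or $) and to pass the base in, and the result is assumed to be limited to 16 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: map a character to its digit value, or -1. */
static int digit_value(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/*
 * Parse digits in the given base, skipping '_' separators.
 * Stops at the first character that is neither a valid digit nor '_'
 * and, if end is not NULL, stores a pointer to that character there.
 * Returns false if no digits were seen or the value exceeds 16 bits.
 * (Assumption: text starts just after any base prefix.)
 */
bool parse_literal(const char *text, int base, unsigned *value, const char **end)
{
  unsigned long acc    = 0;
  bool          digits = false;

  assert(text != NULL && value != NULL);
  assert(base >= 2 && base <= 16);

  for ( ; *text != '\0' ; text++)
  {
    if (*text == '_')
      continue;                 /* separators carry no value */

    int d = digit_value(*text);
    if (d < 0 || d >= base)
      break;                    /* end of the literal */

    acc = acc * (unsigned long)base + (unsigned long)d;
    if (acc > 0xFFFFul)
      return false;             /* too big for a 16-bit value */
    digits = true;
  }

  if (end != NULL)
    *end = text;
  *value = (unsigned)acc;
  return digits;
}
```

With this, the digits of %10_00_01_11 parse to $87 in base 2. The sketch is deliberately lenient: it accepts leading, trailing and doubled underscores, which a real assembler might prefer to reject.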
Another simple feature is the ability to generate a dependency list for make. Since I support the inclusion of multiple assembly files, it makes sense to support this feature as well. I'm not trying to make an assembler that runs on a 6809 system (I think it's way too small a system for that), but an assembler that makes it nice to write code for a 6809 system.
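For illustration, a dependency list boils down to a make rule that names the output file as the target and every source file read (the main file plus anything it includes) as prerequisites. The sketch below formats such a rule into a buffer. The function name format_make_deps, the file names and the exact output layout are all assumptions for the example, not the assembler's real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a make rule "target: dep1 dep2 ...\n" into buf.
 * Returns the number of characters written (not counting the NUL),
 * or -1 if buf is too small.
 * (Hypothetical helper; file names containing spaces are not escaped.)
 */
int format_make_deps(char *buf, size_t size, const char *target,
                     const char *const *deps, size_t count)
{
  size_t used;
  int    n;

  assert(buf != NULL && target != NULL);
  assert(deps != NULL || count == 0);

  n = snprintf(buf, size, "%s:", target);
  if (n < 0 || (size_t)n >= size)
    return -1;
  used = (size_t)n;

  for (size_t i = 0 ; i < count ; i++)
  {
    n = snprintf(buf + used, size - used, " %s", deps[i]);
    if (n < 0 || (size_t)n >= size - used)
      return -1;
    used += (size_t)n;
  }

  n = snprintf(buf + used, size - used, "\n");
  if (n < 0 || (size_t)n >= size - used)
    return -1;
  return (int)(used + (size_t)n);
}
```

Given a target of main.bin and the files main.asm, sound.asm and video.asm, the buffer ends up holding the single line "main.bin: main.asm sound.asm video.asm", which a makefile can pull in with an include directive.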
I also have local labels that work similarly to NASM. As an example:
    clear_bytes     clra
    .loop           sta     ,x+
                    decb
                    bne     .loop
                    rts

    clear_words     stb     ,-s
                    clra
                    clrb
    .loop           std     ,x++
                    dec     ,s
                    bne     .loop
                    rts
Internally, the assembler will merge the local labels with the previous non-local label, and thus, we get the labels clear_bytes, clear_bytes.loop, clear_words and clear_words.loop. I find it makes for cleaner code.
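The merging step could look something like the following sketch. The function name qualify_label and its interface are hypothetical; what comes from the description is only that a label starting with a period is appended to the most recent non-local label. MAX_LABEL is set to 63 to match the assembler's label limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_LABEL 63

/*
 * Build the full name of a label. A label starting with '.' is local
 * and is appended to the most recent non-local label (global); any
 * other label is used as-is. The result is cut off at MAX_LABEL
 * characters. Returns true if the name fit, false if it was truncated.
 * (Hypothetical helper; pass "" as global if no non-local label has
 * been seen yet.)
 */
bool qualify_label(char out[MAX_LABEL + 1], const char *global, const char *label)
{
  int n;

  assert(out != NULL && global != NULL && label != NULL);

  if (label[0] == '.')
    n = snprintf(out, MAX_LABEL + 1, "%s%s", global, label);
  else
    n = snprintf(out, MAX_LABEL + 1, "%s", label);

  /* snprintf returns the length the full name would have had */
  return n >= 0 && n <= MAX_LABEL;
}
```

Calling it with clear_bytes and .loop yields clear_bytes.loop, and a false return is the point where a caller could report that the label was too long.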
Which is easier to understand, this?
    ;********************************************************************
    ;       Music Synthesizer
    ;Entry: $3FF0   Freq delay count
    ;       $3FF1   Envelope table address
    ;       $3FF3   Envelope delay count
    ;       $3FF5   Volume, 1 to 255
    ;       NOTE: from _TRS_80 Color Computer Assembly Language Programming_,
    ;               page 252
    ;********************************************************************

                    org     $3F00

    mussyn          lda     $FF01           ; select sound out
                    anda    #$F7            ; reset MUX bit
                    sta     $FF01
                    lda     $FF03           ; select sound out
                    anda    #$F7            ; reset MUX bit
                    sta     $FF03
                    lda     $FF23           ; get PIA
                    ora     #8              ; set 6-bit sound enable
                    sta     $FF23
                    ldu     #$3FF0          ; point to block
                    ldx     1,u             ; get envelope address
                    stx     envptr          ; save in envptr
                    ldx     3,u             ; get envelope delay
    mus005          lda     [envptr]        ; get value
                    beq     mus090          ; if 0, done
                    ldb     5,u             ; get volume
                    mul                     ; adjust volume
                    anda    #$FC            ; reset RS-232-C (?)
                    sta     $FF20           ; set on
                    ldb     ,u              ; get frequency delay count
    mus010          leax    -1,x            ; decrement envelope count
                    bne     mus020          ; go if not 0
                    ldy     envptr          ; increment envelope ptr
                    leay    1,y
                    sty     envptr
                    ldx     3,u             ; get envelope delay
    mus020          decb                    ; decrement frequency count
                    bne     mus010          ; go if not 0
                    lda     [envptr]        ; DUMMY
                    brn     *+2             ; DUMMY
                    ldb     5,u             ; DUMMY
                    mul                     ; DUMMY
                    clr     $FF20           ; set off
                    ldb     ,u              ; get frequency delay
    mus030          leax    -1,x            ; decrement envelope count
                    bne     mus040          ; go if not 0
                    ldy     envptr          ; increment envelope ptr
                    leay    1,y
                    sty     envptr
                    ldx     3,u             ; get envelope delay
    mus040          decb                    ; decrement frequency count
                    bne     mus030          ; go if not 0
                    bra     mus005          ; keep on playing
    mus090          rts

    envptr          fdb     0

                    end     mussyn
Or this?
    ;********************************************************************
    ;       Music Synthesizer
    ;Entry: $3FF0   Freq delay count
    ;       $3FF1   Envelope table address
    ;       $3FF3   Envelope delay count
    ;       $3FF5   Volume, 1 to 255
    ;       NOTE: from _TRS_80 Color Computer Assembly Language Programming_,
    ;               page 252
    ;********************************************************************

                    org     $3F00

    mussyn          lda     $FF01           ; select sound out
                    anda    #$F7            ; reset MUX bit
                    sta     $FF01
                    lda     $FF03           ; select sound out
                    anda    #$F7            ; reset MUX bit
                    sta     $FF03
                    lda     $FF23           ; get PIA
                    ora     #8              ; set 6-bit sound enable
                    sta     $FF23
                    ldu     #$3FF0          ; point to block
                    ldx     1,u             ; get envelope address
                    stx     .envptr         ; save in envptr
                    ldx     3,u             ; get envelope delay
    .next_byte      lda     [.envptr]       ; get value
                    beq     .exit           ; if 0, done
                    ldb     5,u             ; get volume
                    mul                     ; adjust volume
                    anda    #$FC            ; reset RS-232-C (?)
                    sta     $FF20           ; set on
                    ldb     ,u              ; get frequency delay count
    .sound_on       leax    -1,x            ; decrement envelope count
                    bne     .check_freq_on  ; go if not 0
                    ldy     .envptr         ; increment envelope ptr
                    leay    1,y
                    sty     .envptr
                    ldx     3,u             ; get envelope delay
    .check_freq_on  decb                    ; decrement frequency count
                    bne     .sound_on       ; go if not 0
                    lda     [.envptr]       ; DUMMY
                    brn     *+2             ; DUMMY
                    ldb     5,u             ; DUMMY
                    mul                     ; DUMMY
                    clr     $FF20           ; set off
                    ldb     ,u              ; get frequency delay
    .sound_off      leax    -1,x            ; decrement envelope count
                    bne     .check_freq_off ; go if not 0
                    ldy     .envptr         ; increment envelope ptr
                    leay    1,y
                    sty     .envptr
                    ldx     3,u             ; get envelope delay
    .check_freq_off decb                    ; decrement frequency count
                    bne     .sound_off      ; go if not 0
                    bra     .next_byte      ; keep on playing
    .exit           rts

    .envptr         fdb     0

                    end     mussyn
It helps that I allow 63 characters for a label, which is way more than any 6809 assembler I've ever used.
The last feature is the set of warnings. Given the following code:
    .start          lda     <<b16,x
                    ldb     #$FF12
                    std     foobar
                    lda     b5,u
                    ldb     b8,s
                    tfr     a,x
                    lbsr    a_really_long_label_that_exceeds_the_internal_limit_its_quite_long

                    sta     [<<b5,y]
                    bra     another_long_label_that_is_good

    a_really_long_label_that_exceeds_the_internal_limit_its_quite_long
                    rts

    another_long_label_that_is_good
                    clra
    .but_this_makes_it_too_long_to_use
                    decb
                    bne     .but_this_makes_it_too_long_to_use

                    bra     next8
    next8           lbra    next16
    next16          brn     next8b
    next8b          lbrn    next16b
    next16b         rts

    foobar          equ     $20
    b16             equ     $8080
    b5              equ     3
    b8              equ     25
The assembler will generate the following warnings (yes, this code is used to test all the warnings in the assembler):
    warn.asm:1: warning: W0010: missing initial label
    warn.asm:6: warning: W0008: ext/tfr mixed sized registers
    warn.asm:7: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
    warn.asm:12: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
    warn.asm:17: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
    warn.asm:19: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
    warn.asm:1: warning: W0003: 16-bit value truncated to 5 bits
    warn.asm:2: warning: W0004: 16-bit value truncated to 8 bits
    warn.asm:3: warning: W0005: address could be 8-bits, maybe use '<'?
    warn.asm:4: warning: W0006: offset could be 5-bits, maybe use '<<'?
    warn.asm:5: warning: W0007: offset could be 8-bits, maybe use '<'?
    warn.asm:7: warning: W0009: offset could be 8-bits, maybe use short branch?
    warn.asm:9: warning: W0011: 5-bit offset upped to 8 bits for indirect mode
    warn.asm:21: warning: W0012: branch to next location, maybe remove?
    warn.asm:22: warning: W0012: branch to next location, maybe remove?
    warn.asm:1: warning: W0002: symbol '.start' defined but not used
So, in order of appearance:
W0010
- What happens if you give a local label sans a non-local label? Well, I decided to allow it, but at least warn about it. The resulting label is just .start, but it could be hard to reference. I could see making this an error, but for now, it's just a warning.
W0008
- This is the only warning about undefined behavior. The 6809 doesn't specify what happens when you transfer (or exchange) an 8-bit register with a 16-bit register (or vice versa). The CPU just keeps running, but the results are just that—undefined. Again, this could be an error, but for now, I'm letting it slide as a warning.
W0001
- Internally, the assembler just truncates labels to 63 characters, but otherwise, it just keeps going.
W0003
- This is related to the nature of a two-pass assembler and forward references. Here, I'm forcing the given index to a 5-bit offset (which doesn't take an additional byte of space, unlike an 8-bit offset (one additional byte) or a 16-bit offset (two additional bytes)), but the assembler has to assume it's okay on pass one. By the time pass two comes around, b16 is defined, but its value exceeds 5 bits (a range of -16 to 15, for the record). This warning just lets the user know the value doesn't fit into 5 bits.
W0004
- Pretty much the same as W0003, except for an 8-bit value.
W0005
- Again, due to the nature of a two-pass assembler. This time, no hint is given as to the size of the label, and on pass one, the assembler assumes the worst: a 16-bit value. Only on pass two does it have enough information to know it could be an 8-bit address, but by then it can't switch to an 8-bit address, as that would throw all the other addresses off (ask me how I know).
W0006
- Similar to W0005, but for an offset that can fit in 5 bits.
W0007
- Similar to W0006, but for an 8-bit value.
W0009
- This time, the assembler has determined that the target falls within range of an 8-bit relative branch, but was given a 16-bit relative branch instruction. This can happen when refactoring shrinks the distance between the branch instruction and its target.
W0011
- One of the features of the 6809 is its support of indirect indexing. Instead of the indexed address holding the data directly, it holds the address of the data (in C parlance, LDA ,X is A = *X and LDA [,X] is A = **X). The 6809 doesn't support this mode for 5-bit offsets, but it does for 8-bit and 16-bit offsets. This is just a warning that you can't use a 5-bit offset for this. I'm on the fence about keeping or removing this, and I'm keeping it for now.
W0012
- This detects when you branch to the following instruction, except if the instruction is BRN, which is “branch never” (or the long branch version, LBRN). The 6809 is unusual among 8-bit CPUs in having such an instruction. And despite its apparent uselessness (why would you have a branch that is never taken?), it is useful for padding out timing loops when talking to hardware.
W0002
- The label wasn't referenced by any other code. And if the label is not referenced, why have the label in the first place? It could also mean an unused variable whose removal could save some space.
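The size-related warnings for indexed offsets (W0003, W0006 and W0007) all come down to comparing the size picked on pass one against the size the value turns out to need on pass two. Here's a rough sketch of that comparison. The function and enum names are invented for illustration, and this is not the assembler's actual code. The 5-bit range of -16 to 15 comes from the description of W0003; the 8-bit range is assumed to be the usual signed -128 to 127.

```c
#include <assert.h>

/* Smallest indexed-offset size, in bits, that can hold the value. */
int min_offset_bits(long value)
{
  assert(value >= -32768 && value <= 65535);
  if (value >= -16  && value <= 15)  return 5;
  if (value >= -128 && value <= 127) return 8;
  return 16;
}

enum size_check { SIZE_OK, SIZE_TRUNCATED, SIZE_COULD_SHRINK };

/*
 * Pass two: compare the size fixed during pass one (chosen_bits) with
 * the size the now-known value needs. The instruction can't change
 * size any more, so all that's left is to report the mismatch.
 */
enum size_check check_offset(int chosen_bits, long value)
{
  int needed = min_offset_bits(value);

  assert(chosen_bits == 5 || chosen_bits == 8 || chosen_bits == 16);

  if (needed > chosen_bits) return SIZE_TRUNCATED;    /* value won't fit   */
  if (needed < chosen_bits) return SIZE_COULD_SHRINK; /* bytes were wasted */
  return SIZE_OK;
}
```

So check_offset(5, 0x8080) reports truncation, as with b16 in the test file, while check_offset(16, 3) reports that a smaller offset would have done, as with b5.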
As you can see, most of the warnings are about code sequences that could be shorter, and I'm not aware of any assembler that gives such warnings. I could be wrong, but of the 6809 assemblers I've used, I haven't seen anything like this.
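The two branch warnings (W0009 and W0012) can likewise be expressed in terms of the offset a relative branch encodes, which on the 6809 is measured from the address just past the branch instruction. Below is a sketch under that assumption, with made-up function names. A real check would also skip BRN and LBRN for the branch-to-next test, as described for W0012.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Offset encoded in a relative branch: the distance from the address
 * just past the instruction to the target.
 *   pc     - address of the branch instruction
 *   size   - size of the branch instruction in bytes
 *   target - address being branched to
 */
long branch_offset(unsigned pc, unsigned size, unsigned target)
{
  assert(pc <= 0xFFFFu && target <= 0xFFFFu);
  return (long)target - ((long)pc + (long)size);
}

/* A long branch whose offset fits in a signed 8 bits could be a short one. */
bool could_be_short(long offset)
{
  return offset >= -128 && offset <= 127;
}

/* A branch with an offset of 0 just lands on the following instruction. */
bool branches_to_next(long offset)
{
  return offset == 0;
}
```

For example, a 3-byte LBRA at $1000 targeting $1003 has an offset of 0, so branches_to_next() is true, while the same instruction targeting $2000 is well outside the short-branch range.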
I also have a way to suppress a given warning (they're all enabled by default; I'm opinionated about this, and you're stuck with my opinion if you want to use this assembler).
So that's it for the unique features I have in my assembler. I don't expect many people to use this, but I don't care; I'm having fun developing it. And that's what counts.