Tuesday, November 26, 2024
The definitive guide to writing assembly language subroutines for Color BASIC
There's nothing quite like documenting 40-year-old technology, but hey, retro-computing is now popular, so why not?
Anyway, since I've modified my assembler to make it easier to write assembly subroutines for Color BASIC, I've been doing a deep dive into the nuances of doing so. This post will cover the method for plain Color BASIC; Extended Color BASIC, which does things a bit differently, will be covered in another post.
The information in Getting Started With Color BASIC is a bit light. It covers how to use POKE to load the object code into memory, and how to define the address for use by the USR function by poking that address into memory, but that's it. It gives one sample program:
LOOP1   JSR     [POLCAT]        ;POLL FOR A KEY
        BEQ     LOOP1           ;IF NONE, RETRY
        CMPA    #10             ;CTRL KEY (DN ARW)?
        BNE     OUT             ;NO, SO EXIT
LOOP2   JSR     [POLCAT]        ;YES, SO GET NEXT KEY
        BEQ     LOOP2           ;IF NONE, RETRY
        CMPA    #65             ;IS IT A-Z?
        BLT     OUT             ;IF <A, EXIT
        SUBA    #64             ;CONVERT TO CTRL A/Z
OUT     TFR     A,B             ;GET RETURN BYTE READY
        CLRA                    ;ZERO MSB
        JMP     GIVABF          ;RETURN VALUE TO BASIC
POLCAT  EQU     40960
GIVABF  EQU     46324
It shows how to return a value to Color BASIC, but doesn't fully explain the BASIC call:
110 A = USR(0) 'CALL THE SUBROUTINE AND GIVE RESULT TO A
Why pass 0 to USR()? How does the subroutine get at it? There is no explanation.
The book TRS-80 Color Computer Assembly Language Programming goes into more depth, explaining how to retrieve the argument and even how to pass in a string and not just a numeric parameter (although it uses a function only available in Extended Color BASIC). Neither book goes into any real depth on how this all works.
I'm going into that depth.
First off, Color BASIC only supports two data types: numbers (floating point) and strings. Numbers are in the Microsoft BASIC floating point format, which is five bytes in length. Strings are stored in two parts. The first is a “string descriptor,” which is also five bytes (to keep it the same size as a number). Only three of those bytes are used: one byte for the length (0 to 255) and two bytes for a pointer to the second part, the actual contents of the string. This is done for a few reasons. First, the string data can live anywhere in memory, not just in the string pool used for dynamic strings. Second, the string pool is subject to garbage collection, which can change the location of string data. So while the descriptor doesn't change location, the pointer to the actual string contents might!
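To make the descriptor layout concrete, here is a small Python model (not CoCo code) that reads a string through a descriptor. It assumes the length sits at offset 0 and a big-endian pointer at offsets 2 and 3, which matches how the routines later in this post index the descriptor. The memory array, the addresses, and the function names are all made up for illustration.

```python
# Toy model of a Color BASIC string descriptor (illustrative only).
memory = bytearray(0x8000)               # pretend 32K of RAM

def make_descriptor(desc_addr, data_addr, text):
    """Store text at data_addr and build a 5-byte descriptor at desc_addr."""
    memory[data_addr:data_addr + len(text)] = text
    memory[desc_addr] = len(text)            # offset 0: length (0 to 255)
    memory[desc_addr + 1] = 0                # offset 1: unused
    memory[desc_addr + 2] = data_addr >> 8   # offset 2: pointer, high byte
    memory[desc_addr + 3] = data_addr & 0xFF # offset 3: pointer, low byte
    memory[desc_addr + 4] = 0                # offset 4: unused

def read_string(desc_addr):
    """Follow the descriptor to the actual string contents."""
    length = memory[desc_addr]
    pointer = (memory[desc_addr + 2] << 8) | memory[desc_addr + 3]
    return bytes(memory[pointer:pointer + length])

make_descriptor(0x2000, 0x7000, b"HELLO")
print(read_string(0x2000))               # b'HELLO'
```

Note that the descriptor and the string data live at unrelated addresses, which is what lets BASIC move the data without moving the descriptor.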
Now, when you call the assembly language subroutine via USR(), the BASIC variable FP0 (located at address $0050) contains the result of the expression given to USR(). This is a floating point value. You can use the function INTCVT (located at address $B3ED, which is mentioned in Getting Started With Extended Color BASIC) to convert this into a 16-bit integer in the D register. The CPU registers themselves have no defined value on entry. To return a 16-bit value, you can call GIVABF (located at address $B4F4) with the value in the D register. You can also call GIVBF (located at address $B4F3) with an 8-bit unsigned value in the B register (not documented by Tandy; more on this in a bit). Furthermore, no CPU registers need to be saved by the assembly language subroutine.
Putting this together, we can write a simple subroutine such as:
INTCVT    equ     $B3ED           ; put argument into D
GIVABF    equ     $B4F4           ; return D to BASIC

          org     $7F00
swapbyte  jsr     INTCVT          ; get argument
          exg     a,b             ; swap bytes
          jmp     GIVABF          ; return it to BASIC
          end
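As a rough model of what BASIC gets back from this routine, the Python sketch below mimics the three steps. It assumes INTCVT accepts whole numbers from -32768 to 32767 and that GIVABF treats D as a signed 16-bit value. Both are my understanding of the ROM rather than something stated above, and fractional arguments are not modeled at all.

```python
def intcvt(value):
    """Model of INTCVT: whole number -> 16-bit D register (assumed signed range)."""
    if not -32768 <= value <= 32767:
        raise ValueError("FC error: argument out of range")
    return value & 0xFFFF

def givabf(d):
    """Model of GIVABF: 16-bit D register -> signed number handed to BASIC."""
    return d - 0x10000 if d & 0x8000 else d

def swapbyte(value):
    d = intcvt(value)
    a, b = d >> 8, d & 0xFF      # D is A (high byte) followed by B (low byte)
    a, b = b, a                  # exg a,b
    return givabf((a << 8) | b)

print(swapbyte(1))        # 256
print(swapbyte(0x1234))   # 13330, which is $3412
print(swapbyte(128))      # -32768, because $8000 is read as signed
```

The last line is the one to watch for: under the signed assumption, any argument whose low byte is 128 or more comes back negative.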
Both Getting Started With Extended Color BASIC and TRS-80 Color Computer Assembly Language Programming mention passing strings to an assembly language subroutine, and both state you must pass in a pointer to the string descriptor with VARPTR (this function will return the address of both string and numeric variables). This isn't completely true. Color BASIC will call the generic expression parsing routine for the parameter to USR(), and this can be either a numeric expression or a string expression! In either case, the variable FP0 will contain the result of the expression, and the variable VALTYP (located at address $0006) will contain 0 for a numeric value, or 255 for a string value. In the case of a string value, the location FP0+2 will contain the address of the string descriptor. This means you can pass a string expression to USR():
FP0       equ     $0050
GIVBF     equ     $B4F3

          org     $7F00
checksum  ldx     FP0 + 2         ; get string descriptor
          lda     ,x              ; get length
          ldx     2,x             ; get pointer to data
          clrb                    ; clear checksum
          tsta                    ; empty string?
          beq     .done           ; yes, nothing to add (avoids a 256-pass loop)
.sum      addb    ,x+             ; add in next character
          deca                    ; decrement length
          bne     .sum            ; continue if more data
.done     comb                    ; complement the sum
          jmp     GIVBF           ; return checksum to BASIC
          end
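For checking the routine's output by hand, here is the same calculation as a Python sketch: an 8-bit sum of the characters, complemented. The function name is mine, and the empty string is simply defined to give 255 (the complement of a zero sum) in this model.

```python
def checksum(data: bytes) -> int:
    """8-bit sum of all bytes, then one's complement (addb ... comb)."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFF    # addb wraps at 8 bits
    return ~total & 0xFF                 # comb

print(checksum(b"A"))        # 190
print(checksum(b"HELLO"))    # 139
```

So a call like USR("HELLO") should hand 139 back to BASIC if the assembly version behaves the same way.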
Of course, this routine assumes a string was correctly passed in. If you do pass in a number to USR(), all you'll get is a nonsensical result. It would be nice to do some error checking, and you could do something like:
VALTYP    equ     $0006

checksum  tst     VALTYP          ; 0 = number, 255 = string
          beq     .error          ; a number was passed in
          ...
.error    ldd     #-1             ; return -1 to BASIC
          jmp     GIVABF
But there are two functions I found via the Unravelled Series (a collection of books that give a source listing of the BASIC ROM contents; this is also where I found GIVBF) that can help with error checking. They're not named in the Unravelled Series (they're just labeled with their memory address), but I've come to call them CHKNUM (located at address $B143), which ensures the given parameter is a number, and CHKSTR (located at address $B146), which ensures the given parameter is a string. If the check fails, the function does not return to the subroutine; instead, BASIC reports a TM (type mismatch) error and the program stops running.
So we can rewrite our checksum function as:
FP0       equ     $0050
CHKSTR    equ     $B146
GIVBF     equ     $B4F3

          org     $7F00
checksum  jsr     CHKSTR          ; check parameter is a string
          ldx     FP0 + 2         ; get string descriptor
          lda     ,x              ; get length
          ldx     2,x             ; point to string data
          clrb                    ; clear checksum
          tsta                    ; empty string?
          beq     .done           ; yes, nothing to add (avoids a 256-pass loop)
.sum      addb    ,x+             ; add in next character
          deca                    ; decrement length
          bne     .sum            ; continue if more data
.done     comb                    ; complement the sum
          jmp     GIVBF           ; return checksum to BASIC
          end
This is nice, but what if we want to return a new string? This isn't so straightforward in plain Color BASIC. Color BASIC expects a numeric result from USR(), and if we attempt to return a string, we get an error. So something like:
110 A$ = USR("SOME STRING")
is right out.
But not all is lost. We can modify the string descriptor. For example:
silly_example jsr     CHKSTR      ; just assume this is defined
              ldx     FP0 + 2     ; and FP0, but get string descriptor
              ldb     #.textlen   ; new string length
              stb     ,x          ; save it
              ldd     #.text      ; get new text
              std     2,x         ; point to it
              clrb                ; and return a value to BASIC
              jmp     GIVBF

.text         fcc     /HELLO, WORLD!/
.textlen      equ     * - .text
              end
So, calling this with:
110 X$="THIS IS A STRING"
120 PRINT X$
130 X=USR(X$)
140 PRINT X$
will result in:
THIS IS A STRING
HELLO, WORLD!
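Why the second PRINT shows different text can be seen in a toy Python model of the descriptor update. It only mirrors the two stores the routine performs (length at offset 0, pointer at offsets 2 and 3); the memory array and every address in it are invented for the example.

```python
memory = bytearray(0x8000)                   # pretend RAM, addresses are made up

def poke_bytes(addr, data):
    memory[addr:addr + len(data)] = data

def string_at(desc):
    """What PRINT would see by following the descriptor."""
    length = memory[desc]
    pointer = (memory[desc + 2] << 8) | memory[desc + 3]
    return bytes(memory[pointer:pointer + length])

def silly_example(desc, text_addr, text_len):
    memory[desc] = text_len                  # stb ,x
    memory[desc + 2] = text_addr >> 8        # std 2,x (high byte)
    memory[desc + 3] = text_addr & 0xFF      # std 2,x (low byte)
    return 0                                 # clrb / jmp GIVBF

X_DESC, OLD_TEXT, NEW_TEXT = 0x2000, 0x2600, 0x7F20
poke_bytes(OLD_TEXT, b"THIS IS A STRING")
poke_bytes(NEW_TEXT, b"HELLO, WORLD!")
poke_bytes(X_DESC, bytes([16, 0, OLD_TEXT >> 8, OLD_TEXT & 0xFF, 0]))

print(string_at(X_DESC))                     # b'THIS IS A STRING'
silly_example(X_DESC, NEW_TEXT, 13)
print(string_at(X_DESC))                     # b'HELLO, WORLD!'
```

The original characters are still sitting at their old address afterwards; only the descriptor changed.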
And again, that's fine. But what if you want to modify the passed-in string? You could set aside memory for this. For example, to ROT-13 a string:
rot13     jsr     CHKSTR          ; ensure a string
          ldy     FP0 + 2         ; get string descriptor
          ldb     ,y              ; get length
          beq     .done           ; empty string, nothing to do
          ldx     #buffer         ; tmp space
          ldu     2,y             ; get original string
          stx     2,y             ; save pointer to new string in descriptor
.loop     lda     ,u+             ; get character
          cmpa    #'A'            ; if < 'A', no processing
          blo     .out
          cmpa    #'Z'            ; if > 'Z', no processing
          bhi     .out
          adda    #13             ; ROT-13 the character
          cmpa    #'Z'
          bls     .out
          suba    #26
.out      sta     ,x+             ; save character in new string
          decb                    ; continue if more
          bne     .loop
.done     jmp     GIVBF           ; return 0 (B is zero here) to BASIC

buffer    rmb     255             ; maximum length of string
          end
But that will fail if you attempt to ROT-13 multiple strings at the same time, since they would all end up pointing at the one buffer. A better way is to call RSVPSTR (again, found in the Unravelled Series and given a name by me; it is located at address $B56D), which will reserve space from the dynamic string pool maintained by BASIC. It expects the amount of space in the B register, and if it returns (it can error out with an “OS” (out of string space) error), it returns the length in the B register and the address of the space in the X register.
So now our function looks like:
rot13     jsr     CHKSTR          ; ensure a string
          ldy     FP0 + 2         ; get string descriptor
          ldb     ,y              ; get length
          beq     .done           ; empty string, nothing to do
          jsr     RSVPSTR         ; reserve new string of said length
          ldu     2,y             ; get original string
          stx     2,y             ; save pointer to new string in descriptor
.loop     lda     ,u+             ; get character
          cmpa    #'A'            ; if < 'A', no processing
          blo     .out
          cmpa    #'Z'            ; if > 'Z', no processing
          bhi     .out
          adda    #13             ; ROT-13 the character
          cmpa    #'Z'
          bls     .out
          suba    #26
.out      sta     ,x+             ; save character in new string
          decb                    ; continue if more
          bne     .loop
.done     jmp     GIVBF           ; return 0 (B is zero here) to BASIC
          end
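The per-character logic in the loop is easy to check off the machine. This Python sketch follows the same comparisons as the assembly (uppercase A to Z only, everything else copied unchanged); the function name is just for illustration.

```python
def rot13_upper(data: bytes) -> bytes:
    """ROT-13 uppercase ASCII letters, leaving all other bytes alone."""
    out = bytearray()
    for ch in data:
        if ord("A") <= ch <= ord("Z"):   # cmpa #'A' / blo, cmpa #'Z' / bhi
            ch += 13                     # adda #13
            if ch > ord("Z"):            # cmpa #'Z' / bls
                ch -= 26                 # suba #26
        out.append(ch)                   # sta ,x+
    return bytes(out)

print(rot13_upper(b"HELLO, WORLD!"))                 # b'URYYB, JBEYQ!'
print(rot13_upper(rot13_upper(b"HELLO, WORLD!")))    # b'HELLO, WORLD!'
```

Lowercase letters pass through untouched, which is worth remembering if the input string did not come from a keyboard in all-caps mode.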
The only downside is that you have to use a string variable when calling the routine. You could give it a string literal:
100 X=USR("THIS IS A STRING")
While that won't crash, you won't have access to the newly created string either. Just something to keep in mind.
One other thing to keep in mind: don't change the actual string data itself, for doing so will cause undefined results. For instance, if you call USR() with a string literal:
110 X=USR("HELLO, WORLD!")
The pointer in the string descriptor points directly into the source code! So you can change the contents of the descriptor, but not the string itself.
Also keep in mind that when you call USR() with a number, you don't have to convert it to an integer. You could call into some Color BASIC floating point routines if you know where they are. So, for example:
CHKNUM    equ     $B143
FMULx     equ     $BACA

          org     $7F00
twopi     jsr     CHKNUM          ; check for number input
          ldx     #.pi            ; point to the constant
          jmp     FMULx           ; multiply FP0 by it and return to BASIC
.pi       .float  3.14159265358979323846
          end
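The constant has to end up in memory in BASIC's five-byte format. As a rough illustration of what those bytes look like, here is a Python sketch that packs a number the way I understand the packed Microsoft format: an exponent byte biased by 128 (0 meaning the value zero), then four mantissa bytes whose always-set top bit is replaced by the sign. The layout details and the pack_float name are assumptions on my part, and whether an assembler's .float directive rounds the same way is not something checked here.

```python
import math

def pack_float(value):
    """Pack a number into 5 bytes: biased exponent + 32-bit mantissa with sign."""
    if value == 0:
        return bytes(5)                          # exponent byte 0 means zero
    mantissa, exponent = math.frexp(abs(value))  # 0.5 <= mantissa < 1
    bits = round(mantissa * (1 << 32))
    if bits == 1 << 32:                          # rounding carried out the top
        bits >>= 1
        exponent += 1
    if not 1 <= exponent + 128 <= 255:
        raise OverflowError("exponent does not fit in one byte")
    bits &= 0x7FFFFFFF                           # drop the implied top bit
    if value < 0:
        bits |= 0x80000000                       # and keep the sign there
    return bytes([exponent + 128]) + bits.to_bytes(4, "big")

print(pack_float(1.0).hex())        # 8100000000
print(pack_float(math.pi).hex())    # 82490fdaa2
print(pack_float(-10.0).hex())      # 84a0000000
```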
To aid in writing such code, I have written definitions for interfacing with Color BASIC and a file that points to floating point routines within BASIC. Note that these files assume you are using my assembler, but they should be easy to adapt to other assemblers.
And that's pretty much it for calling an assembly language subroutine in plain Color BASIC. You can pass in numbers or strings, but you can only return numbers. And if you want a new string, you have to pass in a string variable. You are also restricted to only one such function. A fair start, but things get easier with Extended Color BASIC.