The Boston Diaries

Wednesday, June 04, 2025

The basics of an indirect threaded code ANS Forth implementation

I need to get into the habit of writing prose more often.

Before I go into the implementation of my ANS Forth system, I need to define somethings. First, a bit about Forth terminology—a Forth “word” can be thought of as a function or subroutine of other languages, but it's a bit more than that—it can also refer to a variable, a constant, the equivalent of a key word. It's a fluid concept, but thinking of a “word” as a function won't be too far off. Also, a collection of Forth words is collected into a wordset (or if you are reading older documentation about Forth, a “dictionary”). You can think of a “dictionary” as a library, or maybe a module, of code.

Second, Forth is a stack-based language. There are rarely any explicit parameters, just data placed on a stack, known as the “data stack.” Because of this, expressions are written in Reverse Polish Notation (RPN)—data is specified first, then the operator.

Third, there's a second stack, the “return stack” that is pretty much what it sounds like—it records the return address when calling into a Forth word.

Forth, the langauge is ridiculously easy to parse—a token is just a collection of characters separated by space. So not only is my_variable_name a valid name for a Forth word, so is my-variable-name, #array, 23-skidoo and even 3.14 valid names for Forth words. Yes, that means it's easy to do stupid things like define the token “1” to be 2 (you know, for when 1 equals two for large values of 1). But this also makes Forth trivial to parse. In fact, a Forth interpreter is nothing more than:

begin
	name = parse_token()
	word = lookup(name)
	if (word)
		execute(word)
	else
	{
		if (valid_number(name,current_base))
			push_number(convert(name,current_base))
		else
			error()
	}
repeat

That's it. That's a Forth interpreter more or less. Compiling Forth isn't much harder:

begin
	name = parse_token()
	word = lookup(name)
	if (word)
	{
		if (immediate(word))
			execute(word)
		else
			compile_into_definition(word)
	}
	else
	{
		if (valid_number(name,current_base))
			add_code_to_push(convert(name,current_base))
		else
			error()
	}
repeat

A Forth word can be marked as “immediate,” which just means the word is executed during compilation mode rather than being compiled, and this is how the Forth compiler can be extended. A Forth word like IF or BEGIN is just another Forth word, albeit marked as “immediate” so it can do its job when compiling.

The details about switching between interpreting and compiling will be covered in a later post, but it's not a difficult as it may seem.

And one bit about ANS Forth in particular—the standard defines a collection of “wordsets,” most of which are optional in an implementation. The minimum required is the Core Word set. The rest, including the Core Extension words, are optional. I did not implement all the wordsets, but again, more on that later.

Anyway, my ANS Forth system is a classic indirect threaded code (ITC) implementation. As I mentioned, it's easy to implement but perhaps not the fastest of implementation styles. As this is on the MC6809, I used a typical-for-the-6809 register use for my Forth system:

Register	Type	Forth usage
`D`	16-bit accumulator	Free for use (top of stack is kept in memory)
`X`	16-bit index register	execution token of word being run, free for use
`Y`	16-bit index register	Forth IP register
`U`	16-bit index register/user stack	data stack pointer
`S`	16-bit index register/system stack	return stack pointer

I elected to keep the top of stack in memory and not in the D register. I'm not sure if this effects the speed any, but it was easier, implementation wise, to keep the top of stack always in memory.

The Forth words in my implementation are either primitives, that is, written in assembly language, or secondaries, that is, written in Forth. Here's the format of a primitive word, +:

forth_core_plus                 ;  n1|u2 n2|u2 -- n3|u3 )
                fdb     forth_core_star_slash_mod	; previous word in dictionary
                fdb     .xt - .name			; length of name
.name           fcc     "+"				; name
.xt             fdb     .body				; execution token of word
.body           ldd     ,u++				; body of word
                addd    ,u
                std     ,u
                ldx     ,y++
                jmp     [,x]

The first thing to notice is the label. Given how difficult naming is in Computer Science, I decided to use the canonical name of each Forth Word as defined by the standard. All the labels start with “forth_,” and are followed by the wordset (in this case, “core_”) followed by the actual Forth word. Yes, this makes for some long labels, but I don't have to think about naming, which is nice. The .xt label (which is a “local label”—this defines the full label of forth_core_plus.xt) defines the “execution token” (xt) which is defined as “a value that identifies the execution semantics of a definition.” Here, this is a pointer to a function that provides the execution semantics of the word and in this case, just points to the “body” of the Forth word, which adds the top two elements of the data stack.

Of particular note are the last two instructions:

	ldx	,y++
	jmp	[,x]

Get used to seeing this fragment, it's used a lot—this is used to execute the next word in the program. As stated above, the Y register is the Forth IP, and this loads the X register with the xt of the next word, then jumps to the code that handles this word. The [,X] bit informs the CPU that we are jumping through a function pointer and not directly to the code (if we did that, JMP ,X, that would turn this from “indirect threaded code” to “direct threaded code”—and yes, that is the only difference between an ITC and DTC implementation).

The overall format of a word is a link to the previous word in the dictionary (here to */MOD), followed by a 16-bit length, the text that makes up the word, followed by the xt and then the body of the definition. You might be wondering why I would use a full 16-bits for the length on an 8-bit CPU—wouldn't that waste space? Yes, it would, especially given that in this implementation, a Forth word is restricted to just 31 characters, but I need a way to mark some information about each word, like if it's an immediate word or not. And while an 8-bit length where the largest value would be 31 giving me three bits for flag values, I ended up needed a few more than just three. So 16 bits for the length gives me a potential of 11 bits to use as flags.

A word written in Forth will have a different xt and body, for example, the very next word in the dictionary:

forth_core_plus_store           ; ( n|u a-addr -- )
                fdb     forth_core_plus
                fdb     .xt - .name
.name           fcc     "+!"
.xt             fdb     forth_core_colon.runtime
        ;===============================
        ; : +!  DUP @ ROT + SWAP ! ;
        ;===============================
                fdb     forth_core_dupe.xt
                fdb     forth_core_fetch.xt
                fdb     forth_core_rote.xt
                fdb     forth_core_plus.xt
                fdb     forth_core_swap.xt
                fdb     forth_core_store.xt
                fdb     forth_core_exit.xt

The xt here points to the label forth_core_colon.runtime. This is usually called DOCOLON but I'm being explicit here—Forth words defined in Forth are created by the Forth word :, and this label, forth_core_colon.runtime, implements the runtime portion of said word. The body of this word is then an array of execution tokens of various Forth words comprising the definition. The last word of all Forth words defined this way end with a call to corth_core_exit.xt.

The : runtime function looks like:

forth_core_colon		; ( C: "<spaces>name" -- colon-sys ) E ( i*x -- j*x )
		fdb	forth_core_two_swap
		fdb	.xt - .name
.name		fcc	":"
.xt		fdb	.body
.body		...		; I'll get to this bit of the code in a later post

.runtime	pshs	y
		leay	2,x
		ldx	,y++
		jmp	[,x]

Here, the runtime will push the Forth IP (which is the Y register) onto the return stack, set the Y register to the body of the word being executed and that two instruction sequence that goes to the next word to execute.

And the function EXIT looks like:

forth_core_exit			; E ( -- ) ( R: nest-sys -- )
		fdb	forth_core_execute
		fdb	_NOINTERP :: .xt - .name
.name		fcc	"EXIT"
.xt		fdb	.body
.body		puls	y	; restore the Forth IP
		ldx	,y++	; and execute next word
		jmp	[,x]

The first thing to notice here is the _NOINTERP flag. EXIT is defined as having no interpretation semantics, so typing EXIT outside of a word being defined is meaningless. Yes, this flag is used, and it does generate an error, but that again, is a later post. I should also mention that the :: here is a special operator in my assembler. The left hand side defines the most significant byte of a 16-bit quantity, and the right hand side defines the least significant byte. It's short hand for _NOINTERP * 256 + (.xt - .name).

The second thing to notice is that the Forth IP (again, the Y register) is pulled from the return stack, and we yet again, run that two instruction sequence to run the next word.

And this is pretty much the entire execution engine of Forth.

In fact, I wrote this bit first, even before writing code to compile Forth (and yes, I hand-compiled all Forth code, so I had something to compare against when I eventually got to compiling).

So on the one hand, Forth is an easy language to implement and can be quite small. On the other hand, ANS Forth has some subtle semantics that make for some … interesting implementation details and isn't that easy to implement, as we shall see over the coming posts.

Wednesday, June 04, 2025

The basics of an indirect threaded code ANS Forth implementation

Discussions about this entry

Obligatory Picture

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

Obligatory AI Disclaimer