I'm curious to test something and, odd as it may seem, the best way to do this (in my opinion) is to try DynASM, the dynamic assembler used by LuaJIT (it is part of LuaJIT, but can be used separately from it). The official documentation is somewhat lacking, so I've been following a tutorial (along with the tutorial source code) for my own little project.
I will not be re-covering that ground here (that, and The Unofficial DynASM Documentation, should be enough to get you through using it if you are interested) but I will give a brief overview and my impressions of it.
DynASM is used to generate code, specified as assembly, at runtime, not at compile time. As such, you give the code you want to compile in your program thusly:
    if (token.type == TOKEN_NUMBER)
      | mov ax,token.value
    else if (token.type == TOKEN_VARIABLE)
      | mov ax,[g_vars + token.value]
All this code does is generate different code depending on whether the given token is a number or a variable. The DynASM statements themselves start with a “|” (which can lead to issues if you aren't expecting it), and in this case each one is the actual assembly code we want (more assembly code can be specified, but it's limited to one assembly statement per line). Once we have written our program, the C code needs to be run through a preprocessor (the actual DynASM program itself, written in Lua), which replaces each statement with a C call that emits the corresponding machine code at runtime:
    if (token.type == TOKEN_NUMBER)
      //| mov ax,token.value
      dasm_put(Dst, 3, token.value);
    #line 273 "calc.dasc"
    else if (token.type == TOKEN_VARIABLE)
      //| mov ax,[g_vars + token.value]
      dasm_put(Dst, 7, g_vars + token.value);
The DynASM state data (Dst, in this case) can be specified with other DynASM directives in the code. It's rather configurable. You then link against the proper runtime code (there are versions for different CPUs) and add some boilerplate code (this is just an example of such code) and there you go.
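To get a feel for what those generated calls are doing at runtime, here is a toy emitter I sketched for illustration. It is not DynASM's implementation and uses none of its API: ToyState, toy_byte(), toy_compile() and friends are names I made up, the "state" is just a flat byte buffer, and the byte encodings assume 32-bit x86 mode. It only writes the bytes; it does not make them executable or run them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the DynASM state: a flat output buffer. */
typedef struct {
    uint8_t buf[64];
    size_t  len;
} ToyState;

/* Append one byte of machine code, refusing to overflow the buffer. */
static void toy_byte(ToyState *s, uint8_t b)
{
    assert(s->len < sizeof s->buf);
    s->buf[s->len++] = b;
}

/* mov ax, imm16  ->  66 B8 lo hi  (assumes 32-bit x86 mode) */
static void toy_mov_ax_imm16(ToyState *s, uint16_t value)
{
    toy_byte(s, 0x66);                    /* operand-size prefix     */
    toy_byte(s, 0xB8);                    /* mov (e)ax, immediate    */
    toy_byte(s, (uint8_t)(value & 0xFF)); /* low byte first          */
    toy_byte(s, (uint8_t)(value >> 8));
}

/* mov ax, [addr]  ->  66 A1 addr32  (assumes 32-bit x86 mode) */
static void toy_mov_ax_mem(ToyState *s, uint32_t addr)
{
    toy_byte(s, 0x66);                    /* operand-size prefix     */
    toy_byte(s, 0xA1);                    /* mov (e)ax, [absolute]   */
    for (int i = 0; i < 4; i++)           /* little-endian address   */
        toy_byte(s, (uint8_t)(addr >> (8 * i)));
}

typedef enum { TOKEN_NUMBER, TOKEN_VARIABLE } TokenType;
typedef struct { TokenType type; uint32_t value; } Token;

/* Same shape as the snippet above: pick the instruction per token.
   g_vars is passed in as a plain 32-bit address for this sketch. */
static void toy_compile(ToyState *s, Token token, uint32_t g_vars)
{
    if (token.type == TOKEN_NUMBER)
        toy_mov_ax_imm16(s, (uint16_t)token.value);
    else if (token.type == TOKEN_VARIABLE)
        toy_mov_ax_mem(s, g_vars + token.value);
}
```

Feeding toy_compile() a TOKEN_NUMBER token appends four bytes to the buffer and a TOKEN_VARIABLE token appends six. The real thing obviously does far more (labels, relocations, growing buffers), which is exactly why you'd rather let DynASM do it.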
It's an intriguing approach, and the ability to specify normal-looking assembly code is a definite plus. That you have to supply different code for different CPUs is … annoying but understandable (you can get around some of this with judicious use of macros and defines, but there's only so much you can hide when at one extreme you have a CPU with only eight registers and strict memory ordering, and at the other, CPUs with 32 registers and not-so-strict memory ordering). The other thing that really bites is the use of the “|” to denote DynASM statements. Yes, it can be worked around, but why couldn't Mike Pall (author of LuaJIT) have picked a symbol not used by C for this, like “@” or “$”? Unfortunately, it is what it is.
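For the curious, this is the sort of thing I mean about the “|”. Both functions below are perfectly valid C and return the same value, but as I understand it, the first style is the one to avoid in a file that goes through the DynASM preprocessor, since its continuation lines begin with “|” and would be taken for assembly. The flag names and function names are made up for this example.

```c
#include <assert.h>

enum { FLAG_READ = 1, FLAG_WRITE = 2, FLAG_EXEC = 4 };

/* Fine in plain C, risky in a DynASM source file: the two
   continuation lines start with '|' (after whitespace), so the
   preprocessor would treat them as assembly statements. */
static int flags_leading_pipe(void)
{
    return FLAG_READ
         | FLAG_WRITE
         | FLAG_EXEC;
}

/* The workaround: keep the operator at the end of the line so
   that no line of C ever begins with '|'. */
static int flags_trailing_pipe(void)
{
    return FLAG_READ |
           FLAG_WRITE |
           FLAG_EXEC;
}

/* Sanity check that the two spellings really are the same C. */
static void flags_check(void)
{
    assert(flags_leading_pipe() == flags_trailing_pipe());
}
```

It's a small thing, but it does mean reformatting any existing C that happens to wrap its bitwise ORs that way.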
Overall, it's rather fun to play with, and it was pretty easy to use, spotty documentation notwithstanding.