The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, March 22, 2013

Preloading Lua modules

I'm tasked with testing the call processing on “Project: Wolowizard.” M suggested, and I concurred, that using Lua to manage the testing scripts would be a Good Thing™. Easier to write and modify the tests as needed. So over the past few years I've written a number of modules to handle the files and protocols used in the project (one side effect: re-implementing the code to read/write the various data files helped to verify the specification and flush out architectural dependencies in the binary formats).

But one problem did exist: Not all the systems I need to run the test on have Lua installed, and LuaRocks has … um … “issues” on our Solaris boxes (otherwise, it's not that bad a package manager). So I decided to build what I call “Kitchen Sink Lua”—a Lua interpreter that has the 47 modules required to run the testing scripts (okay, eight of the modules are already built into Lua).

It took some time to wrangle, as some of the modules were written in Lua (so the source needed to be embedded) and I had to figure out how to integrate some third party modules (like LuaCURL) into the build system, but perhaps the hardest bit was to ensure the modules were initialized properly. My first attempt, while it worked (mostly by accident) wasn't technically correct (as I realized when I read this message on a mailing list).

I then restructured my code, which not only made it correct, but smaller and clearer.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>


typedef struct prelua_reg
{
  const char   *const name;
  const char   *const code;
  const size_t *const size;
} prelua_reg__t;


int	luaopen_org_conman_env		(lua_State *);
int	luaopen_org_conman_errno	(lua_State *);
int	luaopen_org_conman_fsys		(lua_State *);
int	luaopen_org_conman_math		(lua_State *);
int	luaopen_org_conman_syslog	(lua_State *);
int	luaopen_org_conman_hash		(lua_State *);
int	luaopen_org_conman_string_trim	(lua_State *);
int	luaopen_org_conman_string_wrap	(lua_State *);
int	luaopen_org_conman_string_remchar (lua_State *);
int	luaopen_org_conman_process	(lua_State *);
int	luaopen_org_conman_net		(lua_State *);
int	luaopen_org_conman_dns		(lua_State *);
int	luaopen_org_conman_sys		(lua_State *);
int	luaopen_org_conman_uuid		(lua_State *);
int	luaopen_lpeg			(lua_State *);
int	luaopen_LuaXML_lib		(lua_State *);
int	luaopen_cURL			(lua_State *);


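/*---------------------------------------------------------------
; Modules written in Lua.  The build system takes the Lua code,
; processes it through luac (the Lua compiler), then creates an
; object file which exports a character array containing the byte
; code, and a variable which gives the size of the bytecode array.
;---------------------------------------------------------------*/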

extern const char   c_org_conman_debug[];
extern const size_t c_org_conman_debug_size;
extern const char   c_org_conman_getopt[];
extern const size_t c_org_conman_getopt_size;
extern const char   c_org_conman_string[];
extern const size_t c_org_conman_string_size;
extern const char   c_org_conman_table[];
extern const size_t c_org_conman_table_size;
extern const char   c_org_conman_unix[];
extern const size_t c_org_conman_unix_size;
extern const char   c_re[];
extern const size_t c_re_size;
extern const char   c_LuaXml[];
extern const size_t c_LuaXml_size;

/*---------------------------------------------------------------
; Modules written in C.  We can use luaL_register() to load these
; into package.preload[]
;---------------------------------------------------------------*/

const luaL_Reg c_preload[] =
{
  { "org.conman.env"		, luaopen_org_conman_env		} ,
  { "org.conman.errno"		, luaopen_org_conman_errno		} ,
  { "org.conman.fsys"		, luaopen_org_conman_fsys		} ,
  { "org.conman.math"		, luaopen_org_conman_math		} ,
  { "org.conman.syslog"		, luaopen_org_conman_syslog		} ,
  { "org.conman.hash"		, luaopen_org_conman_hash		} ,
  { "org.conman.string.trim"	, luaopen_org_conman_string_trim	} ,
  { "org.conman.string.wrap"	, luaopen_org_conman_string_wrap	} ,
  { "org.conman.string.remchar"	, luaopen_org_conman_string_remchar	} ,
  { "org.conman.process"	, luaopen_org_conman_process		} ,
  { "org.conman.net"		, luaopen_org_conman_net		} ,
  { "org.conman.dns"		, luaopen_org_conman_dns		} ,
  { "org.conman.sys"		, luaopen_org_conman_sys		} ,
  { "org.conman.uuid"		, luaopen_org_conman_uuid		} ,
  { "lpeg"			, luaopen_lpeg				} ,
  { "LuaXML_lib"		, luaopen_LuaXML_lib			} ,
  { "cURL"			, luaopen_cURL				} ,
  { NULL			, NULL					}
};

/*---------------------------------------------------------------
; Modules written in Lua.  These need to be loaded and populated
; into package.preload[] by some code provided in this file.
;---------------------------------------------------------------*/

const prelua_reg__t c_luapreload[] =
{
  { "org.conman.debug"		, c_org_conman_debug	, &c_org_conman_debug_size	} ,
  { "org.conman.getopt"		, c_org_conman_getopt	, &c_org_conman_getopt_size	} ,
  { "org.conman.string"		, c_org_conman_string	, &c_org_conman_string_size	} ,
  { "org.conman.table"		, c_org_conman_table	, &c_org_conman_table_size	} ,
  { "org.conman.unix"		, c_org_conman_unix	, &c_org_conman_unix_size	} ,
  { "re"			, c_re			, &c_re_size			} ,
  { "LuaXml"			, c_LuaXml		, &c_LuaXml_size		} ,
  { NULL			, NULL			, NULL				}
};


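void preload_lua(lua_State *const L)
{
  assert(L != NULL);
  /*---------------------------------------------------------------
  ; Preload all the modules.  This does not initialize them,
  ; just makes them available for require().
  ; I'm doing it this way because of a recent email on the LuaJIT
  ; email list.
  ; Pre-loading these modules in package.preload[] means that they'll be
  ; initialized properly through the require() statement.
  ;---------------------------------------------------------------*/
  lua_getglobal(L,"package");
  lua_getfield(L,-1,"preload");
  luaL_register(L,NULL,c_preload);	/* C modules go straight into package.preload */
  for (size_t i = 0 ; c_luapreload[i].name != NULL ; i++)
  {
    int rc = luaL_loadbuffer(L,c_luapreload[i].code,*c_luapreload[i].size,c_luapreload[i].name);
    if (rc != 0)
    {
      const char *err;
      switch(rc)
      {
        case LUA_ERRRUN:    err = "runtime error"; break;
        case LUA_ERRSYNTAX: err = "syntax error";  break;
        case LUA_ERRMEM:    err = "memory error";  break;
        case LUA_ERRERR:    err = "generic error"; break;
        case LUA_ERRFILE:   err = "file error";    break;
        default:            err = "unknown error"; break;
      }
      fprintf(stderr,"%s: %s\n",c_luapreload[i].name,err);
      exit(EXIT_FAILURE);
    }
    lua_setfield(L,-2,c_luapreload[i].name);	/* package.preload[name] = loaded chunk */
  }
  lua_pop(L,2);	/* drop package.preload and package */
}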


Yes, this is the code used in “Project: Wolowizard” (minus the proprietary modules) and is a good example of the module preload feature in Lua. The modules in C are easy to build (the following is from the Makefile):

obj/spc/process.o : $(LUASPC)/src/process.c     \
                $(LUA)/lua.h
        $(CC) $(CFLAGS) -I$(LUA) -c -o $@ $<

While the Lua-based modules are a bit more involved:

obj/spc/unix.o : $(LUASPC)/lua/unix.lua $(BIN2C) $(LUAC)
        $(LUAC) -o tmp/unix.out $<
        $(BIN2C) -o tmp/unix.c -t org_conman_unix tmp/unix.out
        $(CC) $(CFLAGS) -c -o $@ tmp/unix.c

These modules are compiled using luac (which outputs the Lua byte code used by the core Lua VM), then through a program that converts this output into a C file, which is then compiled into an object file that can be linked into the final Kitchen Sink Lua interpreter.
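To make the middle step concrete, here is a rough sketch of what a bin2c-style converter might do. This is not the project's actual BIN2C program; the function name `emit_c_array` is made up for illustration. The only things taken from the code above are the naming scheme (`c_<tag>[]` for the byte code and `c_<tag>_size` for its length) and the idea that the tag comes from the `-t` option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the BIN2C step: write `len` bytes of `data`
   to `out` as C source defining c_<tag>[] and c_<tag>_size.
   Returns 0 on success, -1 if a write failed. */
int emit_c_array(FILE *out,const char *tag,const unsigned char *data,size_t len)
{
  assert(out != NULL);
  assert(tag != NULL && strlen(tag) > 0);
  assert(data != NULL && len > 0);	/* an empty initializer isn't valid C */

  fprintf(out,"#include <stddef.h>\n\nconst char c_%s[] =\n{",tag);

  /* eight bytes per line; the cast keeps values above 127 from
     triggering warnings when char is signed */
  for (size_t i = 0 ; i < len ; i++)
    fprintf(out,"%s(char)0x%02X,",(i % 8 == 0) ? "\n  " : " ",(unsigned)data[i]);

  fprintf(out,"\n};\n\nconst size_t c_%s_size = %zu;\n",tag,len);
  return ferror(out) ? -1 : 0;
}
```

Feeding it the output of luac with the tag org_conman_unix would produce definitions matching the extern declarations of c_org_conman_unix and c_org_conman_unix_size shown earlier.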

Musings on the Current Work Project Du jour

So I have this Lua code that implements the cellphone end of a protocol used in “Project: Wolowizard.” I need to ramp up the load testing on this portion of the project so I'm looking at what I have and trying to figure out how to approach this project.

The protocol itself is rather simple—only a few messages are defined and the code is rather straightforward. It looks something like:

-- Pre-define these
state_receive = function(phone,socket) end
state_msg1    = function(phone,socket,remote,msg) end
state_msg2    = function(phone,socket,remote,msg) end

-- Now the code

state_receive = function(phone,socket)
  local remote,packet,err = socket:read()
  if err ~= 0 then
    syslog('err',string.format("error reading socket: %s",errno[err]))
    return state_receive(phone,socket)
  end

  local msg,err = sooperseekritprotocol.decode(packet)
  if err ~= 0 then
    syslog('err',string.format("error decoding: %s",decoderror(err)))
    return state_receive(phone,socket)
  end

  if msg.type == "MSG1" then
    return state_msg1(phone,socket,remote,msg)
  elseif msg.type == "MSG2" then
    return state_msg2(phone,socket,remote,msg)
  else
    syslog('warn',string.format("unknown message: %s",msg.type))
    return state_receive(phone,socket)
  end
end

state_msg1 = function(phone,socket,remote,msg)
  local reply = ... -- code to handle this msg
  local packet = sooperseekritprotocol.encode(reply)
  return state_receive(phone,socket)
end

state_msg2 = function(phone,socket,remote,msg)
  local reply = ... -- code to handle this msg
  local packet = sooperseekritprotocol.encode(reply)
  return state_receive(phone,socket)
end

Don't worry about this code blowing out the call stack—Lua optimizes tail calls and these effectively become GOTOs. I found this feature to be very useful in writing protocol handlers since (in my opinion) it makes the state machine rather explicit.

Now, to speed this up, I could translate this to C. As I wrote the Lua modules for The Kitchen Sink Lua interpreter, I pretty much followed a bi-level approach. I have a C interface (to be used by C code) which is then mimicked in Lua. This makes translating the Lua code into C more or less straightforward (with a bit more typing because of variable declarations and what not).

But here, I can't rely on the C compiler to optimize tail calls (GCC can, but only with certain options; I don't know about the Solaris C compiler). I could have the routines return the next function to call and use a loop:

while((statef = (*statef)(phone,sock,&remote,&msg)) != NULL)
  ; /* the whole state machine is run in the previous line */

But just try to define the type of statef so the compiler doesn't complain about a type mismatch. It needs to define a function that takes blah and returns a function that takes blah and returns a function that takes blah and returns a function that … It's one of those recursive type definitions that produce headaches when you think too much about it.
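For what it's worth, one common way to break the recursion is to wrap the function pointer in a struct, since a struct can refer to itself by name where a function type cannot. The following is only a sketch of that idea, not code from the project: `struct phone`, its `msgs_left` field, and `run()` are invented for illustration, and the state functions take a single argument to keep it short.

```c
#include <assert.h>
#include <stddef.h>

struct phone { int msgs_left; };	/* hypothetical per-phone data */

/* A function type can't name itself, but a struct can hold a pointer
   to a function that returns that same struct. */
struct state;
typedef struct state state_fn(struct phone *);
struct state { state_fn *next; };

struct state state_receive(struct phone *);
struct state state_msg1   (struct phone *);

struct state state_receive(struct phone *phone)
{
  if (phone->msgs_left == 0)		/* pretend queue is empty: stop */
    return (struct state){ NULL };
  return (struct state){ state_msg1 };
}

struct state state_msg1(struct phone *phone)
{
  phone->msgs_left--;			/* "handle" one message */
  return (struct state){ state_receive };
}

/* Drive the machine; returns how many state functions were called. */
int run(struct phone *phone)
{
  struct state s     = { state_receive };
  int          steps = 0;

  assert(phone != NULL);
  while(s.next != NULL)
  {
    s = s.next(phone);
    steps++;
  }
  return steps;
}
```

The cost is a little extra syntax at each return, but every call in the loop is fully type checked and no casts are needed.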

Okay, so instead, let's just have a function that returns a simple integer value that represents the next state. That's easier to define and the main driving loop isn't that bad:

while(state != DONE)
{
  switch(state)
  {
    case RECEIVE: state = state_receive(phone,socket,&remote,&msg); break;
    case MSG1:    state = state_msg1(phone,socket,&remote,&msg); break;
    case MSG2:    state = state_msg2(phone,socket,&remote,&msg); break;
    default:      assert(0); break;
  }
}
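A toy version of that loop that can actually be compiled might look like the sketch below. Everything beyond the state names and the shape of the loop is assumed for the sake of the example: the "socket" is just a string where '1' stands for a MSG1 packet and '2' for a MSG2 packet, and `struct phone` merely counts what it has seen.

```c
#include <assert.h>

enum state { RECEIVE, MSG1, MSG2, DONE };

struct phone { int msg1_seen; int msg2_seen; };	/* hypothetical */

/* Read the next "packet" and pick the next state; the end of the
   string means there is no more traffic. */
enum state state_receive(struct phone *phone,const char **input)
{
  char type = **input;

  (void)phone;
  if (type == '\0') return DONE;
  (*input)++;
  if (type == '1')  return MSG1;
  if (type == '2')  return MSG2;
  return RECEIVE;	/* unknown message: ignore it and keep reading */
}

enum state state_msg1(struct phone *phone,const char **input)
{
  (void)input;
  phone->msg1_seen++;
  return RECEIVE;
}

enum state state_msg2(struct phone *phone,const char **input)
{
  (void)input;
  phone->msg2_seen++;
  return RECEIVE;
}

void run(struct phone *phone,const char *input)
{
  enum state state = RECEIVE;

  while(state != DONE)
  {
    switch(state)
    {
      case RECEIVE: state = state_receive(phone,&input); break;
      case MSG1:    state = state_msg1(phone,&input);    break;
      case MSG2:    state = state_msg2(phone,&input);    break;
      default:      assert(0); break;
    }
  }
}
```

Compared with the Lua version, the transitions are no longer visible as calls; each function has to hand a state number back to the loop, which then dispatches on it.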

Okay, with that out of the way, we can start writing the C code.

Clackity-clackity-clack clackity-clack clack clack clackity-clackity-clackity-clack clack clack clack clack …

Man, that's boring drudgework. Okay, let's just use the Lua code and maybe throw some additional threads at this. I don't think that's a bad approach. Now, Lua, out of the box, isn't exactly thread-safe. Sure, you can provide an implementation of lua_lock() and lua_unlock() but that might slow Lua down quite a bit (there are 62 locations where the lock could be taken in the Lua engine). We could give each thread its own Lua state. How bad could that be?

How big is a Lua state? Let's find out, shall we?

#include <stdio.h>
#include <stdlib.h>
#include <lua.h>
#include <lauxlib.h>

int main(void)
{
  lua_State *L;

  L = luaL_newstate();
  if (L == NULL)
    return EXIT_FAILURE;
  printf("%d\n",lua_gc(L,LUA_GCCOUNT,0) * 1024);
  lua_close(L);
  return EXIT_SUCCESS;
}

When compiled and run, this returns 2048, the amount of memory used in an empty Lua state. That's not bad at all, but that's an empty state. What about a more useful state, like the one you get when you run the stock Lua interpreter?

-- ensure any accumulated garbage is reclaimed
collectgarbage('collect')
print(collectgarbage('count') * 1024)

Okay, when I run this, I get 17608. Eh … it's not that bad per thread (and I do have to remind myself—this is not running on my Color Computer with 16,384 bytes of memory). But I'm not running the stock Lua interpreter, I'm running the Kitchen Sink Lua with all the trimmings—how big is that state?

I run the above Lua code and I get 4683963.

Four and a half megs!
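To put the three measurements side by side, here is a small back-of-the-envelope helper. The byte counts are the ones reported above; the function name and the thread count used below are made up for illustration, since no actual thread count has been chosen.

```c
#include <assert.h>

#define STATE_EMPTY	2048UL		/* luaL_newstate() only     */
#define STATE_STOCK	17608UL		/* stock Lua interpreter    */
#define STATE_KITCHEN	4683963UL	/* Kitchen Sink Lua         */

/* Total bytes if each of `threads` threads gets its own Lua state. */
unsigned long states_total(unsigned long per_state,unsigned long threads)
{
  /* refuse to silently overflow */
  assert(threads == 0 || per_state <= (unsigned long)-1 / threads);
  return per_state * threads;
}
```

With, say, eight threads (an arbitrary number), states_total(STATE_KITCHEN,8) comes to 37,471,704 bytes, a bit under 36 MiB, and a single Kitchen Sink state is about 266 times the size of a stock one.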


I suppose if it becomes an issue, I could always go back to writing C …


Copyright © 1999-2024 by Sean Conner. All Rights Reserved.