Tuesday, August 24, 2021
The case of the regression test regression, part II
When you have eliminated the impossible, whatever remains, however improbable, must be the truth.
Sherlock Holmes
When last I left off I identified the slow code in the regression test and it left me puzzled—it was a single function call that did not change between versions.
Now,
a bit of background: eight years ago
[Eight years⁈ Where did the time go? —Sean]
[A world wide pandemic. —Editor]
[Gee, thanks. —Sean]
I wrote a custom Lua interpreter that contains all possible Lua modules we could possibly use at work in order to avoid having a bunch of code to install,
which I call kslua
(which stands for “Kitchen Sink Lua”).
And so far,
that's what I've been using to run the regression test.
Faced with the fact that the sipsock:recv()
call was taking upwards of a second,
I decided to update just that module to the latest in the fast version of the regression test as a sanity check.
Well,
it failed as a sanity check,
because the latest version of that module that contains that function ran fast,
so my sanity wasn't saved one bit.
The only conclusion I can come to is that something else has changed!
Fortunately,
something else has changed.
A bit more background: the regression test is used to test “Project: Sippy-Cup,”
“Project: Lumbergh” and “Project: Cleese.”
And to run those programs,
I need a few more programs that those programs communicate with,
and oh hey!
There's a program that the regression program runs that also runs via kslua!
And through a tedious process of elimination,
I finally found a module that causes the slowdown—the network event driver module I wrote.
I then went through a tedious process of elimination to find the exact change that causes the slowdown.
The “fast” version of the function in question,
which is written in C,
is:
    static int polllua_insert(lua_State *L)
    {
      pollset__t *set = luaL_checkudata(L,1,TYPE_POLL);
      int fh = luaL_checkinteger(L,2);

      lua_settop(L,4);

      if (set->idx == set->max)
      /* ... */
and the slow version, which is the next literal version of the code:
    static int polllua_insert(lua_State *L)
    {
      pollset__t *set = luaL_checkudata(L,1,TYPE_POLL);
      int fh;

      lua_settop(L,4);

      if (!luaL_callmeta(L,2,"_tofd"))
      {
        lua_pushinteger(L,EINVAL);
        return 1;
      }

      fh = luaL_checkinteger(L,-1);

      if (set->idx == set->max)
      /* ... */
I got tired of having to write (in Lua):
SOCKETS:insert(sock:_tofd(),'r',handler)
so I changed the code to call _tofd()
directly:
SOCKETS:insert(sock,'r',handler) -- the system will know to call _tofd()
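For readers who don't speak the Lua C API: luaL_callmeta(L,2,"_tofd") looks for a field named _tofd in the metatable of the second argument and, if it finds one, calls it with that argument. Here is a rough Lua sketch of that lookup. It is an illustration, not the module's code: the names sock_mt, sock and tofd, and the descriptor value 7, are made up, and a plain table stands in for the real socket userdata.

```lua
-- Hypothetical stand-in for a socket object: a table whose
-- metatable supplies a _tofd method returning the descriptor.
local sock_mt = {}
sock_mt.__index = sock_mt
sock_mt._tofd = function(self) return self.fd end

local sock = setmetatable({ fd = 7 }, sock_mt)

-- Approximates what luaL_callmeta(L, 2, "_tofd") does: fetch the
-- metatable, do a raw lookup of "_tofd", and call it with the object.
-- Returns nil when there is no metatable or no such field.
local function tofd(obj)
  local mt = getmetatable(obj)
  if type(mt) ~= "table" then return nil end
  local f = rawget(mt, "_tofd")
  if f == nil then return nil end
  return f(obj)
end

local fd_from_sock = tofd(sock) -- descriptor obtained via the metamethod
local fd_from_int  = tofd(42)   -- numbers have no metatable by default
```

Note the second call: a bare integer has no metatable unless one has been set, so the lookup fails. In the C code above, that is the branch that pushes EINVAL and returns.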
The only thing is, the program that calls this only calls it once.
At startup.
Desk, meet head.
So I'm again failing to see how this causes the slowdown. I use the “fast” version and the regression runs fast. I click the version of that module one step forward and it's slow.
It's maddening!