Tuesday, March 26, 2013
I wonder what IPO I'll be invited to this time?
I needed to check what I used as a certain field in my Lua unix module, so I thought I would do this through the Lua interpreter:
    [spc]saltmine:~>lua
    Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
    > unix = require "org.conman.unix"
    lua: src/env.c:54: luaopen_org_conman_env: Assertion `value != ((void *)0)' failed.
    Aborted (core dumped)
    [spc]saltmine:~>
What the … um … what's going on with that code?
    int luaopen_org_conman_env(lua_State *L)
    {
      luaL_register(L,"org.conman.env",env);

      for (int i = 0 ; environ[i] != NULL ; i++)
      {
        char *value;
        char *eos;

        value = memchr(environ[i],'=',(size_t)-1);
        assert(value != NULL);
        eos = memchr(value + 1,'\0',(size_t)-1);
        assert(eos != NULL);

        lua_pushlstring(L,environ[i],(size_t)(value - environ[i]));
        lua_pushlstring(L,value + 1,(size_t)(eos - (value + 1)));
        lua_settable(L,-3);
      }

      return 1;
    }
No! It can't be! Really?
    value = memchr(environ[i],'=',10000);
    [spc]saltmine:~>lua
    Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
    > unix = require "org.conman.unix"
    >
Yup. It can be! Really!
XXXX! I encountered this very same bug fifteen years ago! The GNU C library, 64-bit version.
Back then, the maintainers of the GNU C library were making an assumption that any value above some already ridiculously large value was obviously bad and returning NULL, not even bothering to run memchr(). But I was using a valid value.
You see, I have a block of data I know an equal sign exists in. If it doesn't exist, I have bigger things to worry about (like I'm not in Kansas, er, a POSIX environment anymore). But I don't know how much data to look through. And instead of just assuming a “large enough value” (which may be good enough for today, but then again, 640K was enough back in the day) I decided to use a value that, converted to a size_t type, basically translates to “all of memory”.
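To make “all of memory” concrete: converting -1 to size_t wraps around to the largest value the type can hold, which is the value <stdint.h> calls SIZE_MAX. A minimal sketch of that, where the helper name whole_address_space is made up for illustration and is not part of the module:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in the original module): the "search everything"
   length.  Converting -1 to an unsigned type wraps around to that type's
   maximum value, so (size_t)-1 is the same value as SIZE_MAX. */
static size_t whole_address_space(void)
{
  size_t limit = (size_t)-1;

  assert(limit == SIZE_MAX);
  return limit;
}
```

Passing this as the length to memchr() says, in effect, “keep going until you find it.”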
And on a 32-bit system, it worked fine. But on the GNU C library, 64-bit version, it failed, probably because the maintainers felt that 18,446,744,073,709,551,615 bytes is just a tad silly to search through.
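For comparison, here is one way to sidestep the length question entirely. This is only a sketch and not what the module does: it assumes each entry is a NUL-terminated string (which the entries of environ are) and uses strchr(), which stops at the terminator on its own. The helper name env_name_length is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): given a
   "NAME=value" string, return the length of NAME.  strchr() stops at the
   terminating NUL by itself, so no search length has to be supplied.
   Returns (size_t)-1 if the string contains no '='. */
static size_t env_name_length(const char *entry)
{
  const char *value;

  assert(entry != NULL);
  value = strchr(entry,'=');
  if (value == NULL)
    return (size_t)-1;
  return (size_t)(value - entry);
}
```

With the name length in hand, the value starts at entry + length + 1, and strlen() on that pointer gives the length of the value, again without guessing at a “large enough” number.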
And the only reason I remember this particular bug is because it apparently was enough to get me invited to the Red Hat IPO (it was either that, or my work on porting pfe to IRIX back in the mid-90s).
I did a bit more research (basically, I tried two 64-bit Linux distributions) and I found a really odd thing: glibc version 2.3 does not exhibit the behavior (meaning my code works on a version released in 2007), but my code crashes under 2.12 (so the code changed sometime between 2007 and 2010).
Sigh. Time to investigate if this is still a problem in 2.17 and if so, report it as a bug …