Monday, November 28, 2011
I haven't done a metablog post in a while …
In addition to updating the greylist daemon, I've also updated the software that runs this blog.
The biggest change this time is to the configuration file. The first stab at changing how it works was made back in (let me check … oh wow! was it that long ago?) September of 2010. Prior to that, I had code that checked the SCRIPT_FILENAME environment variable (passed in by Apache) and changed the extension from .cgi to .cnf to locate the file. That meant the configuration file had to live in the main web directory and frankly, I always felt that bit of code was a bit of a hack.
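For illustration only, that old lookup could be sketched roughly as follows. This is not the blog's actual code; the function name `config_from_script` and its return convention are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Derive "<script>.cnf" from a path ending in ".cgi".
 * Returns 0 on success, -1 if the path does not end in ".cgi"
 * or the result does not fit in dest.
 * (Hypothetical helper, not taken from the blog's source.) */
int config_from_script(char *dest, size_t size, const char *script)
{
  size_t len;

  assert(dest   != NULL);
  assert(script != NULL);

  len = strlen(script);
  if (len < 4 || strcmp(script + len - 4, ".cgi") != 0)
    return -1;
  if (len + 1 > size)
    return -1;

  memcpy(dest, script, len - 4);
  memcpy(dest + len - 4, ".cnf", 5);   /* copies the terminating NUL too */
  return 0;
}
```

The caller would pass in the value of SCRIPT_FILENAME, which is exactly why the configuration file had to sit next to the script.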
I changed that, however, by configuring Apache to pass the configuration filename explicitly to the script:
<VirtualHost 66.252.224.242:80>
  # ...
  <Files boston.cgi>
    SetEnv BLOG_CONFIG /home/spc/web/sites/boston.conman.org/journal/boston.cnf
  </Files>
  # ...
</VirtualHost>
Now, no more hacking around with filenames, and the configuration file no longer needs to be stored in a web-facing location. If you do use this method, you'll need to check REDIRECT_BLOG_CONFIG as well (which Apache sets when it does a redirect, and only a redirect).
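A minimal sketch of that check, assuming the script simply prefers BLOG_CONFIG and falls back to REDIRECT_BLOG_CONFIG. The function name `blog_config_path` and the lookup order are assumptions for the example, not necessarily what the blog software does.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the configuration path handed over by Apache, or NULL if
 * neither variable is set (or both are empty).
 * (Hypothetical helper; the order of the two lookups is an assumption.) */
const char *blog_config_path(void)
{
  const char *path = getenv("BLOG_CONFIG");

  /* after an internal redirect only the REDIRECT_ name is present */
  if (path == NULL || *path == '\0')
    path = getenv("REDIRECT_BLOG_CONFIG");

  if (path != NULL && *path == '\0')
    path = NULL;

  return path;
}
```

Treating an empty value as "not set" is a design choice of the sketch; it avoids trying to open a file with an empty name.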
And that was it for the configuration file until earlier this month. The next big change is how it looks. Prior to the changes this month, the configuration file looked like:
Comment:
Comment: *********************************************
Comment: *
Comment: * Configure File for the Boston Diaries
Comment: *
Comment: **********************************************
Comment:
Name: The Boston Diaries
Backend: /home/spc/source/boston.old.1.9/sbg/bp
BaseDir: /home/spc/web/sites/boston.conman.org/journal
WebDir: /home/spc/web/sites/boston.conman.org/htdocs/
BaseUrl: /
FullBaseUrl: http://boston.conman.org
Templates: html/regular
DayPage: /home/spc/web/sites/boston.conman.org/htdocs/index.html
Days: 7
RssFile: /home/spc/web/sites/boston.conman.org/htdocs/bostondiaries.rss
RssTemplates: rss
RssFirst: latest
AtomFile: /home/spc/web/sites/boston.conman.org/htdocs/index.atom
AtomTemplates: atom
Comment: TabTemplates: html/sidebar
Comment: TabFile: /home/spc/web/sites/boston.conman.org/htdocs/boston.tab.html
Comment: TabFirst: latest
StartDate: 1999/12/4
Author: Sean Conner
Comment: Authors: /home/spc/web/sites/boston.conman.org/users
Email: sean@conman.org
Email-List: /home/spc/web/sites/boston.conman.org/notify/db/email
Email-Message: /home/spc/web/sites/boston.conman.org/notify/mail/notify
Email-Subject: The Boston Diaries Update Notification
Facebook-AP-ID: XXXXXXXXXXXXXXX
Facebook-AP-Secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Facebook-User: XXXXXX
_System-CPU: 600
_System-Mem: 20971520
_System-Core: 0
_System-Locale: en_SPC
Let me explain how this came about. I had code to parse RFC-822 style headers (because at the time I had code to fetch pages via HTTP and it's needed there; also, I can accept entries via email and I need it there too) and instead of writing even more code to parse a configuration file, I decided to shoehorn the configuration file into an RFC-822 format.
And thus, the odd format for the configuration file. It's also never been fully cleansed of old features (I no longer have a backend, so the Backend: header could go; I removed support for the tab template, so TabTemplates:, TabFile: and TabFirst: could go as well. Don't bother asking what the tab file was for; it'll take too long to explain and as far as I know, nobody, including myself, ever bothered using it).
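To give a feel for why reusing a header parser was tempting, here is a minimal sketch of splitting one "Name: value" line. It is not the blog's parser: the function name `split_header` is made up for the example, and it ignores the continuation (folded) lines that real RFC-822 headers allow.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Split one "Name: value" line in place.
 * On success returns 0 and points *name and *value into line;
 * returns -1 if there is no colon or the name is empty.
 * (Hypothetical helper; header folding is deliberately not handled.) */
int split_header(char *line, char **name, char **value)
{
  char *colon;
  char *end;

  assert(line  != NULL);
  assert(name  != NULL);
  assert(value != NULL);

  colon = strchr(line, ':');
  if (colon == NULL || colon == line)
    return -1;

  *colon = '\0';
  *name  = line;

  /* skip whitespace after the colon */
  for (colon++; isspace((unsigned char)*colon); colon++)
    ;
  *value = colon;

  /* trim trailing whitespace, including any newline */
  end = colon + strlen(colon);
  while (end > colon && isspace((unsigned char)end[-1]))
    end--;
  *end = '\0';

  return 0;
}
```

With a routine like this, a line such as "Comment: TabFirst: latest" splits at the first colon, so everything after "Comment:" becomes the value and can be ignored, which is how the format gets its "comments".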
Ever since I started playing around with Lua, I've been toying with the idea of using it for the configuration file, and I finally got around to doing it.
process = require("org.conman.process")
os      = require("os")

-- ---------------------------------------------------------------------
-- Custom locale to get "Debtember" without special code in the program
-- ---------------------------------------------------------------------

os.setlocale("en_SPC")

-- --------------------------------------------------------------------
-- process limits added because an earlier version of the code actually
-- crashed the server it was running on, due to resource exhaustion.
-- --------------------------------------------------------------------

process.limits.hard.cpu  = "10m" -- 10 minutes
process.limits.hard.core = 0     -- no core file
process.limits.hard.data = "20m" -- 20 MB

-- --------------------------------------------------------
-- We now resume our regularly scheduled config file
-- --------------------------------------------------------

name      = "The Boston Diaries"
basedir   = "/home/spc/web/sites/boston.conman.org/journal"
webdir    = "/home/spc/web/sites/boston.conman.org/htdocs"
url       = "http://boston.conman.org/"
author    = { name = "Sean Conner" , email = "sean@conman.org" }
startdate = "1999/12/4"

templates =
{
  {
    template = "html/regular",
    output   = webdir .. "/index.html",
    items    = "7days",
    reverse  = true
  },

  {
    template = "rss",
    output   = webdir .. "/bostondiaries.rss",
    items    = 15,
    reverse  = true
  },

  {
    template = "atom",
    output   = webdir .. "/index.atom",
    items    = 15,
    reverse  = true
  }
}

email =
{
  list    = "/home/spc/web/sites/boston.conman.org/notify/db/email",
  message = "/home/spc/web/sites/boston.conman.org/notify/mail/notify",
  subject = name .. " Update Notification",
}

facebook =
{
  ap_id     = "XXXXXXXXXXXXXXX",
  ap_secret = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  user      = "XXXXXX"
}

affiliate =
{
  {
    proto = "asin",
    link  = "http://www.amazon.com/exec/obidos/ASIN/%s/conmanlaborat-20"
  }
}
Not only does it look much nicer (Whitespace! Real comments!) but I was able to remove code to handle the resource limits (it's now handled in Lua—and I'll talk about that in another entry) and locales (which supports a feature I added back in October 2003).
Also, by doing this, I partially cleaned up the template mess. Before, I had to explicitly add code to support specialized templates (the HTML output, the RSS and Atom feed files and the long-since-removed tab file); now, I can specify new templates by just adding them to the configuration file. The only limitation is that the HTML template has to be specified first (it's easier to code that way).
You'll also notice a section labeled affiliate. That I threw in at the last moment. I'm an Amazon affiliate and if I wanted to link to, say, a book from my friend Hoade, I would have to manually generate the link, but now, I can just do:
<a class="book" href="asin:0595095291">Ain't that America</a>
and it'll be converted automatically to the correct link:
<a class="book" href="http://www.amazon.com/exec/obidos/ASIN/0595095291/conmanlaborat-20">Ain't that America</a>
Or rather, Hoade's book Ain't that America.
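The conversion itself is just a prefix check plus a substitution into the link template from the affiliate section. A minimal sketch, assuming the template contains exactly one "%s"; the function name `expand_affiliate` is made up for the example and this is not the blog's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If href starts with "<proto>:", substitute the rest of it for the
 * "%s" in link and store the result in dest.
 * Returns 0 on success, -1 if href uses another scheme, the template
 * has no "%s", or the result does not fit.
 * (Hypothetical helper, not taken from the blog's source.) */
int expand_affiliate(char *dest, size_t size,
                     const char *proto, const char *link, const char *href)
{
  size_t      plen;
  const char *id;
  const char *mark;
  int         n;

  assert(dest  != NULL);
  assert(proto != NULL);
  assert(link  != NULL);
  assert(href  != NULL);

  plen = strlen(proto);
  if (strncmp(href, proto, plen) != 0 || href[plen] != ':')
    return -1;                  /* not our scheme; leave the link alone */
  id = href + plen + 1;

  mark = strstr(link, "%s");
  if (mark == NULL)
    return -1;

  /* template prefix, then the identifier, then the rest of the template */
  n = snprintf(dest, size, "%.*s%s%s", (int)(mark - link), link, id, mark + 2);
  return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The template is never passed to snprintf() as the format string; the sketch splits it around the "%s" by hand, so a stray "%" in a configured link cannot cause trouble.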
On the down side, in trying to release this (the last release was in September of 2009, and before that, July of 2004) I found a rather curious bug: below a certain threshold of entries (and there are currently over 3,700 here in this blog), the program crashes. There's probably an assumption built into the code about there always being a previous entry, but for a new blog, that's not necessarily the case. In tracking down the issue, I found that it appears to have something to do with the internal caching I do of entries. Like the old joke goes:
There are only two hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors.
And I think I'm being hit by one of—
Core error - bus dumped
Some Lua trickery
In my previous post, I presented this bit of Lua code:
process = require("org.conman.process")

-- --------------------------------------------------------------------
-- process limits added because an earlier version of the code actually
-- crashed the server it was running on, due to resource exhaustion.
-- --------------------------------------------------------------------

process.limits.hard.cpu  = "10m" -- 10 minutes
process.limits.hard.core = 0     -- no core file
process.limits.hard.data = "20m" -- 20 MB
It looks like a simple assignment to set process limits, yet under Unix, you need to call setrlimit(). What's happening under the hood (so to speak) is that it's easy to intercept assignments to tables (Lua's “go-to” data structure) and that's exactly what's going on here. During the process of registering the module org.conman.process (more on the name later) we create some fake structures for the hard limits (and soft limits, but since it's similar, I'll skip that part) and attach a metatable, which contains code to intercept both reads and writes so we can do a bit of magic:
#define SYS_LIMIT_HARD "rlimit_hard"
#define SYS_LIMIT_SOFT "rlimit_soft"

static const struct luaL_reg mhlimit_reg[] =
{
  { "__index"    , mhlimitlua___index    } ,
  { "__newindex" , mhlimitlua___newindex } ,
  { NULL         , NULL                  }
};

static const struct luaL_reg mslimit_reg[] =
{
  { "__index"    , mslimitlua___index    } ,
  { "__newindex" , mslimitlua___newindex } ,
  { NULL         , NULL                  }
};

int luaopen_org_conman_process(lua_State *const L)
{
  void *udata;

  assert(L != NULL);

  luaL_newmetatable(L,SYS_LIMIT_HARD);
  luaL_register(L,NULL,mhlimit_reg);

  luaL_newmetatable(L,SYS_LIMIT_SOFT);
  luaL_register(L,NULL,mslimit_reg);

  luaL_register(L,"org.conman.process",mprocess_reg);

  lua_createtable(L,0,2);

  udata = lua_newuserdata(L,sizeof(int));
  luaL_getmetatable(L,SYS_LIMIT_HARD);
  lua_setmetatable(L,-2);
  lua_setfield(L,-2,"hard");

  udata = lua_newuserdata(L,sizeof(int));
  luaL_getmetatable(L,SYS_LIMIT_SOFT);
  lua_setmetatable(L,-2);
  lua_setfield(L,-2,"soft");

  lua_setfield(L,-2,"limits");

  return 1;
}
When Lua sees an assignment to the process.limits.hard table, it calls mhlimitlua___newindex(), where the magic happens:
static int mhlimitlua___newindex(lua_State *const L)
{
  struct rlimit  limit;
  void          *ud;
  const char    *tkey;
  int            key;
  lua_Integer    ival;

  assert(L != NULL);

  ud   = luaL_checkudata(L,1,SYS_LIMIT_HARD);
  tkey = luaL_checkstring(L,2);

  if (!mlimit_trans(&key,tkey))
    return luaL_error(L,"Illegal limit resource: %s",tkey);

  if (lua_isnumber(L,3))
    ival = lua_tointeger(L,3);
  else if (lua_isstring(L,3))
  {
    const char *tval;
    const char *unit;

    tval = lua_tostring(L,3);
    ival = strtoul(tval,(char **)&unit,10);

    if (!mlimit_valid_suffix(&ival,key,unit))
      return luaL_error(L,"Illegal suffix: %c",*unit);
  }
  else
    return luaL_error(L,"Non-supported type");

  limit.rlim_cur = ival;
  limit.rlim_max = ival;

  setrlimit(key,&limit);
  return 0;
}
We basically take the key we're given, say, “cpu”, and translate it to the appropriate value (which happens in mlimit_trans(); nothing terribly interesting, it just maps the string to the appropriate constant value, in this example, RLIMIT_CPU) and the same for the value; if it's a number, we'll use that and if it's a string, we'll convert it to a number and use any suffix to modify it. For our example, “cpu”, it's a measure of time, so the suffix “m” means “minutes.” mlimit_valid_suffix() handles this and again, it's pretty straightforward code.
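Since neither helper is shown, here is a rough guess at what such a pair could look like. These are not the real mlimit_trans() and mlimit_valid_suffix(): the names `example_limit_trans` and `example_limit_suffix`, the set of resources, and the accepted suffix letters are all assumptions. The only facts taken from the post are that “10m” of CPU means 10 minutes and “20m” of data means 20 MB.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/resource.h>

/* Map a limit name to its RLIMIT_* constant (only three shown).
 * (Hypothetical stand-in for mlimit_trans().) */
bool example_limit_trans(int *key, const char *name)
{
  assert(key  != NULL);
  assert(name != NULL);

  if      (strcmp(name, "cpu")  == 0) *key = RLIMIT_CPU;
  else if (strcmp(name, "core") == 0) *key = RLIMIT_CORE;
  else if (strcmp(name, "data") == 0) *key = RLIMIT_DATA;
  else return false;

  return true;
}

/* Scale *val by a one-letter suffix; what the letter means depends on
 * the resource: time for RLIMIT_CPU, bytes for everything else.
 * (Hypothetical stand-in for mlimit_valid_suffix().) */
bool example_limit_suffix(unsigned long *val, int key, const char *unit)
{
  assert(val  != NULL);
  assert(unit != NULL);

  if (*unit == '\0')
    return true;                /* bare number: use as-is */
  if (unit[1] != '\0')
    return false;               /* only single-letter suffixes */

  if (key == RLIMIT_CPU)
  {
    switch (*unit)
    {
      case 's': return true;                    /* seconds */
      case 'm': *val *= 60UL;   return true;    /* minutes */
      case 'h': *val *= 3600UL; return true;    /* hours   */
      default:  return false;
    }
  }

  switch (*unit)
  {
    case 'b': return true;                              /* bytes     */
    case 'k': *val *= 1024UL;          return true;     /* kilobytes */
    case 'm': *val *= 1024UL * 1024UL; return true;     /* megabytes */
    default:  return false;
  }
}
```

Under these assumptions, “10m” for cpu becomes 600 and “20m” for data becomes 20971520, which match the _System-CPU and _System-Mem values in the old configuration file.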
I think it's a pretty cool trick, but I can see why some might not like the idea of masking what amounts to a system call with what looks like a simple assignment, since it does have side effects outside of the simple assignment, but I like the way it looks, and it's a more “natural” or even “Luaish” way of specifying the intent of the code.
Now, on to the name of the module, org.conman.process. When I first started playing around with Lua I wrote a few modules that performed operations similar to those of existing modules, with the same names. One example is syslog. There's an existing Lua syslog module, but I don't like how it works, so I wrote my own.
The problem now becomes, what if I want to use a module that uses the existing Lua syslog module, but the rest of my code uses mine? If they both have the same name, some code is going to get a nasty surprise. To work around that, I decided to put all my modules under a “namespace” that I control and that is not likely to conflict with any existing (or even future) modules. Thus, the org.conman namespace.