The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, November 28, 2011

I haven't done a metablog post in a while …

In addition to updating the greylist daemon, I've also updated the software that runs this blog.

The biggest change this time is to the configuration file. The first stab at changing how it works was made back in (let me check … oh wow! was it that long ago?) September of 2010. Prior to that, I had code that checked the SCRIPT_FILENAME environment variable (passed in by Apache) and changed the extension from .cgi to .cnf to locate the file. That meant the configuration file had to live in the main web directory and frankly, I felt that bit of code was always a bit of a hack.

I changed that, however, by configuring Apache to pass the configuration filename explicitly to the script:

<VirtualHost 66.252.224.242:80>
  # ... 

  <Files boston.cgi>
        SetEnv  BLOG_CONFIG /home/spc/web/sites/boston.conman.org/journal/boston.cnf
  </Files>

  # ...
</VirtualHost>

Now, no more hacking around with filenames, and the configuration file no longer needs to be stored in a web-facing location. If you do use this method, you'll need to check REDIRECT_BLOG_CONFIG as well (which Apache sets when it does a redirect, and only a redirect).
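
On the program side, the lookup might be sketched like this. It's a minimal sketch, not the blog's actual code; the function name blog_config_path() is made up for illustration, and only the two environment variable names come from the setup above.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the blog's source): return the path of
 * the configuration file as passed in by Apache, or NULL if neither
 * variable is set. */
const char *blog_config_path(void)
{
  const char *path = getenv("BLOG_CONFIG");

  /* After a redirect, Apache passes the variable with a REDIRECT_ prefix
   * instead, so fall back to that name. */
  if (path == NULL)
    path = getenv("REDIRECT_BLOG_CONFIG");

  return path;
}
```

A caller would treat a NULL return as "no configuration given" and bail out with an error rather than guess at a location.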

And that was it for the configuration file until earlier this month. The next big change is how it looks. Prior to the changes this month, the configuration file looked like:

Comment:
Comment:        *********************************************
Comment:        *
Comment:        *       Configure File for the Boston Diaries
Comment:        *
Comment:        **********************************************
Comment:
Name:			The Boston Diaries
Backend:		/home/spc/source/boston.old.1.9/sbg/bp
BaseDir:		/home/spc/web/sites/boston.conman.org/journal
WebDir:         	/home/spc/web/sites/boston.conman.org/htdocs/
BaseUrl:        	/
FullBaseUrl:    	http://boston.conman.org
Templates:      	html/regular
DayPage:		/home/spc/web/sites/boston.conman.org/htdocs/index.html
Days:           	7
RssFile:        	/home/spc/web/sites/boston.conman.org/htdocs/bostondiaries.rss
RssTemplates:   	rss
RssFirst:		latest
AtomFile:		/home/spc/web/sites/boston.conman.org/htdocs/index.atom
AtomTemplates:		atom
Comment: TabTemplates:	html/sidebar
Comment: TabFile:	/home/spc/web/sites/boston.conman.org/htdocs/boston.tab.html
Comment: TabFirst:	latest
StartDate:      	1999/12/4
Author:         	Sean Conner
Comment: Authors:	/home/spc/web/sites/boston.conman.org/users
Email:			sean@conman.org
Email-List:		/home/spc/web/sites/boston.conman.org/notify/db/email
Email-Message:		/home/spc/web/sites/boston.conman.org/notify/mail/notify
Email-Subject:		The Boston Diaries Update Notification
Facebook-AP-ID:		XXXXXXXXXXXXXXX
Facebook-AP-Secret:	XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Facebook-User:		XXXXXX
_System-CPU:		600
_System-Mem:		20971520
_System-Core:		0
_System-Locale:		en_SPC

Let me explain how this came about. I had code to parse RFC-822 style headers (because at the time I had code to fetch pages via HTTP and it's needed there; also, I can accept entries via email and I need it there too) and instead of writing even more code to parse a configuration file, I decided to shoehorn the configuration file into an RFC-822 format.

And thus, the odd format for the configuration file. It's also never been fully cleansed of old features (I no longer have a backend, so the Backend: header could go; I removed support for the tab template, so TabTemplates:, TabFile: and TabFirst: could go as well. Don't bother asking what the tab file was for; it'll take too long to explain and as far as I know, nobody, including myself, ever bothered using it).
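
To give a feel for how little it takes to read a “Name: value” line like the ones above, here is a rough sketch. It is not the parser the blog actually uses; header_split() is a hypothetical name, and it ignores folded (continuation) lines, which a real RFC-822 parser has to handle.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the blog's actual parser.  Splits one line of
 * the form "Name: value" in place.  On success *name and *value point
 * into line and 1 is returned; 0 is returned if the line has no colon.
 * Folded (continuation) lines are not handled. */
int header_split(char *line, char **name, char **value)
{
  char *colon = strchr(line, ':');
  char *end;

  if (colon == NULL)
    return 0;

  /* terminate the name at the first colon */
  *colon = '\0';
  *name  = line;

  /* skip the whitespace between the colon and the value */
  for (*value = colon + 1; isspace((unsigned char)**value); (*value)++)
    ;

  /* trim trailing whitespace, including any newline */
  end = *value + strlen(*value);
  while (end > *value && isspace((unsigned char)end[-1]))
    end--;
  *end = '\0';

  return 1;
}
```

Because only the first colon is treated as the separator, a value such as http://boston.conman.org survives intact.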

Ever since I started playing around with Lua, I've been toying with the idea of using it for the configuration file, and I finally got around to doing it.

process = require("org.conman.process")
os      = require("os")

-- ---------------------------------------------------------------------
-- Custom locale to get "Debtember" without special code in the program
-- ---------------------------------------------------------------------

os.setlocale("en_SPC")

-- --------------------------------------------------------------------
-- process limits added because an earlier version of the code actually
-- crashed the server it was running on, due to resource exhaustion.
-- --------------------------------------------------------------------

process.limits.hard.cpu  = "10m"	-- 10 minutes
process.limits.hard.core =  0		-- no core file
process.limits.hard.data = "20m"	-- 20 MB

-- --------------------------------------------------------
-- We now resume our regularly scheduled config file
-- --------------------------------------------------------

name      = "The Boston Diaries"
basedir   = "/home/spc/web/sites/boston.conman.org/journal"
webdir    = "/home/spc/web/sites/boston.conman.org/htdocs"
url       = "http://boston.conman.org/"
author    = { name = "Sean Conner" , email = "sean@conman.org" }
startdate = "1999/12/4"

templates =
{
  {
    template = "html/regular",
    output   = webdir .. "/index.html",
    items    = "7days",
    reverse  = true
  },
  
  {
    template = "rss",
    output   = webdir .. "/bostondiaries.rss",
    items    = 15,
    reverse  = true
  },
  
  {
    template = "atom",
    output   = webdir .. "/index.atom",
    items    = 15,
    reverse  = true
  }
}

email =
{
  list    = "/home/spc/web/sites/boston.conman.org/notify/db/email",
  message = "/home/spc/web/sites/boston.conman.org/notify/mail/notify",
  subject = name .. " Update Notification",
}

facebook =
{
  ap_id     = "XXXXXXXXXXXXXXX",
  ap_secret = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  user      = "XXXXXX"
}

affiliate =
{
  {
    proto = "asin",
    link  = "http://www.amazon.com/exec/obidos/ASIN/%s/conmanlaborat-20"
  }
}

Not only does it look much nicer (Whitespace! Real comments!) but I was able to remove code to handle the resource limits (it's now handled in Lua—and I'll talk about that in another entry) and locales (which supports a feature I added back in October 2003).

Also, by doing this, I partially cleaned up the template mess. Before, I had to explicitly add code to support specialized templates (the HTML output, the RSS and ATOM feed files and the long-since-removed tab file); now, I can specify new templates by just adding them to the configuration file. The only limitation is that the HTML template has to be specified first (it's easier to code that way).

You'll also notice a section labeled affiliate. That I threw in at the last moment. I'm an Amazon affiliate and if I wanted to link to, say, a book from my friend Hoade, I would have to manually generate the link, but now, I can just do:

<a class="book" href="asin:0595095291">Ain't that America</a>

and it'll be converted automatically to the correct link:

<a class="book" href="http://www.amazon.com/exec/obidos/ASIN/0595095291/conmanlaborat-20">Ain't that America</a>

Or rather, Hoade's book Ain't that America.
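
The substitution itself could be sketched as follows. This is an illustration under assumptions, not the blog's code: affiliate_expand() is a hypothetical name, and it assumes the link template contains exactly one %s, as the one in the configuration file does.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper.  If href has the form "proto:id", write the link
 * template with its "%s" replaced by id into dest.  Returns 1 on success,
 * 0 if href does not start with "proto:", the template has no "%s", or
 * the result does not fit in dest. */
int affiliate_expand(
        char       *dest,
        size_t      destsize,
        const char *href,
        const char *proto,
        const char *link
)
{
  size_t      plen = strlen(proto);
  const char *slot = strstr(link, "%s");
  int         len;

  if (slot == NULL)
    return 0;
  if (strncmp(href, proto, plen) != 0 || href[plen] != ':')
    return 0;

  /* The template comes from a config file, so it is copied piecewise
   * rather than being handed to printf() as a format string. */
  len = snprintf(dest, destsize, "%.*s%s%s",
                 (int)(slot - link), link,  /* text before the "%s" */
                 href + plen + 1,           /* the id after "proto:" */
                 slot + 2);                 /* text after the "%s"   */

  return len >= 0 && (size_t)len < destsize;
}
```

Called with the href asin:0595095291, the proto asin and the link template from the configuration file, it fills dest with the Amazon URL shown above.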

On the down side, in trying to release this (the last release was in September of 2009, and before that, July of 2004) I found a rather curious bug: below a certain threshold of entries (and there're currently over 3,700 here in this blog), the program crashes. There's probably an assumption built into the code about there always being a previous entry, but for a new blog, that's not necessarily the case. In tracking down the issue, I found that it appears to have something to do with the internal caching I do of entries. Like the old joke goes:

There are only two hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors.

And I think I'm being hit by one of—
Core error - bus dumped


Some Lua trickery

In my previous post, I presented this bit of Lua code:

process = require("org.conman.process")

-- --------------------------------------------------------------------
-- process limits added because an earlier version of the code actually
-- crashed the server it was running on, due to resource exhaustion.
-- --------------------------------------------------------------------

process.limits.hard.cpu  = "10m"	-- 10 minutes
process.limits.hard.core =  0		-- no core file
process.limits.hard.data = "20m"	-- 20 MB

It looks like a simple assignment to set process limits, yet under Unix, you need to call setrlimit(). What's happening under the hood (so to speak) is that Lua makes it easy to intercept assignments to tables (Lua's “go-to” data structure), and that's exactly what's going on here. During the process of registering the module org.conman.process (more on the name later) we create some fake structures for the hard limits (and soft limits, but since it's similar, I'll skip that part) and attach a metatable, which contains code to intercept both reads and writes so we can do a bit of magic:

#define SYS_LIMIT_HARD	"rlimit_hard"
#define SYS_LIMIT_SOFT	"rlimit_soft"

static const struct luaL_reg mhlimit_reg[] =
{
  { "__index" 		, mhlimitlua___index	} ,
  { "__newindex"	, mhlimitlua___newindex	} ,
  { NULL		, NULL			}
};

static const struct luaL_reg mslimit_reg[] =
{
  { "__index"		, mslimitlua___index	} ,
  { "__newindex"	, mslimitlua___newindex	} ,
  { NULL		, NULL			}
};

int luaopen_org_conman_process(lua_State *const L)
{
  void *udata;
  
  assert(L != NULL);
  
  luaL_newmetatable(L,SYS_LIMIT_HARD);
  luaL_register(L,NULL,mhlimit_reg);
  
  luaL_newmetatable(L,SYS_LIMIT_SOFT);
  luaL_register(L,NULL,mslimit_reg);
  
  luaL_register(L,"org.conman.process",mprocess_reg);
  lua_createtable(L,0,2);
  
  udata = lua_newuserdata(L,sizeof(int));
  luaL_getmetatable(L,SYS_LIMIT_HARD);
  lua_setmetatable(L,-2);
  lua_setfield(L,-2,"hard");
  
  udata = lua_newuserdata(L,sizeof(int));
  luaL_getmetatable(L,SYS_LIMIT_SOFT);
  lua_setmetatable(L,-2);
  lua_setfield(L,-2,"soft");
  
  lua_setfield(L,-2,"limits");  
  return 1;
}

When Lua sees an assignment to the process.limits.hard table, it calls mhlimitlua___newindex(), where the magic happens:

static int mhlimitlua___newindex(lua_State *const L)
{
  struct rlimit  limit;
  void          *ud;
  const char    *tkey;
  int            key;
  lua_Integer    ival;

  assert(L != NULL);
  
  ud   = luaL_checkudata(L,1,SYS_LIMIT_HARD);
  tkey = luaL_checkstring(L,2);
  
  if (!mlimit_trans(&key,tkey))
    return luaL_error(L,"Illegal limit resource: %s",tkey);

  if (lua_isnumber(L,3))
    ival = lua_tointeger(L,3);
  else if (lua_isstring(L,3))
  {
    const char *tval;
    const char *unit;
    
    tval = lua_tostring(L,3);
    ival = strtoul(tval,(char **)&unit,10);

    if (!mlimit_valid_suffix(&ival,key,unit))
      return luaL_error(L,"Illegal suffix: %c",*unit);
  } 
  else
    return luaL_error(L,"Non-supported type");

  limit.rlim_cur = ival;
  limit.rlim_max = ival;
  
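  /* report a failed setrlimit() instead of silently ignoring it */
  if (setrlimit(key,&limit) < 0)
    return luaL_error(L,"setrlimit() failed for %s",tkey);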
  return 0;
}

We basically take the key we're given, say, “cpu”, and translate it to the appropriate value (this happens in mlimit_trans(), which is nothing terribly interesting; it just maps the string to the appropriate constant, in this example, RLIMIT_CPU). We do the same for the value: if it's a number, we use it as is, and if it's a string, we convert it to a number and use any suffix to scale it. For our example, “cpu”, it's a measure of time, so the suffix “m” means “minutes.” mlimit_valid_suffix() handles this and again, it's pretty straightforward code.
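
For the curious, the suffix handling might look something like the sketch below. It is a stand-in, not the real mlimit_valid_suffix(): limit_parse() is a hypothetical name, and the only suffixes confirmed above are “m” as minutes for cpu and “m” as megabytes for data. The “s”, “h”, “k” and “g” cases are assumptions added to round it out, and overflow is not checked.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical stand-in for the value parsing.  Converts text such as
 * "10m" into a number for the given resource: for RLIMIT_CPU the suffix
 * is a unit of time, for anything else a unit of size.  Returns 1 on
 * success, 0 on a malformed value.  Overflow is not checked. */
int limit_parse(unsigned long *out, int resource, const char *text)
{
  char          *unit;
  unsigned long  val = strtoul(text, &unit, 10);
  unsigned long  mult;

  if (unit == text)             /* no digits at all */
    return 0;

  if (*unit == '\0')            /* plain number, no suffix */
    mult = 1UL;
  else if (unit[1] != '\0')     /* more than one suffix character */
    return 0;
  else if (resource == RLIMIT_CPU)
  {
    switch (*unit)
    {
      case 's': mult =    1UL; break;   /* seconds (assumed) */
      case 'm': mult =   60UL; break;   /* minutes           */
      case 'h': mult = 3600UL; break;   /* hours (assumed)   */
      default:  return 0;
    }
  }
  else
  {
    switch (*unit)
    {
      case 'k': mult = 1024UL;                   break; /* KB (assumed) */
      case 'm': mult = 1024UL * 1024UL;          break; /* MB           */
      case 'g': mult = 1024UL * 1024UL * 1024UL; break; /* GB (assumed) */
      default:  return 0;
    }
  }

  *out = val * mult;
  return 1;
}
```

With these rules, “10m” for cpu comes out as 600 and “20m” for data as 20971520, the same numbers the old configuration file carried in _System-CPU and _System-Mem.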

I think it's a pretty cool trick, but I can see why some might not like the idea of masking what amounts to a system call with what looks like a simple assignment, since it does have side effects outside of the simple assignment, but I like the way it looks, and it's a more “natural” or even “Luaish” way of specifying the intent of the code.
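
For comparison, here is roughly what the assignment hides, written as plain C without any Lua involved. The helper name limit_set_hard() is made up for this sketch; the body mirrors what the __newindex handler does once it has a resource and a value.

```c
#include <sys/resource.h>

/* Hypothetical helper: set both the soft and the hard limit of a
 * resource to the same value, as the handler above does.  Returns 0 on
 * success, or -1 with errno set if setrlimit() fails. */
int limit_set_hard(int resource, rlim_t value)
{
  struct rlimit limit;

  limit.rlim_cur = value;   /* soft limit */
  limit.rlim_max = value;   /* hard limit */

  return setrlimit(resource, &limit);
}
```

So process.limits.hard.core = 0 amounts to limit_set_hard(RLIMIT_CORE,0). Keep in mind that an unprivileged process can lower a hard limit but not raise it again afterwards.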

Now, on to the name of the module, org.conman.process. When I first started playing around with Lua I wrote a few modules that did similar operations as existing modules, with the same names. One example is syslog. There's an existing Lua syslog module, but I don't like how it works, so I wrote my own.

The problem now becomes, what if I want to use a module that uses the existing Lua syslog module, but the rest of my code uses mine? If they both have the same name, some code is going to get a nasty surprise. To work around that, I decided to put all my modules under a “namespace” that I control and that is not likely to cause any conflicts with any existing (or even future) modules. Thus, the org.conman namespace.
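
The dotted name also explains the name of the C function shown earlier. As far as I understand Lua's stock loader for C modules, it looks for a function called luaopen_ followed by the module name with each dot turned into an underscore. The sketch below just reproduces that mapping; luaopen_name() is a hypothetical helper, not part of Lua's API, and it ignores the special handling Lua gives to hyphens in module names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Lua's API): build the name of the C
 * entry point for a dotted module name, e.g. "org.conman.process"
 * becomes "luaopen_org_conman_process".  Returns 1 on success, 0 if the
 * result does not fit in dest. */
int luaopen_name(char *dest, size_t destsize, const char *modname)
{
  int   len = snprintf(dest, destsize, "luaopen_%s", modname);
  char *p;

  if (len < 0 || (size_t)len >= destsize)
    return 0;

  /* skip the 8 characters of "luaopen_" and replace each dot */
  for (p = dest + 8; *p != '\0'; p++)
    if (*p == '.')
      *p = '_';

  return 1;
}
```

That is why the module's entry point is luaopen_org_conman_process() while the Lua code asks for org.conman.process.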

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.