The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Sunday, April 18, 2010

Off to the races

I mentioned briefly yesterday about the issue I was having with syslogintr while booting the computer. On my system, the system would hang just after loading syslogintr. I tracked it down to initlog hanging. Further investigation revealed that both syslogintr and initlog were hanging, but the significance of that escaped me until an epiphany I had while sleeping: I was experiencing yet another race condition!

A quick test today proved that yes, it was a race condition. A particularly nasty race condition too, since once again, I wasn't explicitly writing multi-threaded code.

syslogintr creates a local socket (/dev/log for those that who are curious) and then waits to receive logging messages sent to said socket, something like:

local = socket(...)

while(!interrupted)
{
  read(socket,buffer,sizeof(buffer));
  process(buffer);
}

But in the process of processing the incoming message, syslogintr may itself call syslog():

while(!interrupted)
{
  read(socket,buffer,sizeof(buffer));
  process(buffer);
  syslog(LOG_DEBUG,"processed message");
}

syslog() (which is part of the standard library under Unix) sends the message to the local socket (/dev/log). The data is queued up in the socket, but it's okay because it'll cycle around quickly enough to pick up the new data. Unless there's too much data already queued in the local socket, at which point whoever calls syslog() will block until the backlogged data in the local socket is dealt with.

The startup script (/etc/init.d/syslog for those of you following along at home) starts both syslogintr and klogd. klogd is a program that pulls the data from the kernel logging queue (logging messages the kernel itself generates, but the kernel can't use /dev/log as that's just a Unix convention, not something enforced by the kernel itself) and logs that data via syslog(). And by the time klogd starts up, there's quite a bit of logging data generated by the kernel. So that data gets blasted at syslogintr (and in the process, so much data is being sent that klogd is blocked from running). But syslogintr is still coming up to speed and generating a bunch of internal messages and suddenly, its calls to syslog() are blocking, thus causing a deadlock:

while(!interrupted)
{
  read(socket,buffer,sizeof(buffer));
  process(buffer);	/* this takes some time */

  /*--------------------------------------------------
  ; meanwhile, something like klogd could be blasting
  ; data to the local socket, filling it up, thus when
  ; we get to:
  ;---------------------------------------------------*/

  syslog(LOG_DEBUG,"processed message");

  /*-------------------------------------------------
  ; the call to syslog() blocks, thus blocking the
  ; program until something else (in this case, *us*)
  ; drains the data waiting in the socket.  But we
  ; can't drain the data because we're waiting (via
  ; syslog()) for the data to be drained!
  ;
  ; Can you say, *deadlock* boys and girls?  I knew
  ; you could.
  ;--------------------------------------------------*/
}

This also explains why it only happened when booting—because that's about the only time so much data is pushed to syslogintr that socket operations (reading, writing) are blocked. It also explains why I haven't seen it on any other system I'm running it on, since those systems don't run klogd (being virtual hosts, they don't have klogd).

If you've ever wondered why software tends to crash all the time, it's odd interactions like this that are the cause (and this was an easy problem to diagnose, all things considered).

So now I internally queue any logging messages and handle them in the main loop, something along the lines of:

while(!interrupted)
{
  foreach msg in queued_messages
    process(msg);

  read(socket,buffer,sizeof(buffer));
  process(buffer);
  queuelog(LOG_DEUBG,"processed message");
}

Obligatory Picture

An abstract representation of where you're coming from]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.