The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Tuesday, February 09, 2010

Syslogintr—a syslogd replacement in C and Lua

So, why did I decide to write my own syslogd? Well, it was started as something to do while at Bunny's mom's house. I was curious as to the actual protocol used by syslogd because it's always bugged me that the files written by syslogd bear no information about the facility or priority a message originally had. Sure, you can filter by the facility and priority, but the resulting files lose that information.

It was after reading RFC-3164 that I realized it might be possible to filter on more than just the facility and priority. Have Lua handle the logic of log analyzation and I had what I thought was a fun little project.

Heh. Little did I know.

Anyway, it's been in use for a little over three months (and 140 commits to the source repository) and I must say, I'm finding it very useful. Sure, I could have written a Lua script to parse the log files that the traditional syslogd writes, but I'm just cutting out the middleman filesystem.

So the C code accepts, parses and hands a well-structured block of information to the Lua code, and it's the Lua code where all the features are implemented.

The information to Lua is passed in a table (which is Lua for an associative array, or hash table) with the following fields:

Formatted syslog message as passed to Lua
nametypecomments
nametypecomments
versionnumberThis is the version of the syslog protocol, of which there are two—the documented version 1 which I haven't seen in the wild yet, and the conventional version which I've internally labelled as “version 0”. I currently only support version 0, so this field (as of this writing) is always 0.
_RAWstringThis is the actual raw message as received. It was intended as a debugging aid, but I keep it enabled as it is useful when you get the odd message.
remotebooleantrue if the message came in over the network, false if it was generated locally on the machine. I use this as a quick test.
hoststringThis is really the source of the message. For a remote syslog message, this will contain the IPv4 or IPv6 address. For messages generated locally, this will be the “local socket” (formerly known as “Unix sockets”) the message was written to (which on most systems is /dev/log).
relaystringUsually this will be the same as the host field, but when syslog messages are being relayed, this is the intermediate system that relayed the message on. If system A generated a message to B, and B relayed it to C, then the host field will contain “A”, but the relay field will contain “B”. If, however, it goes A → B → C → D, then host will contain “A” and relay will have “C” (in other words—the relay field will only contain the last relay machine).
portnumberThe UDP port number the syslog message was sent from. If the message was sent locally, this will be -1.
timestampnumberThis is the timestamp the message as it was received on the accepting host. This value is suitable for use in the Lua functions os.date() and os.difftime().
logtimestampnumberThis is the timestamp as found in the syslog message (if there was one). Otherwise, it will be the same as timestamp.
programstringThis is really the tag portion, but since it's mostly the name of the program that generated the syslog message, that's what I called this field. If it wasn't given, the value of this field will be “”' (an empty string, not nil).
pidnumberThe process ID of the program that sent the message. Most Unix syslogd implementations have this as part of the tag, and if found, is set here. Otherwise, it will be 0.
facilitystringThe facility, which will be one of the following:
  • kernel
  • user
  • mail
  • daemon
  • auth1
  • syslog
  • lpr
  • news
  • uucp
  • cron1
  • auth2 (also known as “authpriv”)
  • ftp
  • ntp
  • auth3
  • auth4
  • cron2
  • local0 through local7
levelstringThis is the priority of the message, but the term “priority” never made much sense, so I call it “level.” This will be one of eight values:
  • emerg
  • alert
  • crit
  • err
  • warn
  • notice
  • info
  • debug
msgstringThe actual message, or basically, whatever is left after parsing everything else. This is pretty much a free-format string.

This table is passed to a function called “log()” to handle however. On my development system, this function is simply:

function log(msg)
  writelog(msg)
  sshd(msg)
end

writelog() just logs the message to a single huge logfile (that records the host, program, facility, level and msg fields). It's the sshd() function that's interesting:

if blocked == nil then
  blocked = {}
end

function sshd(msg)
  if msg.remote   == true    then return end
  if msg.program  ~= "sshd"  then return end
  if msg.facility ~= "auth2" then return end
  if msg.level    ~= "info"  then return end

  local ip = string.match(msg.msg,"^Failed password for .* from ::ffff:([%d%.]+) .*");
  if ip == nil then return end

  I_log("debug","Found IP:" .. ip)

  if blocked[ip] == nil then
    blocked[ip] = 1
  else
    blocked[ip] = blocked[ip] + 1
  end

  if blocked[ip] == 5 then
    local cmd = "iptables --table filter --append INPUT --source " .. ip .. " --proto tcp --dport 22 --jump REJECT"
    I_log("debug","Command to block: " .. cmd)    
    os.execute(cmd)    
    I_log("info","Blocked " .. ip .. " from SSH")
    table.insert(blocked,{ ip = ip , when = msg.timestamp} )
  end
end

This checks for local messages from sshd and if it finds five (or more) consecutive failed attempts, adds the IP address of the offending party to the firewall, and logs the attempt (via I_log()). Nothing that a lot of intrusion scripts don't already do, but this is in the syslog daemon, not another process reading this second hand through a file (and without the issues that come up with log rotation).

Now, in order to keep the firewall from filling up, I added yet another feature—the ability to periodically run a Lua function. So, elsewhere in the script I have:

alarm("60m")

function alarm_handler()
  I_log("debug","Alarm clock");
  if #blocked == 0 then
    I_log("debug","Alarm clock---snooze button!")
    return
  end

  local now = os.time()

  I_log("debug",string.format("About to remove blocks (%d left)",#blocked))

  while #blocked > 0 do
    if now - blocked[1].when < 3600 then return end
    local ip = blocked[1].ip
    I_log("info","Removing IP block: " .. ip)      	
    blocked[ip] = nil
    table.remove(blocked,1)
    os.execute("iptables --table filter -D INPUT 1")
  end
end

alarm() informs the C code to call the function alarm_handler() every 60 minutes (if you give alarm() a numeric value, it takes that as a number of seconds between calling the function, if a string, it expects a numeric value followed immediately by “m” for minutes, “h” for hours (so I could have specified “1h” in this case) and “d” for days). The function alarm_handler() then removes the blocks one at a time (once per hour) which is long enough for the scanner to move on.

Now, you may have noticed this bit of odd code:

if blocked == nil then
  blocked = {}
end

One feature of the C code is that if it receives a SIGUSR1, the Lua script will be reloaded. This is to ensure that if there is any IP blocks defined, I don't lose any when the script is reloaded (and yes, that is a handy feature—change the Lua code a bit, and have the daemon reload the script without having to restart the entire daemon).

Along those lines, if the C code receives a SIGHUP, calls the Lua function reload_signal(), which allows the script the ability to close and reopen any logfiles it might be using (thus being compatible with the current syslogd behavior) which is useful for rotating the logs and what not.

I'm also running this on my own server (brevard.conman.org, which runs this blog) and also one of our monitoring servers at The Company (this one not only runs Nagios, Cacti and snmptrapd, but all our routers send their syslog information to this server). The Lua code I have running on the monitoring server not only keeps logging in the same files as the old syslogd (in the same format as well), but also this little bit of code:

function check_ospf(msg)
  if msg.facility ~= 'local1' then
    return
  end

  if string.match(msg.msg,".*(OSPF%-5%-ADJCHG.*Neighbor Down).*") then
    send_emergency_email("sean@conman.org",msg)
    send_emergency_email("XXXXXXXXXXXXXXXXX",msg)
  elseif string.match(msg.msg,".*(OSPF%-5%-ADJCHG.*LOADING to FULL).*") then
    send_okay_email("sean@conman.org",msg)
    send_okay_email("XXXXXXXXXXXXXXXXX",msg)
  end
end

If the routing on our network changes, I get email notification of the event. I also have some code that processes the logs from Postfix. Postfix generates “thin entries,” five entries per email (from various subsystems) when what I really want logged are “fat entries” (which summarize the status of the email in one log line). So, I wrote some code to catch all five lines per email, and then just log a one line summary, thus changing a bunch of “thin entries” to one “fat entry” (what I really want to do is send all mail related logs on all our servers to one central location, so we no longer have to check two or three servers everytime one of our customers complains about email being “too slow” but that's for another show entry).

The Lua code on my personal server also does the Postfix “thin-to-fat” conversion, but it also logs the webserver status every hour (just because I can).

And if you're wondering why I've blathered on and on about a piece of code I've yet to release, it's because I've yet to write any documentation on the darned thing, and I figure this would be a decent first pass at some documentation. And maybe to gauge interest in the project.

Obligatory Picture

[It's the most wonderful time of the year!]

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: http://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

http://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2019 by Sean Conner. All Rights Reserved.