The Boston Diaries

Tuesday, February 09, 2010

Syslogintr—a syslogd replacement in C and Lua

So, why did I decide to write my own syslogd? Well, it was started as something to do while at Bunny's mom's house. I was curious as to the actual protocol used by syslogd because it's always bugged me that the files written by syslogd bear no information about the facility or priority a message originally had. Sure, you can filter by the facility and priority, but the resulting files lose that information.

It was after reading RFC-3164 that I realized it might be possible to filter on more than just the facility and priority. Have Lua handle the logic of log analyzation and I had what I thought was a fun little project.

Heh. Little did I know.

Anyway, it's been in use for a little over three months (and 140 commits to the source repository) and I must say, I'm finding it very useful. Sure, I could have written a Lua script to parse the log files that the traditional syslogd writes, but I'm just cutting out the ~~middleman~~ filesystem.

So the C code accepts, parses and hands a well-structured block of information to the Lua code, and it's the Lua code where all the features are implemented.

The information to Lua is passed in a table (which is Lua for an associative array, or hash table) with the following fields:

Formatted syslog message as passed to Lua
name	type	comments
name	type	comments
version	number	This is the version of the syslog protocol, of which there are two—the documented version 1 which I haven't seen in the wild yet, and the conventional version which I've internally labelled as “version 0”. I currently only support version 0, so this field (as of this writing) is always 0.
_RAW	string	This is the actual raw message as received. It was intended as a debugging aid, but I keep it enabled as it is useful when you get the odd message.
remote	boolean	`true` if the message came in over the network, `false` if it was generated locally on the machine. I use this as a quick test.
host	string	This is really the source of the message. For a remote syslog message, this will contain the IPv4 or IPv6 address. For messages generated locally, this will be the “local socket” (formerly known as “Unix sockets”) the message was written to (which on most systems is `/dev/log`).
relay	string	Usually this will be the same as the `host` field, but when syslog messages are being relayed, this is the intermediate system that relayed the message on. If system A generated a message to B, and B relayed it to C, then the `host` field will contain “A”, but the `relay` field will contain “B”. If, however, it goes A → B → C → D, then `host` will contain “A” and `relay` will have “C” (in other words—the `relay` field will only contain the last relay machine).
port	number	The UDP port number the syslog message was sent from. If the message was sent locally, this will be -1.
timestamp	number	This is the timestamp the message as it was received on the accepting host. This value is suitable for use in the Lua functions `os.date()` and `os.difftime()`.
logtimestamp	number	This is the timestamp as found in the syslog message (if there was one). Otherwise, it will be the same as `timestamp`.
program	string	This is really the tag portion, but since it's mostly the name of the program that generated the syslog message, that's what I called this field. If it wasn't given, the value of this field will be “”' (an empty string, not `nil`).
pid	number	The process ID of the program that sent the message. Most Unix `syslogd` implementations have this as part of the tag, and if found, is set here. Otherwise, it will be 0.
facility	string	The facility, which will be one of the following: kernel user mail daemon auth1 syslog lpr news uucp cron1 auth2 (also known as “authpriv”) ftp ntp auth3 auth4 cron2 local0 through local7
level	string	This is the priority of the message, but the term “priority” never made much sense, so I call it “level.” This will be one of eight values: emerg alert crit err warn notice info debug
msg	string	The actual message, or basically, whatever is left after parsing everything else. This is pretty much a free-format string.

This table is passed to a function called “log()” to handle however. On my development system, this function is simply:

function log(msg)
  writelog(msg)
  sshd(msg)
end

writelog() just logs the message to a single huge logfile (that records the host, program, facility, level and msg fields). It's the sshd() function that's interesting:

if blocked == nil then
  blocked = {}
end

function sshd(msg)
  if msg.remote   == true    then return end
  if msg.program  ~= "sshd"  then return end
  if msg.facility ~= "auth2" then return end
  if msg.level    ~= "info"  then return end

  local ip = string.match(msg.msg,"^Failed password for .* from ::ffff:([%d%.]+) .*");
  if ip == nil then return end

  I_log("debug","Found IP:" .. ip)

  if blocked[ip] == nil then
    blocked[ip] = 1
  else
    blocked[ip] = blocked[ip] + 1
  end

  if blocked[ip] == 5 then
    local cmd = "iptables --table filter --append INPUT --source " .. ip .. " --proto tcp --dport 22 --jump REJECT"
    I_log("debug","Command to block: " .. cmd)    
    os.execute(cmd)    
    I_log("info","Blocked " .. ip .. " from SSH")
    table.insert(blocked,{ ip = ip , when = msg.timestamp} )
  end
end

This checks for local messages from sshd and if it finds five (or more) consecutive failed attempts, adds the IP address of the offending party to the firewall, and logs the attempt (via I_log()). Nothing that a lot of intrusion scripts don't already do, but this is in the syslog daemon, not another process reading this second hand through a file (and without the issues that come up with log rotation).

Now, in order to keep the firewall from filling up, I added yet another feature—the ability to periodically run a Lua function. So, elsewhere in the script I have:

alarm("60m")

function alarm_handler()
  I_log("debug","Alarm clock");
  if #blocked == 0 then
    I_log("debug","Alarm clock---snooze button!")
    return
  end

  local now = os.time()

  I_log("debug",string.format("About to remove blocks (%d left)",#blocked))

  while #blocked > 0 do
    if now - blocked[1].when < 3600 then return end
    local ip = blocked[1].ip
    I_log("info","Removing IP block: " .. ip)      	
    blocked[ip] = nil
    table.remove(blocked,1)
    os.execute("iptables --table filter -D INPUT 1")
  end
end

alarm() informs the C code to call the function alarm_handler() every 60 minutes (if you give alarm() a numeric value, it takes that as a number of seconds between calling the function, if a string, it expects a numeric value followed immediately by “m” for minutes, “h” for hours (so I could have specified “1h” in this case) and “d” for days). The function alarm_handler() then removes the blocks one at a time (once per hour) which is long enough for the scanner to move on.

Now, you may have noticed this bit of odd code:

if blocked == nil then
  blocked = {}
end

One feature of the C code is that if it receives a SIGUSR1, the Lua script will be reloaded. This is to ensure that if there is any IP blocks defined, I don't lose any when the script is reloaded (and yes, that is a handy feature—change the Lua code a bit, and have the daemon reload the script without having to restart the entire daemon).

Along those lines, if the C code receives a SIGHUP, calls the Lua function reload_signal(), which allows the script the ability to close and reopen any logfiles it might be using (thus being compatible with the current syslogd behavior) which is useful for rotating the logs and what not.

I'm also running this on my own server (brevard.conman.org, which runs this blog) and also one of our monitoring servers at The Company (this one not only runs Nagios, Cacti and snmptrapd, but all our routers send their syslog information to this server). The Lua code I have running on the monitoring server not only keeps logging in the same files as the old syslogd (in the same format as well), but also this little bit of code:

function check_ospf(msg)
  if msg.facility ~= 'local1' then
    return
  end

  if string.match(msg.msg,".*(OSPF%-5%-ADJCHG.*Neighbor Down).*") then
    send_emergency_email("sean@conman.org",msg)
    send_emergency_email("XXXXXXXXXXXXXXXXX",msg)
  elseif string.match(msg.msg,".*(OSPF%-5%-ADJCHG.*LOADING to FULL).*") then
    send_okay_email("sean@conman.org",msg)
    send_okay_email("XXXXXXXXXXXXXXXXX",msg)
  end
end

If the routing on our network changes, I get email notification of the event. I also have some code that processes the logs from Postfix. Postfix generates “thin entries,” five entries per email (from various subsystems) when what I really want logged are “fat entries” (which summarize the status of the email in one log line). So, I wrote some code to catch all five lines per email, and then just log a one line summary, thus changing a bunch of “thin entries” to one “fat entry” (what I really want to do is send all mail related logs on all our servers to one central location, so we no longer have to check two or three servers everytime one of our customers complains about email being “too slow” but that's for another ~~show~~ entry).

The Lua code on my personal server also does the Postfix “thin-to-fat” conversion, but it also logs the webserver status every hour (just because I can).

And if you're wondering why I've blathered on and on about a piece of code I've yet to release, it's because I've yet to write any documentation on the darned thing, and I figure this would be a decent first pass at some documentation. And maybe to gauge interest in the project.

Tuesday, February 09, 2010

Syslogintr—a syslogd replacement in C and Lua

Obligatory Picture

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

Obligatory AI Disclaimer