Thursday, April 01, 2010
Notes about the past few weeks
So.
What happened?
Well …
After the trip to Orlando for a surprise birthday party, there was another Ruby users group meeting, immediately after which I came down with my third cold of the year (seriously, since January—either that, or it's been the same cold for three months; it's hard to say).
A week later I (along with Bunny and Wlofie) attended a special viewing of the works of M. C. Escher at the Boca Raton Museum of Art—a guided tour by the owner of the collection. Being a fan of M. C. Escher, I got a lot out of the guided tour, saw a bunch of works I'd never seen before, and got to see some of the actual wood cuts and stones used to make his prints (and as the owner kept saying, any one of a number of the works would have been enough to get Escher's name into history, so incredible were his wood engraving skills). I do want to go back and take my time viewing the exhibit (since the tour was rather quick—less than two hours, with over 60 people following the owner around the gallery).
A week after that and I'm just now getting over the cold (clinic, antibiotics, blah blah).
And … well … I just haven't felt like writing all that much [You don't say? —Editor] [Shut up. —Sean], but I have been doing this for ten years now … perhaps it's time to close up shop.
Saturday, April 03, 2010
I'll only upgrade software if there's a compelling reason to, and for me, mod_lua is a compelling reason to upgrade Apache
Nah, it's not quite time to close up shop … (so much for my April Fool's joke this year—most people missed the style changes I did for several years running, but a) most people read the entries here via the newsfeed, so the visual change in layout was always lost on them, and b) I never did find that round toit I needed to change the style—anyway, I digress).
I've been looking a bit deeper into Drupal these past few days (seeing how I'm scheduled to give a repeat of my talk at the new West Palm Beach Drupal users group this month—I'm giving a lot of presentations this year, it seems) and trying to get into the whole PHP framework thing, and well … as a diversion, I thought it might be interesting to see what type of web-based framework one could do in Lua, and why not attempt it using mod_lua?
Well, the fact that I linked to the svn repository should say something about the stability of mod_lua—it ain't. It's only currently available for the latest development version of Apache, there's no documentation (except for the source code) and only a smattering of example code to guide the intrepid. It's also not terribly reassuring that it hasn't been worked on for a few months.
That didn't stop me from trying it though. I spent a few hours debugging the module, enough for it to pass the few tests available, and hopefully the Apache team will accept the patch (a call to memset() to initialize a structure to a known value before use).
Now that it doesn't crash, it does appear to be quite nice, allowing the same access that any Apache module in C would have, and it looks like one could effectively replace a few of the murkier modules (like mod_rewrite) with a more straightforward Lua implementation. My initial thoughts are to reimplement mod_litbook (which currently only works for Apache 1.3.x) using mod_lua as a test case (and heck—maybe even upgrade the existing mod_litbook to Apache 2.x so I won't have to keep running an Apache 1.3 instance just for one portion of my website).
Sunday, April 04, 2010
I can haz Easter Bunny. I eated it.
Tuesday, April 06, 2010
Client certificates in Apache
I've been spending an inordinate amount of time playing around with Apache, starting with mod_lua, which led me to reconfigure both Apache 2.0.52 (which came installed by default) and Apache 2.3.5 (compiled from source, because mod_lua is only available for Apache 2.3) so they could run at the same time. This led to using IPv6, because I have almost two dozen “sites” running locally (and as I've found, it's just as easy to use IPv6 addresses as it is IPv4 addresses, although the DNS PTR records get a little silly).
This in turn led to installing more secure sites locally, because I can (using TinyCA makes it trivial actually), and this led to a revamp of my secure site (note: the link takes you to an unsecure page—the actual secure site uses a certificate signed by my “certificate authority”, which means you'll get a warning, which can be avoided by installing the certificate from the unsecure site). And from there, I learned a bit more about authenticating with client certificates. Specifically, isolating certain pages to just individual users.
So, to configure client side certificates, you need to create a client certificate (easy with TinyCA as it's an option when signing a request) and install it in the browser. You then need to install the certificate authority certificate so that Apache can use it to authenticate against the client certificate (um … yeah). In the Apache configuration file, just add:
SSLCACertificateFile /path/to/ca.crt
Then add the appropriate mod_ssl options to the secure site (client-side authentication only works with secure connections). For example, here's my configuration:
<VirtualHost 66.252.224.242:443>
  ServerName secure.conman.org
  DocumentRoot /home/spc/web/sites/secure.conman.org/s-htdocs

  # ...

  <Directory /home/spc/web/sites/secure.conman.org/s-htdocs/library>
    SSLRequireSSL
    SSLRequire %{SSL_CLIENT_S_DN_O}  eq "Conman Laboratories" \
           and %{SSL_CLIENT_S_DN_OU} eq "Clients"
    SSLVerifyClient require
    SSLVerifyDepth  5
  </Directory>
</VirtualHost>
And in order to protect a single file with more stringent controls (and here for example, is my bookmarks file):
<VirtualHost 66.252.224.242:443>
  # ...

  <Location /library/bookmarks.html>
    SSLRequireSSL
    SSLRequire %{SSL_CLIENT_S_DN_O}  eq "Conman Laboratories" \
           and %{SSL_CLIENT_S_DN_CN} eq "Sean Conner"
    SSLVerifyClient require
    SSLVerifyDepth  5
  </Location>
</VirtualHost>
The <Files> directive in Apache didn't work—I suspect because the <Directory> directive is processed first and it allows anybody from the unit “Clients” access, and thus any <Files> directives are ignored, whereas <Location> directives are processed before <Directory> directives, and thus anyone not me is denied access to my bookmarks.
Now, I just need to figure out what to do about some recent updates to Apache, since I have some “old/existing clients” to support (namely, Firefox 2 on my Mac, which I can't upgrade because I'm stuck at 10.3.9 on the system, because the DVD player is borked … )
IF IT AIN'T BROKE DON'T FIX IT!!!!!!!!!
Sigh.
I can fix the client certificate issue if I install the latest Apache 2.2, which has the SSLInsecureRenegotiation option, but that requires OpenSSL 0.9.8m or higher (and all this crap because of a small bug in OpenSSL). So, before mucking with my primary server, I decide to test this all out on my home computer (running the same distribution of Linux as my server).
Well, I notice that OpenSSL just came out with version 1.0.0, so I decide to snag that version. Download, config (what? No configure still?), make and make install, watch it go into the wrong location (XXXXXX I wanted it in /usr/local/lib/, not /usr/local/openssl/lib!), rerun config with other options and get it where I want it.
Okay.
And hey, while I'm here, might as well download the latest OpenSSH and get that working. I nuke the existing OpenSSH installation (yum remove openssh) since I won't need it, and start the configure, make and make install, but the configure script bitches about the version of zlib installed (XXXX! I know RedHat is conservative about using the latest and greatest, but come on! It's been five years since version 1.2.3 came out! Sheesh!), so before I can continue, I must do the download, configure, make and make install dance for zlib. Once that is out of the way …
checking OpenSSL header version... 1000000f (OpenSSL 1.0.0 29 Mar 2010)
checking OpenSSL library version... 90701f (OpenSSL 0.9.7a Feb 19 2003)
checking whether OpenSSL's headers match the library... no
configure: error: Your OpenSSL headers do not match your library.
Check config.log for details.
If you are sure your installation is consistent, you can disable the check
by running "./configure --without-openssl-header-check".
Also see contrib/findssl.sh for help identifying header/library mismatches.
Oh XXXXXX XXXX … IT'S IN /usr/local/lib YOU USELESS SCRIPT!
But alas, no amount of options or environment variables work. And no, while I might be willing to debug mod_lua, I am not about to debug a 31,000 line shell script. Might as well reinstall the OpenSSH package …
[root]lucy:~>yum install openssh
Setting up Install Process
Setting up repositories
Segmentation fault (core dumped)
Um … what?
[root]lucy:~>yum install openssh
Setting up Install Process
Setting up repositories
Segmentation fault (core dumped)
What the XXXX?
Oh please oh please oh please don't tell me that yum just assumes you have OpenSSH installed …
Okay, where is this program dying?
[root]lucy:/tmp>gdb /usr/bin/yum core.3783
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"..."/usr/bin/yum": not in executable format: File format not recognized

Core was generated by `/usr/bin/python /usr/bin/yum search zlib'.
Program terminated with signal 11, Segmentation fault.
#0  0x007ff3a3 in ?? ()
(gdb)
Oh … it's Python.
Um … wait a second …
It's … Python! It's a script!
WHAT THE XXXX?
What did I do to cause the Python interpreter to crash?
Aaaaaaaaaaaaaaaaaaaaaaaaaah!
Okay, I managed to find some RPMs of OpenSSH to install. That didn't fix yum.
Okay, don't panic.
Obviously, it's something I've done that caused this.
The only thing I've done is install new libraries in /usr/local/lib.
Okay, keep any programs from loading up anything from /usr/local/lib. That's easy enough—I just edited /etc/ld.so.conf to remove that directory, and ran ldconfig. Try it again.
Okay, yum works!
And through a process of elimination, I found the culprit—zlib! Apparently, the version of Python I have doesn't like zlib 1.2.4.
Sheesh!
Okay, yes, I bring this upon myself for not running the latest and greatest. I don't update continuously because that way lies madness—things just breaking (in fact, the last thing I did upgrade, which was OpenSSL on my webserver the other day, broke functionality I was using, which prompted this whole mess in the first place!). At least I was able to back out the changes I made, but I have to keep this in mind:
IF IT AIN'T BROKE DON'T FIX IT!!!!!
Write Apache modules quickly in Lua
I really like mod_lua, even in its alpha state. In less than five minutes I had a webpage that would display a different quote each time it was referenced. I was able to modify the Lua-based qotd, changing:
QUOTESFILE = "/home/spc/quotes/quotes.txt"

quotes = {}

do
  local eoln = "\r\n"
  local f    = io.open(QUOTESFILE,"r")
  local s    = ""

  for line in f:lines() do
    if line == "" then
      -- each quote is separated by a blank line
      if #s < 512 then
        table.insert(quotes,s)
      end
      s = ""
    else
      s = s .. line .. eoln
    end
  end
  f:close()
end

math.randomseed(os.time())

function main(socket)
  socket:write(quotes[math.random(#quotes)])
end
to
QUOTESFILE = "/home/spc/quotes/quotes.txt"

quotes = {}

do
  local eoln = "\r\n"
  local f    = io.open(QUOTESFILE,"r")
  local s    = ""

  for line in f:lines() do
    if line == "" then
      -- each quote is separated by a blank line
      if #s < 512 then
        table.insert(quotes,s)
      end
      s = ""
    else
      s = s .. line .. eoln
    end
  end
  f:close()
end

math.randomseed(os.time())

function handler(r)
  r.content_type = "text/plain"
  r:puts(quotes[math.random(#quotes)])
end
(as you can see, it didn't take much), and adding
LuaMapHandler /quote.html /home/spc/web/lua/lib/quote.lua
to the site configuration (what you don't see is the only other line you need, LuaRoot), and after reloading Apache, I now have a webpage backed by Lua.
And from there, it isn't much work to add some HTML to the output, but it should be clear that writing Apache modules in Lua isn't that hard.
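For instance, here's a minimal sketch of what wrapping the output in HTML might look like. This isn't code I'm actually running; it's just the quote handler from above with some markup around the output (it assumes the same QUOTESFILE loading code and quotes table, and uses only the r.content_type and r:puts calls shown above):

-- hypothetical sketch, not the handler I actually run: serve the random
-- quote as a small HTML page instead of plain text (assumes the same
-- QUOTESFILE loading code and quotes table from above)

local function escape(s)
  return (s:gsub("[&<>]",{ ["&"] = "&amp;" , ["<"] = "&lt;" , [">"] = "&gt;" }))
end

function handler(r)
  r.content_type = "text/html"
  r:puts("<html><head><title>Quote of the Moment</title></head><body>\n")
  r:puts("<blockquote><pre>\n")
  r:puts(escape(quotes[math.random(#quotes)]))
  r:puts("</pre></blockquote>\n")
  r:puts("</body></html>\n")
end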
What did take me by surprise is that there's no real way to do the heavy initialization just once. That bit of reading in the quotes file? It's actually done for every request—mod_lua just compiles the code, keeps the compiled version cached, and for each request runs the compiled code. It'd be nice if there were a way to do some persistent initialization once (a feature I use in the current mod_litbook), but as written, mod_lua doesn't have support for that.
I also haven't seen any action on my bug report—not a good sign.
I'm wondering if I might not have to pick up the mod_lua ball and run with it …
Wednesday, April 07, 2010
Dependencies and side effects
“Well, that's your problem,” I said, looking at my computer sitting there, powered off. Bunny had been unable to check her bank account and thinking our network connection was bad, powercycled the router and DSL unit, but was still unable to connect. The real issue was my computer being off—all the computers here at Chez Boca use my computer for DNS resolution.
“Yeah, something happened with the power, but I'm not sure what,” said Bunny. And yes, it was odd; none of the clocks were blinking “12:00” and from what it sounds like, she didn't hear any UPS alarms (I think she would have mentioned those if she did hear them), so something odd did happen. And later (to get ahead of the story a bit) when I did check the UPS logs, I found (output from syslogintr):
/dev/log | apcupsd | daemon info | Apr 04 18:23:58 | 000.0,000.0,000.0,27.10,00.00,40.0,00.0,000.0,000.0,122.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 18:29:00 | 000.0,000.0,000.0,26.60,00.00,40.0,00.0,000.0,000.0,120.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 18:34:02 | 000.0,000.0,000.0,26.43,00.00,40.0,00.0,000.0,000.0,120.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 18:39:05 | 000.0,000.0,000.0,26.27,00.00,40.0,00.0,000.0,000.0,120.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 18:44:07 | 000.0,000.0,000.0,26.27,00.00,40.0,00.0,000.0,000.0,122.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 18:49:10 | 000.0,000.0,000.0,26.27,00.00,40.0,00.0,000.0,000.0,121.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 18:54:12 | 000.0,000.0,000.0,26.27,00.00,46.0,00.0,000.0,000.0,120.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 18:59:15 | 000.0,000.0,000.0,26.27,00.00,45.0,00.0,000.0,000.0,120.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 19:04:17 | 000.0,000.0,000.0,26.27,00.00,46.0,00.0,000.0,000.0,120.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 19:09:20 | 000.0,000.0,000.0,26.10,00.00,46.0,00.0,000.0,000.0,122.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 19:14:22 | 000.0,000.0,000.0,26.10,00.00,46.0,00.0,000.0,000.0,122.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 19:19:24 | 000.0,000.0,000.0,26.10,00.00,46.0,00.0,000.0,000.0,121.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 19:24:27 | 000.0,000.0,000.0,26.10,00.00,46.0,00.0,000.0,000.0,121.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 19:29:29 | 000.0,000.0,000.0,26.10,00.00,46.0,00.0,000.0,000.0,122.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 19:34:32 | 000.0,000.0,000.0,26.10,00.00,40.0,00.0,000.0,000.0,122.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 19:39:34 | 000.0,000.0,000.0,26.10,00.00,40.0,00.0,000.0,000.0,121.0,100.0,1
/dev/log | apcupsd | daemon info | Apr 04 19:44:37 | 000.0,000.0,000.0,26.10,00.00,40.0,00.0,000.0,000.0,122.0,100.0,0
/dev/log | apcupsd | daemon info | Apr 04 19:50:36 | 000.0,000.0,000.0,26.10,00.00,40.0,00.0,000.0,000.0,121.0,099.0,1
/dev/log | apcupsd | daemon info | Apr 04 19:55:53 | 000.0,000.0,000.0,25.94,00.00,40.0,00.0,000.0,000.0,121.0,098.0,0
/dev/log | apcupsd | daemon info | Apr 04 20:00:56 | 000.0,000.0,000.0,25.94,00.00,40.0,00.0,000.0,000.0,122.0,098.0,1
So, starting just past 18:23:58 on April 4th, something odd was happening to my UPS (which all the computers in The Home Office are hooked up to), causing the battery voltage (normally 27.1 V) and battery charge (normally 100 percent) to drop. And those values kept dropping until:
/dev/log | apcupsd | daemon info | Apr 07 12:31:05 | 000.0,000.0,000.0,25.60,00.00,36.0,00.0,000.0,000.0,119.0,084.0,0
/dev/log | apcupsd | daemon info | Apr 07 12:36:08 | 000.0,000.0,000.0,25.60,00.00,36.0,00.0,000.0,000.0,121.0,084.0,1
/dev/log | apcupsd | daemon info | Apr 07 12:41:10 | 000.0,000.0,000.0,25.60,00.00,36.0,00.0,000.0,000.0,119.0,084.0,0
The battery voltage fell to 25.6 V and the battery charge to 84%, and past that … well, nothing, because the computers lost power at that point (or anywhere up until 12:46:12, when the next log entry should have appeared). So no real clues as to what happened with the power, but I digress—back to the story.
I hit the power button on my computer, it bitches about the disk being corrupt (which I ignore, since I'm running a journaled filesystem), and when it gets to starting up syslogd (which is in reality my syslogintr) and klogd (which logs messages from the kernel to syslog()), it hangs.
Hmm, I thought. Perhaps I better let fsck run, just to be safe. Powercycle, hit 'Y' to let fsck run for about fifteen minutes, then watch as it hangs yet again on syslogd/klogd.
Now, I won't bore you with the next few hours (which basically involved continuously booting into single user mode and trying to puzzle out what was going wrong), but in the end, the problem ended up being syslogintr.
Or rather, not with the actual C code in syslogintr, but with the Lua script it was running. It actually had to do with blocking ssh attempts via iptables. See, syslogd/klogd start up before the network is initialized (and by extension, iptables), and apparently, running iptables before the network and klogd are running really messes up the boot process in some odd way that locks the system up (not that this is the first time I've seen such weird interactions before—back in college I learned the hard way that under the right circumstances (which happened all too often) screen and IRC under Irix 4.0.5 would cause a kernel panic, thus stopping the computer cold).
Once I figured that out, it was a rather simple process to remove the ssh blocking feature from the script, so now I'm stuck with either working around a weird dependency issue, or just removing the ssh blocking code entirely from syslogintr (or at least from the Lua script it runs).
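For the record, the blocking logic is conceptually along these lines. This is a simplified, hypothetical sketch rather than the actual script syslogintr runs (the log-line pattern and the threshold here are made up for illustration), but it shows why the script ends up shelling out to iptables at all:

-- hypothetical sketch of the ssh blocking idea (not the real script):
-- count failed ssh logins per source address, and after too many,
-- shell out to iptables to drop further traffic from that address

local attempts     = {}   -- failed-login count, keyed by IP address
local MAX_ATTEMPTS = 5

local function check_ssh(msg)
  local ip = msg:match("Failed password .* from (%d+%.%d+%.%d+%.%d+)")
  if ip == nil then
    return
  end

  attempts[ip] = (attempts[ip] or 0) + 1

  if attempts[ip] == MAX_ATTEMPTS then
    -- this is the call that wedges the boot if it runs before
    -- the network (and thus iptables) is up
    os.execute(string.format("iptables -I INPUT -s %s -j DROP",ip))
  end
end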
Sigh.
Notes on a conversation during an impromptu UPS test
“So what should the UPS do when the power goes out?” asked Bunny.
“It sounds an alarm, and I should have,” I said, turning to the keyboard and typing a command, “nine minutes of power.”
“Oh really?”
“Yes. Well, let's test it,” I said, getting up, and pulling the plug on the UPS. About five seconds later, it started beeping. “See?”
“Hmm … I see,” said Bunny. “And then what?”
“Well, it's enough time for either the power to come back up, or to shutdown the computers. You don't really need—”
Just then all our computers suddenly lost power.
“Oh, well that was interesting,” I said.
“I thought you said you had nine minutes.”
“Apparently, so did the UPS.”
Friday, April 09, 2010
Cache flow problems, II
Google just announced that website speed is part of their ranking criteria (link via Hacker News), and seeing how Dad is still reporting issues with viewing this blog, I figured I might as well play around with Page Speed (which requires Firebug, an incredible website debugging tool that runs in Firefox) and see if I can't fix the issue (and maybe speed up the site).
Now, I realize there isn't any real need to speed up my site, but the suggestions from Page Speed weren't horrible and weren't terribly hard to implement (seeing how the main website here consists of nothing but static pages, with the only dynamic content being here on the blog); they mainly consisted of tuning various caching options on the pages, plus some minor CSS tweaks to keep Page Speed happy.
The caching tweaks for the main site I made were:
FileETag MTime Size

AddType "image/x-icon" .ico

ExpiresActive On
ExpiresDefault              "access plus 1 year"
ExpiresByType text/html     "access plus 1 week"
ExpiresByType image/x-icon  "access plus 1 month"

<LocationMatch "\.(ico|gif|png|jpg|jpeg)$">
  Header append Cache-Control "public"
</LocationMatch>
HTML pages can be cached for a week, favicon.ico can be cached for a month, and everything else for a year. Yes, I could have made favicon.ico cacheable for a year, but Page Speed suggested at least a month, so I went with that. I can always change it later. I may also revisit the caching for HTML pages later; make non-index pages cacheable for a year, and index pages for a week, but for now, this is fine.
And it does make the pages load faster, at least on subsequent visits. Also, Page Speed and YSlow both give me high marks (YSlow dings me for not using a CDN, but my site isn't big enough to require a CDN, and that's the only thing YSlow doesn't like about my site).
And as an attempt to fix Dad's issue, I added the following to the configuration for The Boston Diaries:
<Files index.html>
  Header set Cache-Control "no-cache"
</Files>
Basically, no one is allowed to cache the main page for this blog. I'll see how well that works, although it may take a day or two.
Saturday, April 17, 2010
The monitoring of uninterruptable power supplies
I've been dealing with UPS problems for a week and a half now, and it's finally calmed down a bit. Bunny's UPS has been replaced, and I'm waiting for Smirk to order battery replacements for my UPS, so in the meantime, I'm using a spare UPS from The Company.
Bunny suspects the power situation here at Chez Boca is due to some overgrown trees interfering with the power lines, causing momentary fluctuations in the power and basically playing hell with not only the UPSes but the DVRs as well. This past Wednesday was particularly bad—the UPS would take a hit and drop power to my computers, and by the time I got up and running, I would take another hit (three times, all within half an hour). It got so bad I ended up climbing around underneath the desks rerunning power cables in the hope of keeping my computers powered for more than ten minutes.
It wasn't helping matters that I was fighting my syslogd replacement during each reboot (but that's another post).
So Smirk dropped off a replacement UPS, and had I just used the thing, yesterday might have been better. But nooooooooooooooooo! I want to monitor the device (because, hey, I can), but since it's not an APC, I can't use apcupsd to monitor it (Bunny's new UPS is an APC, and the one I have with the dead battery is an APC). In searching for some software to monitor the Cyber Power 1000AVR LCD UPS, I came across NUT, which supports a whole host of UPSes, and it looks like it can support monitoring multiple UPSes on a single computer (functionality that apcupsd lacks).
It's nice, but it does have its quirks (and caused me to have nuclear meltdowns yesterday). I did question the need for five configuration files and its own user accounting system, but upon reflection, the user accounting system is probably warranted (maybe), given that you can remotely command the UPSes to shut down. And the configuration files aren't that complex; I just found them annoying. I also found the one process per UPS, plus two processes for monitoring, a bit excessive, but the authors of the program were following the Unix philosophy of small tools collectively working together. Okay, I can deal.
The one quirk that drove me towards nuclear meltdown was the inability of the USB “driver” (the program that actually queries the UPS over the USB bus) to work properly in “explore” mode (used to query the UPS for all its information) when a particular directive was present in the configuration file. So I have the following in the UPS configuration file:
[apc1000]
driver = usbhid-ups
port = auto
desc = "APC Back UPS XS 1000"
vendorid = 051D
I try to run usbhid-ups in explore mode, and it fails. Comment out the vendorid, but add it to the command line, and it works. But without the vendorid in the configuration file, the usbhid-ups program wouldn't function normally (it's the interface between the monitoring processes and the UPS).
It's bad enough that you can only use the explore mode when the rest of the UPS monitoring software isn't running, but this? It took me about three hours to figure out what was (or wasn't) going on.
Then there was the patch I made to keep NUT from logging every second to syslogd (I changed one line from “if result > 0 return else log error” to “if result >= 0 return else log error”, since 0 isn't an error code). Then I found this bug report on the mailing list archive, and yes, that bug was affecting me as well; after I applied the patch, I was able to get more information from the Cyber Power UPS (and it didn't affect the monitoring of the APC).
And their logging program, upslog, doesn't log to syslogd. It's not even an option. I could, however, have it output to stdout and pipe that into logger, but that's an additional four processes (two per UPS) just to log some stats into syslogd. Fortunately, the protocol used to communicate with the UPS monitoring software is well documented and easy to implement, so it was an easy thing to write a script (Lua, of course) to query the information I wanted to log to syslogd and run that every five minutes via cron.
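Here's a rough sketch of the approach, not the actual script I'm running: it assumes LuaSocket is installed and that upsd is listening on its default port (3493) on the local machine; the UPS name is the one from my configuration above, and the variables are a few of the ones NUT reports (see the output below).

-- rough sketch, not the actual script: query upsd for a few variables
-- and hand them off to syslogd via logger(1).  Assumes LuaSocket is
-- installed and upsd is on the default port (3493) on localhost;
-- "apc1000" is the UPS name from ups.conf above.

local socket = require "socket"

local UPS  = "apc1000"
local VARS = { "battery.charge" , "battery.voltage" , "input.voltage" , "ups.load" }

local conn = assert(socket.tcp())
assert(conn:connect("localhost",3493))

local results = {}

for _,var in ipairs(VARS) do
  conn:send(string.format("GET VAR %s %s\n",UPS,var))
  local line = conn:receive("*l")
  -- a successful reply looks like:  VAR apc1000 battery.charge "42"
  local value = line and line:match('^VAR%s+%S+%s+%S+%s+"(.-)"')
  if value then
    table.insert(results,string.format("%s=%s",var,value))
  end
end

conn:send("LOGOUT\n")
conn:close()

-- log the collected values in one shot
os.execute("logger -t ups-stats '" .. table.concat(results," ") .. "'")

Run something like that out of cron every five minutes and the stats end up in syslogd along with everything else.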
Now, the information you get is impressive. apcupsd gives out rather terse information like this (from Bunny's system, which is still running apcupsd):
APC      : 001,038,0997
DATE     : Sat Apr 17 22:23:25 EDT 2010
HOSTNAME : bunny-desktop
VERSION  : 3.14.6 (16 May 2009) debian
UPSNAME  : apc-xs900
CABLE    : USB Cable
MODEL    : Back-UPS XS 900
UPSMODE  : Stand Alone
STARTTIME: Thu Apr 08 23:20:10 EDT 2010
STATUS   : ONLINE
LINEV    : 118.0 Volts
LOADPCT  : 16.0 Percent Load Capacity
BCHARGE  : 084.0 Percent
TIMELEFT : 48.4 Minutes
MBATTCHG : 5 Percent
MINTIMEL : 3 Minutes
MAXTIME  : 0 Seconds
SENSE    : Low
LOTRANS  : 078.0 Volts
HITRANS  : 142.0 Volts
ALARMDEL : Always
BATTV    : 25.9 Volts
LASTXFER : Unacceptable line voltage changes
NUMXFERS : 6
XONBATT  : Fri Apr 16 00:40:37 EDT 2010
TONBATT  : 0 seconds
CUMONBATT: 11 seconds
XOFFBATT : Fri Apr 16 00:40:39 EDT 2010
SELFTEST : NO
STATFLAG : 0x07000008 Status Flag
MANDATE  : 2007-07-03
SERIALNO : JB0727006727
BATTDATE : 2143-00-36
NOMINV   : 120 Volts
NOMBATTV : 24.0 Volts
NOMPOWER : 540 Watts
FIRMWARE : 830.E6 .D USB FW:E6
APCMODEL : Back-UPS XS 900
END APC  : Sat Apr 17 22:24:00 EDT 2010
NUT will give back:
battery.charge: 42
battery.charge.low: 10
battery.charge.warning: 50
battery.date: 2001/09/25
battery.mfr.date: 2003/02/18
battery.runtime: 3330
battery.runtime.low: 120
battery.type: PbAc
battery.voltage: 24.8
battery.voltage.nominal: 24.0
device.mfr: American Power Conversion
device.model: Back-UPS RS 1000
device.serial: JB0307050741
device.type: ups
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.vendorid: 051D
driver.version: 2.4.3
driver.version.data: APC HID 0.95
driver.version.internal: 0.34
input.sensitivity: high
input.transfer.high: 138
input.transfer.low: 97
input.transfer.reason: input voltage out of range
input.voltage: 121.0
input.voltage.nominal: 120
ups.beeper.status: disabled
ups.delay.shutdown: 20
ups.firmware: 7.g3 .D
ups.firmware.aux: g3
ups.load: 2
ups.mfr: American Power Conversion
ups.mfr.date: 2003/02/18
ups.model: Back-UPS RS 1000
ups.productid: 0002
ups.serial: JB0307050741
ups.status: OL CHRG
ups.test.result: No test initiated
ups.timer.reboot: 0
ups.timer.shutdown: -1
ups.vendorid: 051d
Same information, but better variable names, plus you can query for any number of variables. Not all UPSes support all variables, though (and there are plenty more variables that my UPSes don't support, like temperature). You can also send commands to the UPS (for instance, I was able to shut off the beeper on the failing APC) using this software.
So yes, it's nice, but its quirky nature was something I wasn't expecting after a week of electric musical chairs.
Sunday, April 18, 2010
Off to the races
I mentioned briefly yesterday the issue I was having with syslogintr while booting the computer. On my system, the boot would hang just after loading syslogintr. I tracked it down to initlog hanging. Further investigation revealed that both syslogintr and initlog were hanging, but the significance of that escaped me until an epiphany I had while sleeping: I was experiencing yet another race condition!
A quick test today proved that yes, it was a race condition. A particularly nasty race condition too, since once again, I wasn't explicitly writing multi-threaded code.
syslogintr creates a local socket (/dev/log for those who are curious) and then waits to receive logging messages sent to said socket, something like:
local = socket(...);

while(!interrupted)
{
  read(socket,buffer,sizeof(buffer));
  process(buffer);
}
But in the process of processing the incoming message, syslogintr may itself call syslog():
while(!interrupted)
{
  read(socket,buffer,sizeof(buffer));
  process(buffer);
  syslog(LOG_DEBUG,"processed message");
}
syslog() (which is part of the standard library under Unix) sends the message to the local socket (/dev/log). The data is queued up in the socket, but it's okay, because it'll cycle around quickly enough to pick up the new data. Unless there's too much data already queued in the local socket, at which point whoever calls syslog() will block until the backlogged data in the local socket is dealt with.
The startup script (/etc/init.d/syslog for those of you following along at home) starts both syslogintr and klogd. klogd is a program that pulls the data from the kernel logging queue (logging messages the kernel itself generates, but the kernel can't use /dev/log, as that's just a Unix convention, not something enforced by the kernel itself) and logs that data via syslog(). And by the time klogd starts up, there's quite a bit of logging data generated by the kernel. So that data gets blasted at syslogintr (and in the process, so much data is being sent that klogd is blocked from running). But syslogintr is still coming up to speed and generating a bunch of internal messages, and suddenly its calls to syslog() are blocking, thus causing a deadlock:
while(!interrupted)
{
  read(socket,buffer,sizeof(buffer));
  process(buffer);	/* this takes some time */

  /*--------------------------------------------------
  ; meanwhile, something like klogd could be blasting
  ; data to the local socket, filling it up, thus when
  ; we get to:
  ;---------------------------------------------------*/

  syslog(LOG_DEBUG,"processed message");

  /*-------------------------------------------------
  ; the call to syslog() blocks, thus blocking the
  ; program until something else (in this case, *us*)
  ; drains the data waiting in the socket.  But we
  ; can't drain the data because we're waiting (via
  ; syslog()) for the data to be drained!
  ;
  ; Can you say, *deadlock* boys and girls?  I knew
  ; you could.
  ;--------------------------------------------------*/
}
This also explains why it only happened when booting—because that's about the only time so much data is pushed to syslogintr that socket operations (reading, writing) are blocked. It also explains why I haven't seen it on any other system I'm running it on, since those systems don't run klogd (being virtual hosts, they don't have klogd).
If you've ever wondered why software tends to crash all the time, it's odd interactions like this that are the cause (and this was an easy problem to diagnose, all things considered).
So now I internally queue any logging messages and handle them in the main loop, something along the lines of:
while(!interrupted)
{
  foreach msg in queued_messages
    process(msg);

  read(socket,buffer,sizeof(buffer));
  process(buffer);
  queuelog(LOG_DEBUG,"processed message");
}
Monday, April 19, 2010
Geek Power: Steven Levy Revisits Tech Titans, Hackers, Idealists
In the last chapters of Hackers, I focused on the threat of commercialism, which I feared would corrupt the hacker ethic. I didn't anticipate that those ideals would remake the very nature of commerce. Yet the fact that the hacker ethic spread so widely—and mingled with mammon in so many ways—guaranteed that the movement, like any subculture that breaks into the mainstream, would change dramatically. So as Hackers was about to appear in a new edition (this spring, O’Reilly Media is releasing a reprint, including the first digital version), I set out to revisit both the individuals and the culture. Like the movie Broken Flowers, in which Bill Murray embarks on a road trip to search out his former girlfriends, I wanted to extract some meaning from seeing what had happened to my subjects over the years, hoping their experiences would provide new insights as to how hacking has changed the world—and vice versa.
I could visit only a small sample, but in their examples I found a reflection of how the tech world has developed over the past 25 years. While the hacker movement may have triumphed, not all of the people who created it enjoyed the same fate. Like Gates, some of my original subjects are now rich, famous, and powerful. They thrived in the movement's transition from insular subculture to multibillion-dollar industry, even if it meant rejecting some of the core hacker tenets. Others, unwilling or unable to adapt to a world that had discovered and exploited their passion— or else just unlucky—toiled in obscurity and fought to stave off bitterness. I also found a third group: the present-day heirs to the hacker legacy, who grew up in a world where commerce and hacking were never seen as opposing values. They are bringing their worldview into fertile new territories and, in doing so, are molding the future of the movement.
Geek Power: Steven Levy Revisits Tech Titans, Hackers, Idealists | Magazine
My own copy of Hackers: Heroes of the Computer Revolution is worn out from so many readings and re-readings that it's falling apart (and when I first got it, back in 1986 or so, I read the entire book in one sitting, which lasted all night—not something I should have done on a school night).
So now here is Steven Levy, revisiting his own book from a twenty-five year perspective, and following up on the changes to the industry, and to the people he interviewed, since the early 80s.
Tuesday, April 20, 2010
When “No error” is actually an error
My patch to NUT was rejected:
No.
The above is an error condition (despite the 'No error' message), most likely due to buggy UPS firmware. Normally, we should not expect that when asking for a report, the UPS returns nothing. After all it is 'advertising' the report in the report descriptor, so silently ignoring this would be a grievous mistake. At the very least, if someone is debugging the we should provide some indication why this fails.
Hmm … okay. I thought they had just mistyped a conditional, since 0 is used to indicate success throughout a mess of Standard C library (and Unix library) calls (silly me!).
The cause of the “No error” message is this bit of code:
/*
 * Error handler for usb_get/set_* functions. Return value > 0 success,
 * 0 unknown or temporary failure (ignored), < 0 permanent failure (reconnect)
 */
static int libusb_strerror(const int ret, const char *desc)
{
	if (ret > 0) {
		return ret;
	}

	switch(ret)
	{
	case -EBUSY:	/* Device or resource busy */
	case -EPERM:	/* Operation not permitted */
	case -ENODEV:	/* No such device */
	case -EACCES:	/* Permission denied */
	case -EIO:	/* I/O error */
	case -ENXIO:	/* No such device or address */
	case -ENOENT:	/* No such file or directory */
	case -EPIPE:	/* Broken pipe */
	case -ENOSYS:	/* Function not implemented */
		upslogx(LOG_DEBUG, "%s: %s", desc, usb_strerror());
		return ret;

	case -ETIMEDOUT:	/* Connection timed out */
		upsdebugx(2, "%s: Connection timed out", desc);
		return 0;

	case -EOVERFLOW:	/* Value too large for defined data type */
	case -EPROTO:	/* Protocol error */
		upsdebugx(2, "%s: %s", desc, usb_strerror());
		return 0;

	default:	/* Undetermined, log only */
		upslogx(LOG_DEBUG, "%s: %s", desc, usb_strerror());
		return 0;
	}
}
While I have yet to find the code for usb_strerror() (and I've searched every file; I have no clue as to where the definition of usb_strerror() is located), it acts as if it's just a wrapper around strerror() (a Standard C library call), and when given a value of 0, it returns “No error” (since 0 isn't considered an error value). I submitted back a patch to print “Expected result not received”, since that seems to be what a 0 result means.
Also notice that the comment describing the results is somewhat lost at the top there—in the actual code it's even more invisible since there isn't much to visually set it off from the rest of the code.
Hopefully, the new patch I submitted will be accepted.
Thursday, April 22, 2010
An army of Sean
MyFaceSpaceBook apparently makes profile pages—a short link to your page on MyFaceSpaceBook. So I tried http://www.facebook.com/sean.conner and … oh … unless I have a really deep tan, that isn't me. I then tried http://www.facebook.com/sean.patrick.conner and … um … closer, but still not quite me.
I'm not even on the first page of results.
Online in one form or another since 1987, and I'm failing at MyFaceSpaceBook.
Sigh. [Hey you kids! Get off my lawn!]
I ended up with http://www.facebook.com/spc476, which at least matches my ID across several other websites.
Only 25 days to Vegas? Sign me up!
On the advice of his attorney, my friend Hoade hocked his wife's three cats and the silverware to buy a cherry red Chevy Impala convertible and is threatening to kidnap me on a wild road trip to Viva Lost Wages. I was curious as to the route we might take when I noticed that Google Maps offered walking directions.
How very amusing.
But the 358 steps in walking to Vegas pale in comparison to the 1,008 steps in biking to Viva Lost Wages.