The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, January 11, 2013

Did I repeat myself? Did I repeat myself? Did I repeat myself? Let me fix that.

“Did you mean to post four copies of your post?” asked Bunny.

“No, why?”

“Because there are four copies of your post, that's why,” said Bunny.

“How do you link to the post in question like that when you speak?” I asked. She shook her head and walked away. “Bunny?”


So yes, there was a glitch last night. The only way for four copies to show up like that would be for four copies to be sent via email (which is my preferred interface). And in checking the logs, I did indeed see four copies show up:

Jan 11 05:03:58 brevard postfix/local: 084412EB72E4: to=<XXXXXXXXXXXXXXXXXXXXX>, relay=local, delay=3, status=deferred (Command died with signal 6: "/home/spc/web/sites/boston.conman.org/htdocs/boston.cgi --config /home/spc/web/sites/boston.conman.org/journal/boston.cnf --email --cmd new". Command output: *** glibc detected *** double free or corruption (fasttop): 0x09cf8d38 *** )
Jan 11 05:27:10 brevard postfix/local: 084412EB72E4: to=<XXXXXXXXXXXXXXXXXXXXX>, relay=local, delay=1395, status=deferred (Command died with signal 6: "/home/spc/web/sites/boston.conman.org/htdocs/boston.cgi --config /home/spc/web/sites/boston.conman.org/journal/boston.cnf --email --cmd new". Command output: *** glibc detected *** double free or corruption (fasttop): 0x08646d38 *** )
Jan 11 06:00:29 brevard postfix/local: 084412EB72E4: to=<XXXXXXXXXXXXXXXXXXXXX>, relay=local, delay=3394, status=deferred (Command died with signal 6: "/home/spc/web/sites/boston.conman.org/htdocs/boston.cgi --config /home/spc/web/sites/boston.conman.org/journal/boston.cnf --email --cmd new". Command output: *** glibc detected *** double free or corruption (fasttop): 0x0934bd38 *** )
Jan 11 07:07:09 brevard postfix/local: 084412EB72E4: to=<XXXXXXXXXXXXXXXXXXXXX>, relay=local, delay=7394, status=sent (delivered to command: /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi --config /home/spc/web/sites/boston.conman.org/journal/boston.cnf --email --cmd new)

The fact that four showed up showed that mod_blog crashed when exiting. I find it disturbing that I've been running the latest version since the 1st and this is the first time it crashed. Not only that, but it crashed three times in a row then worked.

Such bugs are terrifying.

So, inspired by the crash reports in Redis, I decided to roll my own and it was surprisingly easy given the output (from a test of the code):

CRASH: pid=6103 signal='Aborted'
CRASH: reason='Unspecified/untranslated error'
CRASH: CS=0073 DS=C02E007B ES=007B FS=0000 GS=0033
CRASH: EIP=00B767A2 EFL=00000246 ESP=BFE162F8 EBP=BFE1630C ESI=000017D7 EDI=00CBBFF4
CRASH: EAX=00000000 EBX=000017D7 ECX=000017D7 EDX=00000006
CRASH: UESP=BFE162F8 TRAPNO=00000000 ERR=00000000
CRASH: STACK DUMP
CRASH: BFE162F8:                         C5 78 BB 00 F4 BF CB 00 
CRASH: BFE16300: F4 BF CB 00 00 00 00 00 C0 C6 F0 B7 3C 64 E1 BF 
CRASH: BFE16310: 29 93 BB 00 06 00 00 00 20 63 E1 BF 00 00 00 00 
CRASH: BFE16320: 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE16390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
CRASH: BFE163A0: A7 C4 CB 00 A8 C4 CB 00 F4 BF CB 00 A7 C4 CB 00 
CRASH: BFE163B0: 60 C4 CB 00 D8 63 E1 BF 73 D0 C4 00 F4 BF CB 00 
CRASH: BFE163C0: 4F E7 BE 00 02 00 00 00 B8 C6 54 08 42 00 00 00 
CRASH: BFE163D0: B8 C6 54 08 42 00 00 00 14 64 E1 BF 9C E9 BE 00 
CRASH: BFE163E0: 60 C4 CB 00 B8 C6 54 08 42 00 00 00 32 30 31 34 
CRASH: BFE163F0: A7 C4 CB 00 00 00 00 00                         
CRASH: STACK TRACE
CRASH:        /home/spc/source/boston/build/boston[0x805d3c4]
CRASH:        /home/spc/source/boston/build/boston[0x805daa2]
CRASH:        /lib/tls/libc.so.6[0xbb79b0]
CRASH:        /lib/tls/libc.so.6(abort+0xe9)[0xbb9329]
CRASH:        /lib/tls/libc.so.6(__assert_fail+0x101)[0xbb0e41]
CRASH:        /home/spc/source/boston/build/boston[0x8053f30]
CRASH:        /home/spc/source/boston/build/boston[0x805d0b9]
CRASH:        /home/spc/source/boston/build/boston(ChunkProcessStream+0xc8)[0x805d32e]
CRASH:        /home/spc/source/boston/build/boston(ChunkProcess+0xcb)[0x805d240]
CRASH:        /home/spc/source/boston/build/boston(generic_cb+0x74)[0x8053a68]
CRASH:        /home/spc/source/boston/build/boston(pagegen_days+0x1c4)[0x8057f5c]
CRASH:        /home/spc/source/boston/build/boston(generate_pages+0xd0)[0x8057bfc]
CRASH:        /home/spc/source/boston/build/boston[0x80501bb]
CRASH:        /home/spc/source/boston/build/boston(main_cli+0x1e5)[0x80500cd]
CRASH:        /home/spc/source/boston/build/boston(main+0x133)[0x8051eb3]
CRASH:        /lib/tls/libc.so.6(__libc_start_main+0xd3)[0xba4e93]
CRASH:        /home/spc/source/boston/build/boston[0x804cdbd]

POSIX allows you to register a signal handler with additional contextual information. This contextual information includes the contents of the CPU registers and it's just a matter of finding the structure definition in a header file. Once I have the registers, it's not hard to get the stack dump.

The stack trace was even eaiser than that. It seems that the Linux C library comes with a few functions to do just that—backtrace() collects the return addresses off the stack; backtrace_symbols() will resolve the addresses back to function names and backtrace_symbols_fd() does the same, only it writes the informtaion to a file. I ended up using backtrace_symbols_fd() since I'm executing in the context of a signal handler, and I know that open(), read() and close() are marked as “async-signal-safe” (that is, safe to call in a signal handler) by POSIX, but malloc() and free() are not, and backtrace_symbols() is known to call malloc() (and thus, I would need to call free()) and I don't want to risk a crash in the crash report code. I'm taking a risk because backtrace() and backtrace_symbols_fd() could call non-async-signal-safe functions, but I'm doing as much as I can to get as much context as I can as safely as I can.

Obligatory Picture

[I'm wearing a goatee—I must be my evil twin brother.]

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: http://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

http://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2014 by Sean Conner. All Rights Reserved.

Listed on BlogShares