The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Sunday, October 14, 2018

Tumbling through code, part III

I was going through my logs (I've been on vacation for the past two weeks) and I noticed a few crashes of mod_blog. It was easy enough to determine that a call to assert() was the culprit (the clue is the __assert_fail entry in the stack trace):

CRASH(32421/000): pid=32421 signal='Aborted'
CRASH(32421/001): reason='Unspecified/untranslated error'
CRASH(32421/002): CS=B7EA0073 DS=007B ES=007B FS=0000 GS=0033
CRASH(32421/003): EIP=B7FE87A2 EFL=00000246 ESP=BFF9AE28 EBP=BFF9AE3C ESI=00007EA5 EDI=B7FAFFF4
CRASH(32421/004): EAX=00000000 EBX=00007EA5 ECX=00007EA5 EDX=00000006
CRASH(32421/005): UESP=BFF9AE28 TRAPNO=00000000 ERR=00000000
CRASH(32421/006): STACK DUMP
CRASH(32421/007):        BFF9AE28:  A5 07 EB B7 00 00 00 00 F4 FF FA B7 00 00 00 00
CRASH(32421/008):        BFF9AE38:  C0 86 E8 B7 6C AF F9 BF 09 22 EB B7 06 00 00 00
CRASH(32421/009):        BFF9AE48:  50 AE F9 BF 00 00 00 00 20 00 00 00 00 00 00 00
CRASH(32421/010):        BFF9AE58:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/011):        BFF9AE68:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/012):        BFF9AE78:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/013):        BFF9AE88:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/014):        BFF9AE98:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/015):        BFF9AEA8:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/016):        BFF9AEB8:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/017):        BFF9AEC8:  00 00 00 00 00 00 00 00 C7 04 FB B7 C8 04 FB B7
CRASH(32421/018):        BFF9AED8:  F4 FF FA B7 C7 04 FB B7 80 04 FB B7 08 AF F9 BF
CRASH(32421/019):        BFF9AEE8:  28 85 CA 08 F4 FF FA B7 9F 70 EE B7 02 00 00 00
CRASH(32421/020):        BFF9AEF8:  C8 78 CA 08 4C 00 00 00 C8 78 CA 08 4C 00 00 00
CRASH(32421/021):        BFF9AF08:  44 AF F9 BF EC 72 EE B7 80 04 FB B7 C8 78 CA 08
CRASH(32421/022):        BFF9AF18:  4C 00 00 00 27 00 00 00 C7 04 FB B7 00 00 00 00
CRASH(32421/023): STACK TRACE
CRASH(32421/024):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x805ccf0]
CRASH(32421/025):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x805d46b]
CRASH(32421/026):        /lib/tls/libc.so.6[0xb7eb0890]
CRASH(32421/027):        /lib/tls/libc.so.6(abort+0xe9)[0xb7eb2209]
CRASH(32421/028):        /lib/tls/libc.so.6(__assert_fail+0x101)[0xb7ea9d91]
CRASH(32421/029):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(max_monthday+0x5a)[0x80595a2]
CRASH(32421/030):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(tumbler_new+0xbcb)[0x805aa5a]
CRASH(32421/031):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x8057f19]
CRASH(32421/032):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(main_cgi_get+0xbf)[0x8057c1a]
CRASH(32421/033):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(main+0x99)[0x804cb8d]
CRASH(32421/034):        /lib/tls/libc.so.6(__libc_start_main+0xd3)[0xb7e9dde3]
CRASH(32421/035):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x804ca6d]
CRASH(32421/036): COMMAND LINE
CRASH(32421/037):        /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi
CRASH(32421/038): ENVIRONMENT
CRASH(32421/039):        REDIRECT_STATUS=200
CRASH(32421/040):        BLOG_CONFIG=/home/spc/web/sites/boston.conman.org/journal/boston.cnf
CRASH(32421/041):        HTTP_FROM=the.knowledge.ai@gmail.com
CRASH(32421/042):        HTTP_HOST=boston.conman.org
CRASH(32421/043):        HTTP_CONNECTION=Keep-Alive
CRASH(32421/044):        HTTP_USER_AGENT=The Knowledge AI
CRASH(32421/045):        HTTP_ACCEPT_ENCODING=gzip,deflate
CRASH(32421/046):        PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin
CRASH(32421/047):        SERVER_SIGNATURE=<address>Apache/2.0.52 (CentOS) Server at boston.conman.org Port 80</address> 
CRASH(32421/048):        SERVER_SOFTWARE=Apache/2.0.52 (CentOS)
CRASH(32421/049):        SERVER_NAME=boston.conman.org
CRASH(32421/050):        SERVER_ADDR=66.252.224.242
CRASH(32421/051):        SERVER_PORT=80
CRASH(32421/052):        REMOTE_ADDR=64.62.252.174
CRASH(32421/053):        DOCUMENT_ROOT=/home/spc/web/sites/boston.conman.org/htdocs
CRASH(32421/054):        SERVER_ADMIN=sean@conman.org
CRASH(32421/055):        SCRIPT_FILENAME=/home/spc/web/sites/boston.conman.org/htdocs/boston.cgi
CRASH(32421/056):        REMOTE_PORT=36622
CRASH(32421/057):        REDIRECT_URL=/2015/04-2015/
CRASH(32421/058):        GATEWAY_INTERFACE=CGI/1.1
CRASH(32421/059):        SERVER_PROTOCOL=HTTP/1.1
CRASH(32421/060):        REQUEST_METHOD=GET
CRASH(32421/061):        QUERY_STRING=
CRASH(32421/062):        REQUEST_URI=/2015/04-2015/
CRASH(32421/063):        SCRIPT_NAME=/boston.cgi
CRASH(32421/064):        PATH_INFO=/2015/04-2015/
CRASH(32421/065):        PATH_TRANSLATED=/home/spc/web/sites/boston.conman.org/htdocs/2015/04-2015/
CRASH(32421/066): DONE

The hard part was trying to figure out which of the three calls to assert() was being triggered. Fortunately, there was enough information logged to reproduce the error (for the record, it was assert(month < 13)). Unfortunately, it has to do with the tumbler parsing code.

One of the unique features of mod_blog is the “entry addressing scheme,” where you can address not only a single entry like 2018/10/14.1 but a range of entries like 2000/08/10.2-15.5. In fact, the same code internally changes a reference like 2018/09 to 2018/09/11.1-09/30.1 (the first and last entry in the given month; it also works for days and years). When I wrote the code, I had in mind a way of it working and the bug here is in my inattention to details in checking what I've received.

The code in question, when it sees a request in the form of “number / number - number,” assumes that the number after the literal “-” is a month and not a year. “The Knowledge AI” program was making a request of 2015/04-2015, so max_monthday() was given 2015 as the month, which made assert(month < 13) fail and triggered the crash. That I can fix.
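To make the failure concrete, here is a minimal sketch of one way to guard against it: validate the parsed numbers before they ever reach a function protected by assert(). This is not the actual mod_blog code. The body of max_monthday() is a stand-in I made up for illustration, month_range_end() is a hypothetical helper, and the request is assumed to be the PATH_INFO text with the leading slash removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for mod_blog's max_monthday(); the real body may differ.
 * The asserts document what callers must guarantee. */
static int max_monthday(int year, int month)
{
  static const int days[] = { 31,28,31,30,31,30,31,31,30,31,30,31 };

  assert(year  >  0);
  assert(month >  0);
  assert(month < 13);

  /* February in a Gregorian leap year */
  if (month == 2 && ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0))
    return 29;
  return days[month - 1];
}

/* Hypothetical helper: parse "year/month-month" and report the last
 * day of the ending month. Returns false instead of tripping an
 * assert when the numbers from the request are out of range. */
static bool month_range_end(const char *req, int *year, int *month, int *day)
{
  int first;

  if (sscanf(req, "%4d/%2d-%4d", year, &first, month) != 3)
    return false;

  /* Untrusted input: check it here, before calling max_monthday() */
  if (*year < 1 || first < 1 || first > 12 || *month < first || *month > 12)
    return false;

  *day = max_monthday(*year, *month);
  return true;
}
```

With this sketch, a request like 2015/04-06 yields June 30, while 2015/04-2015 is rejected and the caller can report an error rather than abort. The asserts stay in place to catch programming mistakes; they just no longer double as input validation.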

But I do question the programming of “The Knowledge AI” crawler. I don't have any links in that form, and I'm not aware of any links on other pages in that form (in fact, that particular feature of entry addressing is not used that often, even by me), so I have to wonder how it got a link like that. Does it try randomly generating links to see what it gets? Is it a bug in their code? It's inexplicable.

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.