Sunday, October 14, 2018
Tumbling through code, part III
I was going through my logs
(I've been on vacation for the past two weeks)
and I noticed a few crashes of mod_blog.
It was easy enough to determine that a call to assert()
was the culprit (the clue is the __assert_fail frame in the stack trace):
CRASH(32421/000): pid=32421 signal='Aborted'
CRASH(32421/001): reason='Unspecified/untranslated error'
CRASH(32421/002): CS=B7EA0073 DS=007B ES=007B FS=0000 GS=0033
CRASH(32421/003): EIP=B7FE87A2 EFL=00000246 ESP=BFF9AE28 EBP=BFF9AE3C ESI=00007EA5 EDI=B7FAFFF4
CRASH(32421/004): EAX=00000000 EBX=00007EA5 ECX=00007EA5 EDX=00000006
CRASH(32421/005): UESP=BFF9AE28 TRAPNO=00000000 ERR=00000000
CRASH(32421/006): STACK DUMP
CRASH(32421/007): BFF9AE28: A5 07 EB B7 00 00 00 00 F4 FF FA B7 00 00 00 00
CRASH(32421/008): BFF9AE38: C0 86 E8 B7 6C AF F9 BF 09 22 EB B7 06 00 00 00
CRASH(32421/009): BFF9AE48: 50 AE F9 BF 00 00 00 00 20 00 00 00 00 00 00 00
CRASH(32421/010): BFF9AE58: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/011): BFF9AE68: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/012): BFF9AE78: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/013): BFF9AE88: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/014): BFF9AE98: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/015): BFF9AEA8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/016): BFF9AEB8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/017): BFF9AEC8: 00 00 00 00 00 00 00 00 C7 04 FB B7 C8 04 FB B7
CRASH(32421/018): BFF9AED8: F4 FF FA B7 C7 04 FB B7 80 04 FB B7 08 AF F9 BF
CRASH(32421/019): BFF9AEE8: 28 85 CA 08 F4 FF FA B7 9F 70 EE B7 02 00 00 00
CRASH(32421/020): BFF9AEF8: C8 78 CA 08 4C 00 00 00 C8 78 CA 08 4C 00 00 00
CRASH(32421/021): BFF9AF08: 44 AF F9 BF EC 72 EE B7 80 04 FB B7 C8 78 CA 08
CRASH(32421/022): BFF9AF18: 4C 00 00 00 27 00 00 00 C7 04 FB B7 00 00 00 00
CRASH(32421/023): STACK TRACE
CRASH(32421/024): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x805ccf0]
CRASH(32421/025): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x805d46b]
CRASH(32421/026): /lib/tls/libc.so.6[0xb7eb0890]
CRASH(32421/027): /lib/tls/libc.so.6(abort+0xe9)[0xb7eb2209]
CRASH(32421/028): /lib/tls/libc.so.6(__assert_fail+0x101)[0xb7ea9d91]
CRASH(32421/029): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(max_monthday+0x5a)[0x80595a2]
CRASH(32421/030): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(tumbler_new+0xbcb)[0x805aa5a]
CRASH(32421/031): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x8057f19]
CRASH(32421/032): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(main_cgi_get+0xbf)[0x8057c1a]
CRASH(32421/033): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi(main+0x99)[0x804cb8d]
CRASH(32421/034): /lib/tls/libc.so.6(__libc_start_main+0xd3)[0xb7e9dde3]
CRASH(32421/035): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x804ca6d]
CRASH(32421/036): COMMAND LINE
CRASH(32421/037): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi
CRASH(32421/038): ENVIRONMENT
CRASH(32421/039): REDIRECT_STATUS=200
CRASH(32421/040): BLOG_CONFIG=/home/spc/web/sites/boston.conman.org/journal/boston.cnf
CRASH(32421/041): HTTP_FROM=the.knowledge.ai@gmail.com
CRASH(32421/042): HTTP_HOST=boston.conman.org
CRASH(32421/043): HTTP_CONNECTION=Keep-Alive
CRASH(32421/044): HTTP_USER_AGENT=The Knowledge AI
CRASH(32421/045): HTTP_ACCEPT_ENCODING=gzip,deflate
CRASH(32421/046): PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin
CRASH(32421/047): SERVER_SIGNATURE=<address>Apache/2.0.52 (CentOS) Server at boston.conman.org Port 80</address>
CRASH(32421/048): SERVER_SOFTWARE=Apache/2.0.52 (CentOS)
CRASH(32421/049): SERVER_NAME=boston.conman.org
CRASH(32421/050): SERVER_ADDR=66.252.224.242
CRASH(32421/051): SERVER_PORT=80
CRASH(32421/052): REMOTE_ADDR=64.62.252.174
CRASH(32421/053): DOCUMENT_ROOT=/home/spc/web/sites/boston.conman.org/htdocs
CRASH(32421/054): SERVER_ADMIN=sean@conman.org
CRASH(32421/055): SCRIPT_FILENAME=/home/spc/web/sites/boston.conman.org/htdocs/boston.cgi
CRASH(32421/056): REMOTE_PORT=36622
CRASH(32421/057): REDIRECT_URL=/2015/04-2015/
CRASH(32421/058): GATEWAY_INTERFACE=CGI/1.1
CRASH(32421/059): SERVER_PROTOCOL=HTTP/1.1
CRASH(32421/060): REQUEST_METHOD=GET
CRASH(32421/061): QUERY_STRING=
CRASH(32421/062): REQUEST_URI=/2015/04-2015/
CRASH(32421/063): SCRIPT_NAME=/boston.cgi
CRASH(32421/064): PATH_INFO=/2015/04-2015/
CRASH(32421/065): PATH_TRANSLATED=/home/spc/web/sites/boston.conman.org/htdocs/2015/04-2015/
CRASH(32421/066): DONE
The hard part was trying to figure out which of
the three calls
to assert()
was being triggered.
Fortunately,
there was enough information logged to reproduce the error
(for the record, it was assert(month < 13)).
Unfortunately,
it has to do with the tumbler parsing code.
One of the unique features of mod_blog
is the “entry addressing scheme,”
where you can address not only a single entry like 2018/10/14.1
but a range of entries like 2000/08/10.2-15.5.
In fact,
the same code internally changes a reference like 2018/09
to
2018/09/11.1-09/30.1
(the first and last entry in the given month;
it also works for days and years).
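To make the single-entry form concrete, here is a minimal sketch of parsing an address like 2018/10/14.1 into its parts. This is not the tumbler code from mod_blog; the struct entry_addr type and the parse_entry() function are hypothetical names made up for illustration, and the range check on the day is deliberately coarse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical type for illustration; mod_blog's real types may differ. */
struct entry_addr
{
  int year;
  int month;
  int day;
  int part;
};

/* Parse a single-entry address of the form "YYYY/MM/DD.N".
   Returns false for malformed or out-of-range input instead of
   asserting, since the text comes from the outside world. */
bool parse_entry(const char *text, struct entry_addr *addr)
{
  int consumed = 0;

  assert(text != NULL);
  assert(addr != NULL);

  /* %n records how many characters were consumed so that
     trailing garbage can be rejected below. */
  if (sscanf(text, "%d/%d/%d.%d%n",
             &addr->year, &addr->month, &addr->day, &addr->part,
             &consumed) != 4)
    return false;

  if (text[consumed] != '\0')
    return false;

  /* Coarse checks only; the day is not checked against the month. */
  return addr->year  >= 1
      && addr->month >= 1 && addr->month <= 12
      && addr->day   >= 1 && addr->day   <= 31
      && addr->part  >= 1;
}
```

With this sketch, "2018/10/14.1" parses into year 2018, month 10, day 14, part 1, while a range such as "2000/08/10.2-15.5" is rejected because it is not a single entry.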
When I wrote the code,
I had a particular way of it working in mind, and the bug here comes from my inattention to detail in checking what I've received.
The code in question,
when it sees a request in the form of “number / number - number,”
assumes that the number after the literal
“-” is a month and not a year.
The “The Knowledge AI” program was making a request of 2015/04-2015,
so max_monthday()
was being given 2015 as the month,
an invalid value that made assert(month < 13)
fail and triggered the crash.
That I can fix.
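One possible shape for such a fix, as a sketch only: check the number after the “-” before it ever reaches max_monthday(). The max_monthday() below is a stand-in whose signature and first two asserts are assumptions (only assert(month < 13) is known from the crash), and range_end_month() is a hypothetical helper, not a function from mod_blog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for max_monthday(); the real signature may differ.
   Only assert(month < 13) is known from the crash; the other
   two asserts are assumptions. */
int max_monthday(int year, int month)
{
  static const int days[] = { 31,28,31,30,31,30,31,31,30,31,30,31 };

  assert(year  >  0);
  assert(month >  0);
  assert(month < 13);

  /* Gregorian leap year rule for February. */
  if (month == 2 && (year % 4 == 0) && (year % 100 != 0 || year % 400 == 0))
    return 29;
  return days[month - 1];
}

/* Hypothetical helper for a request of the form "year/month-value".
   The value is accepted only if it is a valid month; anything else
   (such as the 2015 in 2015/04-2015) is reported as a bad request
   rather than tripping an assert. On success, *last_day receives
   the last day of the ending month. */
bool range_end_month(int year, int month, int value, int *last_day)
{
  assert(last_day != NULL);

  if (year < 1 || month < 1 || month > 12)
    return false;
  if (value < 1 || value > 12)
    return false;

  *last_day = max_monthday(year, value);
  return true;
}
```

The idea is that assert() stays in place for programmer errors, while anything derived from the request is validated first and turned into an error response.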
But I do question the programming of the “The Knowledge AI” crawler. I don't have any links in that form, and I'm not aware of any links of that form on other pages (in fact, that particular feature of entry addressing is not used that often, even by me), so I have to wonder how it got a link like that. Does it try randomly generating links to see what it gets? Is it a bug in their code? It's inexplicable.