Saturday, March 01, 2025
Fixing a 27 year old bug that only now just got triggered
I will, from time to time, look at various logs for errors. And when I looked at the error log for my web server, intermixed with errors I have no control over like this:
[Tue Feb 25 10:41:19.504140 2025] [ssl:error] [pid 16571:tid 3833293744] [client 206.168.34.92:47678] AH02032: Hostname literature.conman.org provided via SNI and hostname 71.19.142.20 provided via HTTP have no compatible SSL setup [Tue Feb 25 12:39:33.768053 2025] [ssl:error] [pid 16408:tid 3892042672] [client 167.94.146.59:50798] AH02032: Hostname hhgproject.org provided via SNI and hostname 71.19.142.20 provided via HTTP have no compatible SSL setup [Sat Mar 01 05:34:44.029898 2025] [core:error] [pid 21954:tid 3841686448] [client 121.36.96.194:53710] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/bin/sh) [Sat Mar 01 05:34:45.077056 2025] [core:error] [pid 23369:tid 3875257264] [client 121.36.96.194:53722] AH10244: invalid URI path (/cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/bin/sh)
I found a bunch of errors that I found concerning:
[Sun Feb 23 10:14:54.644036 2025] [cgid:error] [pid 16408:tid 3715795888] [client 185.42.12.144:51022] End of script output before headers: contact.cgi, referer: https://www.hhgproject.org/contact.cgi contact.cgi: src/Cgi/UrlDecodeChar.c:41: UrlDecodeChar: Assertion `((*__ctype_b_loc ())[(int) ((*src))] & (unsigned short int) _ISxdigit)' failed.
It's obvious that a call to assert()
failed in the function UrlDecodeChar()
due to some robot failing to encode a web request properly.
Let's see what the code is actually doing:
char UrlDecodeChar(char **psrc) { char *src; char c; assert(psrc != NULL); assert(*psrc != NULL); src = *psrc; c = *src++; if (c == '+') c = ' '; else if (c == '%') { assert(isxdigit(*src)); assert(isxdigit(*(src+1))); c = ctohex(*src) * 16 + ctohex(*(src+1)); src += 2; } *psrc = src; return(c); }
The problem was using assert()
to check the results of some I/O—that's not what assert()
is for.
I think I was being lazy when I used those assertions and didn't bother with the proper coding practice of returning an error.
Curious as to when I added this code,
I checked the history and from December 3rd, 2004:
char UrlDecodeChar(char **psrc) { char *src; int c; ddt(psrc != NULL); ddt(*psrc != NULL); src = *psrc; c = *src++; if (c == '+') c = ' '; else if (c == '%') { ddt(isxdigit(*src)); ddt(isxdigit(*(src+1))); c = ctohex(*src) * 16 + ctohex(*(src+1)); src += 2; } *psrc = src; return(c); }
The history in the current repository goes no further back due to losing my CVS repositories and it's interesting to see that this function is the same as it was back then
(with the difference of using my own version of assert()
called ddt()
back in the day).
Some further sluthing convinced me that I wrote this code back in 1997.
This function is old enough to not only vote,
be drafted,
get drunk,
and sign contracts,
but be removed from its parents health insurance!
Good lord!
It's not how I would write that function today.
It's even more remarkable that I haven't seen this assert()
trigger in all those years.
The fix was easy:
char UrlDecodeChar(char **psrc) { char *src; char c; assert(psrc != NULL); assert(*psrc != NULL); src = *psrc; c = *src++; if (c == '+') c = ' '; else if (c == '%') { if (!isxdigit(*src)) return '\0'; if (!isxdigit(*src+1)) return '\0'; c = ctohex(*src) * 16 + ctohex(*(src+1)); src += 2; } *psrc = src; return(c); }
And propagating the error back up the call chain. This does result in a new major version for CGILib since I do follow semantic versioning since this is, technically speaking, a change in the public API even though this is less than 10 lines of code (out of 8,000+).