The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Saturday, March 01, 2025

Fixing a 27 year old bug that only now just got triggered

I will, from time to time, look at various logs for errors. And when I looked at the error log for my web server, intermixed with errors I have no control over like this:

[Tue Feb 25 10:41:19.504140 2025] [ssl:error] [pid 16571:tid 3833293744] [client 206.168.34.92:47678] AH02032: Hostname literature.conman.org provided via SNI and hostname 71.19.142.20 provided via HTTP have no compatible SSL setup
[Tue Feb 25 12:39:33.768053 2025] [ssl:error] [pid 16408:tid 3892042672] [client 167.94.146.59:50798] AH02032: Hostname hhgproject.org provided via SNI and hostname 71.19.142.20 provided via HTTP have no compatible SSL setup
[Sat Mar 01 05:34:44.029898 2025] [core:error] [pid 21954:tid 3841686448] [client 121.36.96.194:53710] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/.%2e/bin/sh)
[Sat Mar 01 05:34:45.077056 2025] [core:error] [pid 23369:tid 3875257264] [client 121.36.96.194:53722] AH10244: invalid URI path (/cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/bin/sh)

I found a bunch of errors that I found concerning:

[Sun Feb 23 10:14:54.644036 2025] [cgid:error] [pid 16408:tid 3715795888] [client 185.42.12.144:51022] End of script output before headers: contact.cgi, referer: https://www.hhgproject.org/contact.cgi
contact.cgi: src/Cgi/UrlDecodeChar.c:41: UrlDecodeChar: Assertion `((*__ctype_b_loc ())[(int) ((*src))] & (unsigned short int) _ISxdigit)' failed.

It's obvious that a call to assert() failed in the function UrlDecodeChar() due to some robot failing to encode a web request properly. Let's see what the code is actually doing:

char UrlDecodeChar(char **psrc)
{
  char *src;
  char  c;
  
  assert(psrc  != NULL);
  assert(*psrc != NULL);
  
  src = *psrc;
  c   = *src++;
  if (c == '+')
    c = ' ';
  else if (c == '%')
  {
    assert(isxdigit(*src));
    assert(isxdigit(*(src+1)));
    c    = ctohex(*src) * 16 + ctohex(*(src+1));
    src += 2;
  }
  *psrc = src;
  return(c);
}

The problem was using assert() to check the results of some I/O—that's not what assert() is for. I think I was being lazy when I used those assertions and didn't bother with the proper coding practice of returning an error. Curious as to when I added this code, I checked the history and from December 3rd, 2004:

char UrlDecodeChar(char **psrc)
{
  char *src;
  int	c;

  ddt(psrc  != NULL);
  ddt(*psrc != NULL);

  src = *psrc;
  c   = *src++;
  if (c == '+')
    c = ' ';
  else if (c == '%')
  {
    ddt(isxdigit(*src));
    ddt(isxdigit(*(src+1)));
    c	 = ctohex(*src) * 16 + ctohex(*(src+1));
    src += 2;
  }
  *psrc = src;
  return(c);
}

The history in the current repository goes no further back due to losing my CVS repositories and it's interesting to see that this function is the same as it was back then (with the difference of using my own version of assert() called ddt() back in the day). Some further sluthing convinced me that I wrote this code back in 1997. This function is old enough to not only vote, be drafted, get drunk, and sign contracts, but be removed from its parents health insurance!

Good lord!

It's not how I would write that function today.

It's even more remarkable that I haven't seen this assert() trigger in all those years.

The fix was easy:

char UrlDecodeChar(char **psrc)
{
  char *src;
  char  c;
  
  assert(psrc  != NULL);
  assert(*psrc != NULL);
  
  src = *psrc;
  c   = *src++;
  if (c == '+')
    c = ' ';
  else if (c == '%')
  {
    if (!isxdigit(*src))   return '\0';
    if (!isxdigit(*src+1)) return '\0';
    c    = ctohex(*src) * 16 + ctohex(*(src+1));
    src += 2;
  }
  *psrc = src;
  return(c);
}

And propagating the error back up the call chain. This does result in a new major version for CGILib since I do follow semantic versioning since this is, technically speaking, a change in the public API even though this is less than 10 lines of code (out of 8,000+).

Obligatory Picture

Dad was resigned to the fact that I was, indeed, a landlubber, and turned the boat around yet again …

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2025 by Sean Conner. All Rights Reserved.