A while ago I wrote a program to go through the web log files, pulling out referers from search engines (more or less). It wouldn't quite run through an entire log file before quitting on some input it apparently didn't like. That hadn't bothered me much until today, so I figured I would run it under the debugger, see what it doesn't like, and fix the program.
It's amazing what garbage you get from search engines.
The spec for query strings is pretty straightforward: a series of name/value pairs separated by ampersands, “&” (except perhaps after the last pair), with each name/value pair in the form name “=” value. Pretty easy, right?
AOL/UK's search engine was sending a query string with two consecutive ampersands. Fixed that, and on to the next problem.
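For the curious, tolerating stray ampersands comes down to skipping empty segments. This is only a minimal sketch in Python, not the actual program from this post; the function name `parse_query` and the sample query string are made up for illustration, and it does no percent-decoding.

```python
def parse_query(query):
    """Split a query string into (name, value) pairs, skipping empty segments."""
    pairs = []
    for segment in query.split("&"):
        if not segment:
            # An empty segment comes from "&&" or a leading/trailing "&".
            continue
        # partition() splits on the first "=" only; a segment without "="
        # yields an empty value instead of raising an error.
        name, _, value = segment.partition("=")
        pairs.append((name, value))
    return pairs


if __name__ == "__main__":
    # Hypothetical input with a doubled ampersand and a valueless variable.
    print(parse_query("q=web+logs&&start=10&lang="))
    # → [('q', 'web+logs'), ('start', '10'), ('lang', '')]
```

Skipping the empty segment, rather than treating it as an error, means one sloppy search engine doesn't stop the whole log run.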
The dreaded referer string from MSN.
Usually when a variable that has no value is sent, what you get is:
Basically, the name, then the equal sign, and either the end of the query string, or the start (signified by the ampersand) of the next name/value pair. But not MSN.
Nope. You get:
And it was in the process of testing my workaround that my Linux box seriously went dead.
Okay, so technically MSN didn't nearly kill my Linux system, my program did.
But still … it's an accessory to attempted murder!
Back to the drawing board.