The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Saturday, July 27, 2024

Fixing more Apache errors

A week later and I can't say I cleared up all the errors with my web server:

[Sat Jul 27 09:51:45.349454 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.349617 2024] [cgid:error] [pid 7348:tid 3807226800] [client 192.200.113.155:55207] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.350209 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.350297 2024] [cgid:error] [pid 7348:tid 3807226800] [client 192.200.113.155:55207] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.352660 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.352814 2024] [cgid:error] [pid 7636:tid 3815619504] [client 192.200.113.155:49997] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.353377 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.353462 2024] [cgid:error] [pid 7636:tid 3815619504] [client 192.200.113.155:49997] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.353790 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.353943 2024] [cgid:error] [pid 7691:tid 3832404912] [client 192.200.113.155:48697] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.354685 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.354813 2024] [cgid:error] [pid 7691:tid 3832404912] [client 192.200.113.155:48697] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.360184 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.360295 2024] [cgid:error] [pid 7349:tid 3731692464] [client 192.200.113.155:44083] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.360856 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.360940 2024] [cgid:error] [pid 7349:tid 3731692464] [client 192.200.113.155:44083] End of script output before headers: boston.cgi
[Sat Jul 27 09:51:45.366567 2024] [cgid:error] [pid 7353:tid 4149003968] (12)Cannot allocate memory: AH01252: couldn't create child process: 12: boston.cgi
[Sat Jul 27 09:51:45.366719 2024] [cgid:error] [pid 7786:tid 3916331952] [client 192.200.113.155:55205] End of script output before headers: boston.cgi

There are more entries like this, but you get the idea. Apache can't run mod_blog for some reason. Checking the access log I can match these up to the following requests:

XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/16/hsr-carpet-1.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/15/flapper.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/17/scrapes.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/16/hsr-carpet-2.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/17/luxor.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/rushhour.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/area51.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/littlealeinn.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)
XXXXXXXXXXXXXXX - - [27/Jul/2024:09:51:45 -0400] "GET /2006/07/18/quik-pik.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43" -/- (-%)

To me, this is obviously a crawler, despite claiming to be every possible web browser in existence. Is it Windows? MacOS? Linux? Yes. But what's interesting is that all the errors seem related to serving up images.
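A hunch like that is easy enough to check with a few lines of Python. This is only a rough sketch: the name `failed_extensions` and the regular expression are made up for illustration, and it assumes the request and status show up in the log as a quoted "METHOD path protocol" followed by a three-digit status, the way they do in the lines above.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Assumed layout: ... "GET /path HTTP/1.1" 500 ...
# (true of the access log lines quoted above; other LogFormats may differ)
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def failed_extensions(lines):
    """Count the file extensions of every request answered with a 500."""
    counts = Counter()
    for line in lines:
        m = REQUEST.search(line)
        if m is None or m.group("status") != "500":
            continue
        # Drop any query string before looking at the extension.
        path = m.group("path").split("?", 1)[0]
        suffix = PurePosixPath(path).suffix.lower()
        counts[suffix or "(none)"] += 1
    return counts
```

Handing it an open log file and printing `most_common()` on the result gives a per-extension count of the failures. If nearly everything lands under the image extensions, the hunch holds.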

The way my blog works, all requests to posts are fed through mod_blog, and if said request is for an image, it just copies the file out. It works, but every image then costs Apache a CGI process, and if the server gets slammed just a bit too hard, it can't create any more of them and breaks down. If only there were some way to get Apache to serve the images directly instead of having to go through mod_blog.

Wait! There is!

I've been using Apache for well over twenty-five years now, so it was a relatively easy issue to solve. First off, point Apache to the directory where all the data for mod_blog is stored.

Alias                 /XXXXX/ /home/spc/web/sites/boston.conman.org/journal/

<Directory /home/spc/web/sites/boston.conman.org/journal>
  Options       None
  AllowOverride None
  <LimitExcept GET HEAD>
    Require valid-user
  </LimitExcept>
</Directory>

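The first directive maps the “web directory” /XXXXX/ to the actual directory on the file system. The Directory block restricts what can be viewed and how: Options None turns off extras such as directory listings, AllowOverride None ignores any .htaccess files, and any method other than GET or HEAD requires a valid user. All that's left is to send all requests for images to this directory: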

RewriteRule ^([0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/.*\.(gif|png|jpg|ico)) XXXXX/$1 [L]

This rewrites a request like /2015/07/04/Desk.jpg to /XXXXX/2015/07/04/Desk.jpg, which references the image directly on the file system, letting Apache serve it up itself. This rule goes before the other RewriteRules so Apache serves the image before mod_blog ever sees the request.
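For anyone who wants to poke at the pattern outside of Apache, here is a small Python sketch of the two steps, the rewrite and then the Alias lookup. It is a model, not Apache's actual logic: the function names are invented, XXXXX stands in for the same redacted directory name, and it assumes the rule runs in a per-directory context, which is where Apache strips the leading slash before matching.

```python
import re
from typing import Optional

# The pattern from the RewriteRule, unchanged.
IMAGE = re.compile(r"^([0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/.*\.(gif|png|jpg|ico))")

# Stand-ins for the Alias directive; XXXXX is the redacted name from the post.
ALIAS_URL = "/XXXXX/"
ALIAS_DIR = "/home/spc/web/sites/boston.conman.org/journal/"

def rewrite(url_path: str) -> Optional[str]:
    """Model of the RewriteRule: return the new URL path, or None if no match."""
    # Assumption: per-directory context, so one leading "/" is stripped first.
    relative = url_path[1:] if url_path.startswith("/") else url_path
    m = IMAGE.match(relative)
    if m is None:
        return None
    return ALIAS_URL + m.group(1)

def alias(url_path: str) -> Optional[str]:
    """Model of the Alias: map a URL path under /XXXXX/ to a file name."""
    if not url_path.startswith(ALIAS_URL):
        return None
    return ALIAS_DIR + url_path[len(ALIAS_URL):]

for request in ("/2015/07/04/Desk.jpg", "/2000/08/01"):
    target = rewrite(request)
    print(request, "->", alias(target) if target else "left for mod_blog")
```

Running it shows the image request resolving to a file under the journal directory, while the plain date URL doesn't match and falls through to mod_blog as before.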

An easy fix that should lighten the load on Apache as it serves up my blog. I'll see in a week if it all goes to plan.

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, which makes the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.