Wednesday, February 13, 2008
Another one of those leaky abstractions
It was something that should have been easy.
Earlier this week, some spammer found a PHP script on one of our servers that gave him unrestricted access to send spam. Twice our server maxed out at 100Mbps sustained output, and it was after the second attempt that I learned the problem could be easily solved by adding the mail() function to the disable_functions directive in the php.ini file. This has the nice benefit of not allowing any PHP script to send mail. Unfortunately, our customers don't see this as a nice benefit, so it's not a long-term solution.
So we need to allow such PHP scripts to run. But the problem we (okay, I) were (was) having was locating the PHP script (or scripts) being abused. When you have scores of sites on the server, isolating the one or two problem scripts is not a trivial problem.
But P found another directive in the php.ini file: sendmail_path. So a simple program (ha!) could be written to log some critical information and pass execution along to sendmail, and thus we could finally locate the problematic PHP scripts.
After thinking about the problem for a bit, I came up with the basics of the script (in pseudocode):
main()
{
  string input = STDIN;
  extract To:, Cc:, Bcc: headers from input;
  extract HOSTNAME environment variable;
  extract PWD environment variable;
  log To, Cc, Bcc, hostname, pwd;
  in,out = pipe();  /* create a unidirectional data pipe */
  fork();           /* creates a new process */
  if (parent-process)
  {
    write(out,input);
    waitfor(child);
    exit;
  }
  if (child-process)
  {
    set STDIN to in;
    exec(sendmail);
  }
}
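For anyone who wants something concrete, here is a minimal Python sketch of just the logging half of that pseudocode. It is not the program I wrote: the function name summarize_message and the key=value log format are made up for illustration, and it assumes the message arrives as ordinary RFC 822 style text.

```python
import os
from email.parser import HeaderParser


def summarize_message(raw_message: str, environ=os.environ) -> str:
    """Build one log line from the recipient headers and the environment.

    The name and the key=value layout are hypothetical, chosen for this sketch.
    """
    # HeaderParser stops at the blank line that ends the headers.
    headers = HeaderParser().parsestr(raw_message)
    fields = []
    for name in ("To", "Cc", "Bcc"):
        # A header may appear more than once; "-" marks a missing header.
        values = headers.get_all(name, [])
        fields.append(f"{name}={','.join(values) or '-'}")
    # HOSTNAME and PWD are the two environment variables named above.
    fields.append(f"hostname={environ.get('HOSTNAME', '-')}")
    fields.append(f"pwd={environ.get('PWD', '-')}")
    return " ".join(fields)
```

Passing the environment in as a parameter is only there so the function can be exercised without touching the real process environment.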
When I tested the program on my workstation, it worked.
So I installed the program on the server in question.
It didn't work.
Oh, it worked when I tested a sample PHP script from the command line, but it failed when executed from the webserver.
Now, the major differences between my workstation and the server are:
- My workstation is a virtual server. The server is not.
- My workstation runs Postfix. The server runs Sendmail.
- My workstation does not have a control panel. The server does.
Any one of those could be the culprit.
Okay, so let's make a simpler program. Over the course of an hour, I ended up with:
main() { exec(sendmail); }
And that still wasn't working through the webserver when P asked a rather stupid question: “Is it a permissions problem?”
The answer was even stupider—yes—it was a permission problem. The location I had selected for the program wasn't accessible from the webserver.
Fix that problem, and now the program just hangs (but does log what I asked it to log).
Well, rather, sendmail was hanging.
And then major surgery on my program started.
Okay, maybe sendmail is attempting to write something and hanging there, so read anything sent back from sendmail. Still hanging.
Okay, maybe sendmail is still expecting more input. I close my side of the pipe after writing. Still hanging.
Okay, it looks like my program is hanging trying to read anything being sent by sendmail, so register a signal handler to catch SIGCHLD (a signal sent when a child process exits) so I can break out of the read() call and clean up. Nope.
Maybe it's the code that's reading stdin; maybe I'm not handling that correctly. Nope.
Run gdb on the spawned sendmail program (I was getting really desperate at this point). Hmm … it's stuck in the read() system call.
That shouldn't be happening. I'm closing my side of the data it's receiving. Unless it's not noticing that the pipe—
AH HAH!
Let me check something. PHP is invoking sendmail with the -i option:
-i
    Ignore dots alone on lines by themselves in incoming messages. This should be set if you are reading data from a file.
    (sendmail manpage)
Hmmm …
Pipes under Unix are not the same as files. Sure, they can be treated as files for the most part, but there are some instances where the abstraction breaks down, and I was hitting such a breaking point.
When reading a file (as in, a real file off a disk), the read() system call returns the number of bytes read, but at the end of the file, it just returns a 0 to indicate no more data. But a pipe doesn't quite work the same way. Once a pipe empties, the next call to read() will cause the calling process to wait until there's more data in the pipe, for as long as anything still holds the writing end open, since a pipe has two ends: a reading end and a writing end.
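The difference is easy to see in a few lines of Python. This is only a sketch: the helper names would_block and compare_file_and_pipe are invented for the example, and select() with a short timeout stands in for "a read() here would wait" so that the demonstration itself never hangs.

```python
import os
import select
import tempfile


def would_block(fd: int, timeout: float = 0.2) -> bool:
    """Return True if a read() on fd would have to wait (no data, no EOF)."""
    ready, _, _ = select.select([fd], [], [], timeout)
    return not ready


def compare_file_and_pipe() -> dict:
    results = {}

    # Regular file: once the data is consumed, read() returns b"" right away.
    with tempfile.TemporaryFile(buffering=0) as f:
        f.write(b"hello\n")
        f.seek(0)
        os.read(f.fileno(), 4096)                      # consume the data
        results["file_read_at_end"] = os.read(f.fileno(), 4096)

    # Pipe: once drained, a read would wait as long as a write end is open.
    r, w = os.pipe()
    os.write(w, b"hello\n")
    os.read(r, 4096)                                   # drain the pipe
    results["pipe_blocks_with_writer_open"] = would_block(r)

    # Only after the last write end is closed does read() report end of file.
    os.close(w)
    results["pipe_blocks_after_close"] = would_block(r)
    results["pipe_read_after_close"] = os.read(r, 4096)
    os.close(r)
    return results
```

On a Unix-like system the file read at the end comes back empty immediately, while the drained pipe reports "would block" until its write end is closed.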
And for some reason, the fact that my wrapper program was closing its end of the pipe wasn't enough to signal to sendmail that there was no more data. My first thought was SIGPIPE, but that signal goes to a process writing to a pipe nobody is reading, not to the reader; what the reader is supposed to get is a 0 from read(), and only once every copy of the writing end has been closed. A forked child inherits its own copy of the writing end, so if that copy stays open across exec(), sendmail never sees the end of its input.
Regardless of what sendmail was doing, it was expecting more input from a pipe that was closed.
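One common way to end up with exactly this symptom, which may or may not be what bit me here, is that fork() hands the child its own copy of the pipe's writing end. If the child never closes that copy, the pipe still has a writer after the parent closes its side, and the reader never sees end of file. The Python sketch below is an illustration under that assumption, not the original wrapper; reader_sees_eof is a made-up name, and a one-second select() timeout stands in for "read() would hang".

```python
import os
import select


def reader_sees_eof(child_closes_write_end: bool) -> bool:
    """Fork a reader child and report whether it ever saw end of file."""
    data_r, data_w = os.pipe()      # parent -> child: the message
    result_r, result_w = os.pipe()  # child -> parent: the verdict

    pid = os.fork()
    if pid == 0:  # child: stands in for the process that would exec sendmail
        try:
            os.close(result_r)
            if child_closes_write_end:
                os.close(data_w)    # drop the inherited copy of the write end
            saw_eof = False
            while True:
                # Wait up to a second instead of blocking forever in read().
                ready, _, _ = select.select([data_r], [], [], 1.0)
                if not ready:
                    break           # no data and no EOF: read() would hang
                if os.read(data_r, 4096) == b"":
                    saw_eof = True
                    break
            os.write(result_w, b"1" if saw_eof else b"0")
        finally:
            os._exit(0)

    # parent: the wrapper
    os.close(data_r)
    os.close(result_w)
    os.write(data_w, b"To: someone@example.com\n\nhello\n")
    os.close(data_w)                # the parent's copy only
    verdict = os.read(result_r, 1)
    os.close(result_r)
    os.waitpid(pid, 0)
    return verdict == b"1"
```

Calling reader_sees_eof(True) shows the reader reaching end of file; reader_sees_eof(False) shows it still waiting after the parent has closed its side.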
A change to the program:
main()
{
  copy STDIN to tempfile;
  extract To:, Cc:, Bcc: headers from tempfile;
  extract HOSTNAME environment variable;
  extract PWD environment variable;
  log To, Cc, Bcc, hostname, pwd;
  fork();
  if (parent-process)
  {
    waitfor(child);
    exit;
  }
  if (child-process)
  {
    set STDIN to tempfile;
    exec(sendmail);
  }
}
and it worked as expected.
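A rough Python equivalent of the tempfile approach looks like the sketch below. It is an illustration rather than the code I released: run_with_tempfile_stdin is a hypothetical name, subprocess.run() does the fork, exec, and wait for us, and stdout is captured only so the result can be inspected. Because the child's standard input is a real file, its read() reaches end of file on its own.

```python
import subprocess
import tempfile
from typing import Sequence


def run_with_tempfile_stdin(
    message: bytes, command: Sequence[str]
) -> subprocess.CompletedProcess:
    """Spool the message to a temporary file and run command reading from it."""
    with tempfile.TemporaryFile() as spool:
        spool.write(message)
        spool.flush()
        spool.seek(0)  # rewind so the child starts reading at the beginning
        # The child inherits the file as its stdin; no pipe is involved.
        return subprocess.run(
            list(command), stdin=spool, stdout=subprocess.PIPE, check=False
        )
```

To try it without a mail server, something like run_with_tempfile_stdin(b"hello\n", ["cat"]) uses cat as a stand-in for sendmail and returns once cat has read to the end of the file.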
Sigh.
Anyway, if anyone else needs such a program, I've released the code.
Update on Monday, April 18th, 2022
I've since taken the code down.