Tuesday, March 24, 2009
Rube Goldberg would have loved modern software development
“In a system of a million parts, if each part malfunctions only one time out of a million, a breakdown is certain.”
Stanislaw Lem
“Project: Leaflet” is a web-based application written in PHP, so there's a dependency on a webserver (in our case, Apache) and PHP itself. There's also a database involved, so throw in MySQL, and, oh heck, Linux to run the whole mess, and you have a pretty standard LAMP stack going on.
Although in our case, we use PostgreSQL (and yes, I have two active versions of “Project: Leaflet”) so that means we have a pretty standard LAPP stack going on.
Now, “Project: Leaflet” is designed to help manage mass emails, like a newsletter, or a product announcement, or perhaps to notify customers of an impending company name change, so it requires an SMTP server of some sort. At first we used whatever SMTP server came with the distribution of Linux we used (at first sendmail but now we also use postfix).
Then came the automatic handling of bounced emails, which meant a bit more involvement with the SMTP server (piping emails sent to a particular address into a program). It also meant parsing the incoming message, and given that there really isn't a standard for bounce messages (or rather, there are quite a few, take your pick), the stock configuration for the SMTP server wasn't going to cut it anymore.
First off, we needed to enable address extensions, and at least under postfix, that's just adding one line to the configuration, but it's still a change from the default install (such a configuration may be enabled by default under sendmail, but we don't run “Project: Leaflet” with sendmail, so it's a moot point). The point of the address extension is to encode the outgoing email address in our return address so we don't actually have to parse the body of the bounce. It also helps with handling confirmation replies.
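To make the address-extension trick concrete, here is a minimal sketch in Python (not the PHP the application actually uses). It folds the recipient's address into the extension part of the return address and recovers it again. The `bounces` mailbox, the `lists.example.com` domain, the `+` delimiter and the `=` stand-in for `@` are all assumptions for illustration. In Postfix the parameter that turns on address extensions is `recipient_delimiter` (for example `recipient_delimiter = +`), though whether that is the exact line used here is a guess.

```python
def encode_return_address(recipient, local="bounces",
                          domain="lists.example.com", delimiter="+"):
    """Fold the recipient's address into the extension of our return address.

    The mailbox name, domain and delimiter are hypothetical defaults.
    """
    user, _, host = recipient.rpartition("@")
    # A local part cannot carry a bare '@', so '=' stands in for it.
    return f"{local}{delimiter}{user}={host}@{domain}"


def decode_return_address(address, delimiter="+"):
    """Recover the recipient from an address built by encode_return_address.

    Returns None when the address carries no usable extension.
    """
    local_part = address.rpartition("@")[0]
    _, found, extension = local_part.partition(delimiter)
    if not found or "=" not in extension:
        return None
    # Host names never contain '=', so the last '=' marks the original '@'.
    user, _, host = extension.rpartition("=")
    return f"{user}@{host}"
```

When a bounce comes back to an address like this, the envelope recipient alone says who bounced, whatever format the body of the bounce happens to be in.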
But there's still the issue of parsing the headers from the incoming email, and well, I know procmail, and I know that's what it's there for, to help in dealing with incoming emails. I mean, it's there right? (Well, actually, not by default any more. And that means yet more changes to the SMTP configuration to enable procmail processing). And yes, the syntax is horrible, but it's the very model of clarity compared to some other syntaxes you come across on a Unix system (like, oh, sendmail.cf anyone?).
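For comparison, the header-parsing step can be sketched in Python using only the standard library. This is not the procmail recipe the application uses; it only illustrates what such a step has to do. The assumptions are loud ones: that the delivering server records the full extended address in a `Delivered-To`, `X-Original-To` or `To` header, and that `+` is the delimiter.

```python
from email import message_from_string
from email.utils import parseaddr


def bounce_extension(raw_message, delimiter="+"):
    """Return the address extension a bounce was delivered to, or None.

    The header names checked here are an assumption about what the
    local mail server adds on delivery; adjust them for your setup.
    """
    msg = message_from_string(raw_message)
    for header in ("Delivered-To", "X-Original-To", "To"):
        value = msg.get(header)
        if not value:
            continue
        # parseaddr strips any display name and angle brackets.
        address = parseaddr(value)[1]
        local_part = address.rpartition("@")[0]
        _, found, extension = local_part.partition(delimiter)
        if found and extension:
            return extension
    return None
```

Only the headers are consulted, which is the whole point: the body of the bounce never has to be understood.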
So now the dependencies for “Project: Leaflet” are: Linux, Apache, PostgreSQL, PHP, sendmail or postfix (with some custom configurations) and procmail.
Oh, and then there's the change from a few weeks ago, where I made it such that a single copy of the code can serve multiple instances (clients) on a single server. I did that not to save disk space (heavens, no! The uncompressed source code doesn't even come close to a megabyte of disk space) but administrative overhead—I can update the code in one location per server instead of, say, 25 individual copies.
But that necessitated some changes in layout and assumed locations of files and well … let's just say that while a company could take our code base (and at one point the topic of releasing the code as Open Source™ did come up), it's less and less likely that an average website owner would be able to install this program, given the dependencies and somewhat custom configurations made thus far.
Now, this version (the “Install Once And Use Everywhere On A Server” version) mandated yet another “feature”—support of a per-client IP address.
We have the IP addresses. No problem there.
And Linux supports multiple IP addresses. No problem there.
And Apache supports multiple IP addresses. No problem there.
sendmail/postfix support multiple IP addresses. No probl—
Uh … hold on a second …
Er, mu.
Incoming email isn't an issue—sendmail/postfix can be set up to listen on all interfaces. It's outgoing email that's an issue.
See, it's all too easy for certain ISPs to blacklist IP addresses that send an excessive amount of spam (and that happens to us all the time), so if that happens, we want the customer's assigned IP address to get banned, and not the IP address of the server (otherwise, all our customers on said server have “an issue”).
But, when sendmail/postfix send email, they use the default server IP address, which isn't what we want.
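Underneath, "sending from a particular IP address" means binding the outgoing socket to that address before connecting; otherwise the kernel picks the default one. The following is a minimal Python sketch of that mechanism only, not a mail server configuration. The `connect_from` helper is a hypothetical name, and any address passed to it must already be assigned to one of the machine's interfaces.

```python
import socket


def connect_from(source_ip, host, port, timeout=10.0):
    """Open a TCP connection to (host, port) that originates from source_ip.

    source_ip must be an address already configured on this machine.
    """
    # Port 0 lets the kernel choose any free local port on that address.
    return socket.create_connection(
        (host, port), timeout=timeout, source_address=(source_ip, 0)
    )
```

A mail server that can do per-client outgoing addresses has to make the equivalent choice for every delivery, based on who the message is from.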
So, mu.
After some searching, we found that exim will do what we want (great! Yet another SMTP server we need to learn and support), but not out of the box (nor will it support address extensions, virtual hosts or procmail, so there are quite a few configuration changes going on).
So, this little PHP app I wrote requires Linux, Apache, PHP (of course), PostgreSQL, exim and procmail. Talk about your debugging nightmare.
Today, I spent several hours tracking down an issue where emails were being sent out via the server's primary IP address and not the client's IP address. That meant I had to track the issue through half the chain there: PHP, exim and procmail. I ended up chasing (now that I look at it after the fact) a potential red herring.
The problem was compounded by the fact that it worked for the test account.
I fixed the issue by replacing the call to the PHP mail() function (which calls the external program sendmail—which isn't really sendmail but a simple replacement provided by exim, which injects the email directly into the outgoing queue) with a bunch of PHP code that talks to the local email server via SMTP.
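The shape of that change, handing the message to the local server over SMTP instead of through a sendmail-style binary, looks roughly like this in Python. It is a sketch, not the PHP code that went into the application. The function names, `localhost` and port 25 are assumptions. The envelope sender is passed explicitly, separately from the `From:` header, because it is what a mail server sees in the SMTP conversation.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text message; these are header values only."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_local_smtp(msg, envelope_from, host="localhost", port=25):
    """Submit msg to a local SMTP server instead of a sendmail binary.

    envelope_from becomes the MAIL FROM address of the SMTP session;
    the recipients are taken from the message's To/Cc/Bcc headers.
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg, from_addr=envelope_from)
```

Calling `send_via_local_smtp(build_message(...), "bounces+jane.doe=example.net@lists.example.com")` would need a mail server listening locally, so it is left here as a usage illustration only.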
That worked. Only later, I realized that the client was using the incorrect email address for sending emails, which was probably why the primary IP address was showing up when it shouldn't (and if that makes no sense to you, Welcome to My World!).
It certainly feels as if the modern web development world is held together with chewing gum and baling wire (and we're all out of chewing gum) and nobody really understands what's going on underneath the covers (and frankly, we're as scared of it as we are of seeing how sausage is made).