The Boston Diaries

Monday, April 28, 2008

Funny how I haven't seen this anti-spam technique bandied about

The past month has been a continual fight with email. It got to the point where I sat down and designed an entirely new email system in the hopes that it would stop spam once and for all (based upon some ideas from Dan Bernstein), and I've been mulling it over ever since, trying to find flaws in it.

And find I did.

The system involves three players: the mail client (MC, an MUA in SMTP talk), an outgoing mail server (OMS) and an incoming mail server (IMS; under SMTP, there's a single server, called an MTA, that handles both). There's one protocol between an MC and an OMS, another protocol between an MC and an IMS, and two protocols for communication between an OMS and an IMS. There are also restrictions on what can talk to what: an OMS can only talk to the designated IMS for a domain (much like sending email to an MX record). Conversely, an IMS can only accept connections from a designated OMS for the sender's domain (much like what SPF tries to do). An MC needs to be authenticated to the IMS/OMS (much like you need authentication for POP and IMAP to receive email, and some sites now require SMTP AUTH for outgoing email).

Yes, I'm glossing over a lot of details here, but that's the overview. ISPs would still filter mail client traffic, much like they do now. The enforcement of the sending server would pretty much stop joe jobs, and using a notification scheme with the mail spooled on the sender side would eliminate most, if not all, bounce messages.

So far, it seemed like a great scheme.

Until I realized that spammers would then just register tons of domains (or cut deals with domain name registrars to use “just expired but we'll keep it active until we can sell it again” domains) to send spam.

So the only thing I really did was find a way to stop joe jobs; it doesn't really stop spam all that much, and thus, the flaw.

But one remark from Wlofie (I ran the whole system past him a few days ago) leapt out at me: server signatures (one of the optional bits in one of the protocols was a digital signature of the sender; Wlofie suggested I include a digital signature of the server as well). We already have server signatures for websites. And when I realized that, I realized a solution for spam. And one that can be adapted to our current email system.

First, revise the SMTP specification. Remove literal network addressing; that is, the ability to send email to an arbitrary IP address is no longer allowed. If the host portion of an email address does not have an MX record, the email can't be delivered. On the recipient side, make the use of SPF records mandatory, and they must be checked. Also, revise the SPF specification to remove the “SoftFail” and “Neutral” results.

This last step is the controversial one (as if the others weren't already)—SMTP servers must have a signed secure certificate and the protocol must be run over an encrypted channel, similar to how HTTPS works. And if either side has an expired or revoked certificate, the other side must refuse email.
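The rules above can be sketched in C as a pair of decision functions, one for each side of a connection. This is only a rough illustration, not a real SMTP or SPF implementation: every name here (spf_result, cert_state, verdict, may_send, may_accept) is made up for the example. Treating a missing SPF record as a rejection, a temporary SPF error as "retry later," and a self-signed certificate as untrusted are assumptions about how the revised rules might be applied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* SPF outcomes left once "SoftFail" and "Neutral" are removed
 * (illustrative names, not taken from any real library). */
enum spf_result { SPF_NONE, SPF_PASS, SPF_FAIL, SPF_TEMPERROR, SPF_PERMERROR };

/* Hypothetical summary of what the TLS handshake said about the peer. */
enum cert_state { CERT_VALID, CERT_EXPIRED, CERT_REVOKED, CERT_SELF_SIGNED };

enum verdict { VERDICT_ACCEPT, VERDICT_REJECT, VERDICT_RETRY_LATER };

/* True if the host part of addr is a literal such as user@[192.0.2.1]. */
bool is_address_literal(const char *addr)
{
  const char *at = strrchr(addr, '@');
  return at != NULL && at[1] == '[';
}

/* Sending side: no literal addresses, an MX record is required,
 * and the receiving server must present a good certificate. */
enum verdict may_send(const char *rcpt, bool host_has_mx, enum cert_state peer)
{
  if (is_address_literal(rcpt)) return VERDICT_REJECT;
  if (!host_has_mx)             return VERDICT_REJECT;
  if (peer != CERT_VALID)       return VERDICT_REJECT;
  return VERDICT_ACCEPT;
}

/* Receiving side: the sending server needs a good certificate
 * and the SPF check is mandatory. */
enum verdict may_accept(enum spf_result spf, enum cert_state peer)
{
  if (peer != CERT_VALID)
    return VERDICT_REJECT;

  switch (spf)
  {
    case SPF_PASS:      return VERDICT_ACCEPT;
    case SPF_TEMPERROR: return VERDICT_RETRY_LATER; /* assumed: DNS hiccup, try again */
    default:            return VERDICT_REJECT;      /* none, fail, permanent error */
  }
}
```

With only "pass" leading to acceptance, there is no grey area left for a receiving server to wave mail through.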

What does this gain us?

Accountability.

Getting spam from a few hundred domains? Find out who sold them the signed certificate and send the complaint there. After a few hundred (thousand? hundred thousand?) complaints, the easiest way to handle the situation is to revoke the certificate. Sure, the spammer can try bribing the certificate authority, but that's exactly what's missing from today's anti-spam techniques: hitting the spammer where it hurts! And if the spammer tries to use a self-signed certificate? Who would trust it?

Sure, it's an expense to get a signed certificate, but in today's reality, you are either an individual using someone else's server for email (Gmail, Yahoo, your ISP) or you're a business and can afford it (as part of your hosting bill, or just outright, but hey, it's a business expense and can be written off).

I must have forgotten my cardboard programmer that day …

D'oh!

Smirk called today, saying a customer had a problem sending mail with one of their PHP scripts. The server in question was running my PHP/sendmail wrapper and the testing that Smirk did showed that the PHP mail() function wasn't returning anything! Funny, for a function that supposedly returns a bool …

With Wlofie playing the part of cardboard programmer, I did some testing, found that indeed, there was a problem—at first, it looked like the system was terminating the program with random signals. One time it would terminate with a SIGXFSZ, then with SIGTTIN, then with SIGWINCH!

I then stared at the code until I bled …

else /* parent process */
{
  pid_t cstat;
  int   status;

  cstat = waitpid(child,&status,0);

  if (WIFEXITED(cstat))
    rc = WEXITSTATUS(cstat);
  else
    rc = EXIT_FAILURE;

  unlink(tmpfile); /* make sure we clean up after ourselves */
  exit(rc);
}

It was then I saw my mistake: I was checking the wrong variable!

Sigh.

The type of mistake a statically typed language should catch. And before you say “unit testing,” my tests were basically “did the email go through? Yup? Then it works.” The thought to check the return code of my program as a whole didn't occur to me (hey, the email got through, right? That meant that it worked).

I changed WIFEXITED(cstat) to WIFEXITED(status) and WEXITSTATUS(cstat) to WEXITSTATUS(status) (waitpid() returns the child's process ID; the exit information ends up in status) and it worked.
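For reference, here is a minimal, self-contained sketch of the corrected pattern. It is not the actual wrapper code: wait_for_child() and spawn_exiting_child() are hypothetical names made up for the illustration, and retrying on EINTR is an extra the snippet above doesn't have.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for `child` and turn its wait status into an exit code.
 * waitpid() returns a process ID (or -1); the W* macros must be
 * applied to the int it fills in, not to that return value. */
int wait_for_child(pid_t child)
{
  pid_t pid;
  int   status;

  do
    pid = waitpid(child, &status, 0);
  while (pid == (pid_t)-1 && errno == EINTR); /* retry if interrupted */

  if (pid == (pid_t)-1)
    return EXIT_FAILURE;           /* waitpid() itself failed */

  if (WIFEXITED(status))
    return WEXITSTATUS(status);    /* normal exit: pass the code along */

  return EXIT_FAILURE;             /* killed by a signal, stopped, etc. */
}

/* Hypothetical test helper: fork a child that exits with `code`
 * and report what the parent decodes. */
int spawn_exiting_child(int code)
{
  pid_t child = fork();

  if (child == (pid_t)-1)
    return EXIT_FAILURE;           /* fork() failed */
  if (child == 0)
    _exit(code);                   /* child: exit immediately */

  return wait_for_child(child);
}
```

Checking that spawn_exiting_child() hands back the same code the child exited with is the kind of test that would have flagged the wrong-variable bug, since it looks at the return code rather than at whether the email arrived.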

I also checked PHP, and for the life of me, I can't figure out why it was returning undef, but then again, PHP is the scripting language du jour so it may be I didn't check the precise version we're running.