Thursday, July 22, 2010
Those email server blues
I'm concerned that eventually it will no longer be possible to run a private email server and that everyone will end up using Gmail, Yahoo or MySpaceFaceBook because that's the only way we will be able to get email.
Occasionally Dad will call asking why his email to me is bouncing, and every time I check, it's because AOL is taking the forced transitory failure (as generated by my greylist daemon) as “I can't deliver this in one shot, so of course that email address is bogus.” So I've had to whitelist all of AOL.
I had a similar problem with MyFaceSpaceBook. One or two transitory failures and my email address is considered bogus. Another whole swath of IP addresses whitelisted.
Then Corsair writes in about his emails to me being bounced.
Sigh.
Corsair's case I can't really figure out. From the logs:
Jul 18 04:28:36 brevard gld: [98587] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 18 08:28:36 brevard gld: [98799] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 18 12:28:37 brevard gld: [99052] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 18 16:28:37 brevard gld: [99309] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 18 20:28:38 brevard gld: [99491] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 00:28:38 brevard gld: [99675] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 04:28:39 brevard gld: [99944] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 08:28:39 brevard gld: [100234] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 12:28:40 brevard gld: [100509] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:00:38 brevard gld: [100595] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:01:09 brevard gld: [100596] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:05:38 brevard gld: [100604] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:06:09 brevard gld: [100605] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:13:09 brevard gld: [100610] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:13:40 brevard gld: [100613] tuple: [XXXXXXXX.194 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:24:24 brevard gld: [100629] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 14:41:18 brevard gld: [100641] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] GRAYLIST GREYLIST
Jul 19 15:06:38 brevard gld: [100678] tuple: [XXXXXXXX.195 , XXXXXXXXXXXXXXXXXXXXXXX , sean@conman.org] ACCEPT WHITELIST
His email should have gone through on the second attempt, as it came only four hours after the first, which is less than the six hours it takes to purge unreferenced tuples. Others I can explain: the third because it's from a different IP address, and the fourth because it definitely is past the six-hour lifetime of an unreferenced tuple. Same for the fifth, but again, I can't explain why it wasn't accepted by the sixth attempt.
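The reasoning above can be replayed as a small model. This is only a sketch, not the daemon's actual code: it assumes a tuple is accepted when the same tuple was referenced less than six hours earlier, that every attempt refreshes the tuple's timestamp, and that there is no minimum retry delay. The function name `replay` and the use of the IP suffix as the tuple key are made up for illustration (the sender and recipient are identical in every log line).

```python
from datetime import datetime, timedelta

# Six hours is the tuple lifetime the post says the daemon reports.
LIFETIME = timedelta(hours=6)

def replay(attempts, lifetime=LIFETIME):
    """Return the decision a simple greylist model makes for each attempt."""
    last_seen = {}   # tuple key -> time the tuple was last referenced
    decisions = []
    for when, key in attempts:
        seen = last_seen.get(key)
        if seen is not None and when - seen < lifetime:
            decisions.append("ACCEPT")    # known tuple, not yet expired
        else:
            decisions.append("GREYLIST")  # new tuple, or expired and purged
        last_seen[key] = when             # assumption: every attempt refreshes it
    return decisions

# The first six attempts from the log above (year taken from the post's date).
attempts = [
    (datetime(2010, 7, 18, 4, 28, 36), ".195"),
    (datetime(2010, 7, 18, 8, 28, 36), ".195"),
    (datetime(2010, 7, 18, 12, 28, 37), ".194"),
    (datetime(2010, 7, 18, 16, 28, 37), ".195"),
    (datetime(2010, 7, 18, 20, 28, 38), ".194"),
    (datetime(2010, 7, 19, 0, 28, 38), ".194"),
]

for (when, key), decision in zip(attempts, replay(attempts)):
    print(when, key, decision)
```

Under those assumptions the model accepts the second and sixth attempts and greylists the rest, which is exactly where the real log disagrees.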
My initial thought was that it had something to do with searching the tuple list. I recently rewrote the binary search code so it was not only half the size, but much clearer, but maybe it doesn't work. Maybe I missed some subtle boundary condition.
50,000,000 tests later (no, really!), and no, both the old binary search routine and the new binary search routine return identical results. If there is a corner case, 50,000,000 random tests were not enough to reveal it. So I doubt it's that code.
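For anyone curious what that kind of comparison looks like, here is a minimal sketch of a differential test. The two search routines are stand-ins written for this example (the daemon's real routines aren't shown here), and the sizes, value ranges, and the name `differential_test` are arbitrary choices. Both stand-ins return a (found, index) pair, where the index is the insertion point when the key is missing.

```python
import random

def search_old(items, key):
    """Stand-in 'old' routine: closed-interval binary search."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == key:
            return True, mid
        if items[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, lo  # lo is where key would be inserted

def search_new(items, key):
    """Stand-in 'new' routine: half-open lower-bound search."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return (lo < len(items) and items[lo] == key), lo

def differential_test(trials, seed=0):
    """Return the first (items, key) where the routines disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randint(0, 50)
        # Unique values, so a found key has exactly one valid index.
        items = sorted(rng.sample(range(100), size))
        key = rng.randint(-1, 100)  # includes keys below and above every item
        if search_old(items, key) != search_new(items, key):
            return items, key
    return None

print(differential_test(100_000))
```

Random testing like this can only show that no disagreement turned up in the cases tried; it says nothing about inputs outside the generated range.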
About the only thing I can think of (and I haven't tested this) is that the timeout for old tuples is not what I think it is, but when I query my greylist daemon it returns a value of six hours for the lifetime of a greylisted tuple.
In any case, I whitelisted Corsair's email address. And I'm pondering why I even run my own email server any more …