The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, May 11, 2015

SPF might not be worth handling, but what about RBL?

A month ago, I re-evaluated the use of SPF as an anti-spam measure and found it wanting. Today, I decided to re-evaluate my stance on the various real-time blackhole lists (RBLs) that exist. I was reluctant to use an RBL because over-aggressive classification of even the smallest infractions could lead to false positives (wanted email being rejected as spam). It has been over a decade since I first rejected the idea, and I was curious to see just how it would all shake out.

I used the Wikipedia list of RBLs as a starting point, figuring it would be pretty up-to-date. I then dumped information from my greylist daemon. The idea is to see how much additional spam would be caught if, after getting a “GO!” from the greylist daemon, I do an RBL check.

Out of the current 2,830 entries, only 145 had not been whitelisted. I didn't filter these out before running the test, but I don't think it would throw off the results too much. Half an hour of coding later, and I had a simple script to query the various RBLs for each unique IP address (1,446). I let it run for a few hours, as it had quite a few queries to make (1,446 IP addresses, each one requiring one query to see if the IP address is a known spammer, and a possible second one for the reason, across 45 RBL servers; it took a while).
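For anyone who hasn't done an RBL lookup by hand, here is a minimal sketch in Python of what one such query looks like. This is not the script used for the test; the function names and the `rbl.example.org` zone are made up for illustration. It assumes the usual DNS blacklist convention: reverse the octets of the IPv4 address, append the list's zone, and ask for an A record. An answer means “listed” and a failed lookup means “not listed.” The optional second query for the reason is a TXT lookup on the same name, which needs a DNS library and is left out here.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the RBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the RBL answers with an A record for this address.

    `resolve` is injectable (an assumption of this sketch) so the logic
    can be exercised without touching the network.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # Lookup failed (typically "no such name"): treat as not listed.
        return False
    return True
```

Calling `is_listed("192.0.2.1", "rbl.example.org")` would look up `1.2.0.192.rbl.example.org`. Note that this sketch also treats a DNS timeout as “not listed,” which may or may not be what you want.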

First up, how many “spam” results did I get from each RBL:

Results from each RBL
RBL	hits	reasons given

As you can see, some of them were not worth querying. Also, about list.quorum.to: it's not straightforward to use that server, as it always sent back a result even when the others did not. I ultimately decided to treat any result whose only “hit” came from list.quorum.to as “non-spam” because of these issues.
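The rule for the always-answering server can be sketched in a few lines of Python. The names here are hypothetical and this is only one way to express the decision: a hit counts only if at least one list other than list.quorum.to reported it.

```python
# The one list whose answers are not trusted on their own.
UNRELIABLE_ALONE = frozenset({"list.quorum.to"})


def classify(hits):
    """Classify one address given the names of the RBLs that reported a hit."""
    # Drop hits from the unreliable list; whatever remains decides the result.
    trusted_hits = set(hits) - UNRELIABLE_ALONE
    return "spam" if trusted_hits else "non-spam"
```

So `classify(["list.quorum.to"])` gives `"non-spam"`, while a hit from that server plus any other list still counts as spam.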

I then proceeded to pore through all 2,830 results.

Email classification from RBLs
Marked as SPAM	2,739	97%
Not marked as SPAM	91	3%
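As a quick sanity check on those figures (2,739 marked and 91 not marked, out of 2,830), a few lines of Python confirm that the counts add up and that the percentages are rounded to the nearest whole number:

```python
total = 2830
marked = 2739
not_marked = 91

# The two rows should account for every entry.
assert marked + not_marked == total

pct_marked = round(100 * marked / total)          # 96.78... rounds to 97
pct_not_marked = round(100 * not_marked / total)  # 3.21... rounds to 3
```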

And out of the 91 that were not marked as spam, only 7 were spam not marked by any of the RBLs. Not bad. But the real test is false positives: email marked as spam that isn't. And unfortunately, there were a few:

False positives

Now, I realize that some of my readers might very well consider email from Twitter or Facebook as spam, but hey, don't judge me!


Anyway, that's a problem for me. I will occasionally have issues with the greylisting in some cases (rare, but it does happen, and I have to explicitly authorize the email when I become aware of the issue) but it's even worse with this. For instance:

It's hit-or-miss within the IP range Facebook uses to send email. This would make troubleshooting quite difficult. I could whitelist the problematic domains but for any new site I might want to receive email from, I would have to watch the logs very closely for issues like this. But it's not as bad as I thought it would be, and it would cut out a lot of the spam I do get. It's tempting.

I shall have to think about this.

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site:, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.