The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Saturday, September 01, 2007

March of the Bugs

August was a bad month for me, programming-wise.

The code I wrote for Mark was broken. Or rather, it worked under ideal conditions, not The Real World™, and I forgot that TCP connections can stream as little as one byte of data per packet. Not normally an issue because of buffering, but the code I was writing was the buffer implementation (which is pretty much turned on its head in Seminole due to the resource constraints in embedded systems).
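A minimal sketch of the short-read problem, assuming a plain POSIX file descriptor (this is not the Seminole buffer code, and `read_fully()` is a made-up name). A single `read()` on a TCP socket may hand back anything from one byte up to the amount asked for, so the caller has to loop:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until `want` bytes have arrived,
 * the peer closes the connection, or a real error occurs.
 * Returns the number of bytes stored in buf, or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t want)
{
  unsigned char *p   = buf;
  size_t         got = 0;

  assert(buf != NULL);

  while (got < want)
  {
    ssize_t n = read(fd, p + got, want - got);

    if (n > 0)
      got += (size_t)n;     /* short read: go around again for the rest */
    else if (n == 0)
      break;                /* end of stream before `want` bytes showed up */
    else if (errno != EINTR)
      return -1;            /* real error; EINTR just retries */
  }

  return (ssize_t)got;
}
```

A return value smaller than `want` means the stream ended early, which the caller still has to treat as its own case.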

Then there was the Graylist fiasco (just today, the code I've written only works on some machines; on others it fails due to some UDP issue I'm missing). That led into the problems I had with mod_blog, and spending more time than I wanted fixing those issues.

Sigh.

Hopefully September will be a better month, programming-wise.

Sunday, September 02, 2007

DS Action—Helping families dealing with Down Syndrome in San Diego

This entry is not so much a call to action (although if you are in the San Diego, California area they could certainly use the help) as it is a way to boost the visibility on the web of DS Action, a resource website for helping families in San Diego dealing with Down Syndrome.

Yes, a rather odd thing to write about here, but someone on a mailing list I subscribe to has helped set up DS Action since one of his own kids has Down Syndrome, and it's the least I could do to help.

Monday, September 03, 2007

Ah, nothing like laboring on Labor Day

A day playing D&D (First Edition—old school here) at Casa New Jersey was only marred by having to deal with a hacked server at The Office.

The perpetrator was good, uploading a script which, when run, changed to the root directory, deleted its own code (it was in Perl) and renamed its various processes after processes currently running on the box, so it was difficult to know which website (and the errant processes were running under the webserver userid) was compromised.

Even worse, the scripts were spamming! Both the email and the website kind.

Sigh.

Tuesday, September 04, 2007

Optimistic results in testing graylisting

Yup, August was definitely a bad month for coding.

Spent most of the day cleaning up the code for the greylist project. The first major problem was incorrectly calling recvfrom(). So while I got the data properly, the IP address of the remote side got munged (oh, I should mention that I'm writing a standalone graylist server accessible over the network). The second major problem was a memory problem (not unexpected when writing in C) that would only show up after some 1,000 requests came in, but at least was very consistent (same place every time). It took a few hours to realize I confused the purpose of two different arrays.
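The entry doesn't say what the recvfrom() mistake was. One common way to get the data but a munged remote address is forgetting that the last argument is a value-result parameter: it must hold the size of the address buffer going in. A sketch assuming an IPv4 UDP socket, with `recv_request()` and `remote_ip()` as made-up names:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: receive one datagram and record who sent it. */
ssize_t recv_request(int sock, void *buf, size_t size, struct sockaddr_in *remote)
{
  socklen_t len = sizeof(*remote);  /* in: room available; out: bytes filled in */

  assert(remote != NULL);
  memset(remote, 0, sizeof(*remote));
  return recvfrom(sock, buf, size, 0, (struct sockaddr *)remote, &len);
}

/* Render the sender's address as dotted-quad text; returns dst or NULL. */
const char *remote_ip(const struct sockaddr_in *remote, char *dst, socklen_t size)
{
  return inet_ntop(AF_INET, &remote->sin_addr, dst, size);
}
```

Leaving `len` uninitialized, or passing the size of a pointer instead of the structure, lets the kernel truncate or skip the address while the payload still arrives intact.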

Yeah, August—bad month.

On the good side though, the code is shaping up. The greylist concept works around three pieces of information—the sender's IP address, the sender email address and the recipient email address, stored as a tuple. I've been recording such tuples on my email server (testing the Postfix interface) and have a testbed of 21,200 tuples to test (I've also found out I average about 1,800 emails per day).

To stress test the program, I've been pumping all 21,200 tuples through the code as fast as possible (most of the run time is spent just logging what comes through), and under the worst settings (setting what I call the “embargo timelimit” to just one second instead of the recommended one hour), only 93 tuples (not emails mind you, just IP, sender and recipient) made it through to the whitelist.

That's only 0.4%.

Not bad.

Of course, my server is just accepting all incoming emails, so some spam could be coming from what I'm calling “legitimate servers” (“legitimate servers” are those that actually requeue and deliver the email at a later time) so the final amount might be a bit larger, but so far I'm optimistic this will drastically cut the amount of spam I receive.

Wednesday, September 05, 2007

Some notes from a running graylist server

Man, my email seems eerily quiet now that I'm running the greylist daemon.

I've also identified several problems—nothing related to the code per se, but to some unintended consequences of competing anti-spam measures (I assume it's an anti-spam measure).

On at least two mailing lists I'm on, the sender address (the one given in SMTP) is unique for every message sent. And to make matters worse, one particular mailing list (it's a Yahoo Group) has come from over 50 different IP addresses. What I'm afraid of is the following scenario: a message M comes from IP I1 with sender email address S1 and gets told to try again later, and when it does, it comes from IP I2 with sender email address S2 and thus, I never get the message (even if S doesn't change, the IP address might, and that will still cause problems).

To get around that, I've implemented an IP whitelist, but now the trick is identifying all the network blocks to whitelist. So far, I've whitelisted IP addresses from AOL (two /16 blocks), BellSouth (two /18 blocks), and Yahoo (a /18 block and three /19 blocks), plus some miscellaneous servers (like my server at Casa New Jersey, just in case).
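Checking an address against a list of network blocks comes down to masking off the host bits and comparing. A sketch, not the daemon's actual code: `struct ipblock` and `ip_whitelisted()` are made-up names, and addresses are assumed to be IPv4 in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One whitelisted network block (hypothetical layout). */
struct ipblock
{
  uint32_t net;   /* network address, e.g. 0x0A140000 for 10.20.0.0 */
  int      bits;  /* prefix length: 16 for a /16, 18 for a /18, ... */
};

/* Netmask for a prefix length of 0..32. */
static uint32_t mask_of(int bits)
{
  assert(bits >= 0 && bits <= 32);
  return (bits == 0) ? 0 : (uint32_t)0xFFFFFFFFu << (32 - bits);
}

/* True if ip falls inside any of the n blocks in list. */
bool ip_whitelisted(uint32_t ip, const struct ipblock *list, size_t n)
{
  for (size_t i = 0; i < n; i++)
  {
    uint32_t mask = mask_of(list[i].bits);

    if ((ip & mask) == (list[i].net & mask))
      return true;
  }
  return false;
}
```

A linear scan is fine for a few dozen blocks; a much longer whitelist would probably want a sorted table or a trie instead.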

Update on Thursday, September 6th, 2007 at 1:40 am

Yup, the mailing lists are going to be very problematic.

Thursday, September 06, 2007

More notes on a graylist daemon

Man, my email is still scarily empty. One thing to keep in mind—until yesterday, I did no filtering for spam. Nothing. I got it all. And now?

It's like spam doesn't even exist.

Even though I started yesterday (just before midnight) I really only have about twelve hours of actual stats since there were a few problems with the program (more on that in a bit). But, in the past twelve hours, 2,251 emails came in, and only six made it through to the whitelist, a rate of about 0.3%.

Not bad.

And of the six, two were spam, but only one got through because the other one was sent to a nonexistent account.

Now, the problems.

Problem one—the default embargo time was set too high. I only found this out after running the program for an hour. So that was a restart.

Problem two—the program crashed when dumping the tuples, and I don't know why. The program responds to the signal USR1 by calling fork() (which creates a duplicate of the running process); the parent process then goes back to servicing requests while the child process generates the dump file and exits. I do it this way for two reasons; 1) the program can still process requests and 2) since the child process gets a separate copy of everything, it can dump the data without the parent changing the data as it runs. The dump went fine; it is when the child process ends that the program goes into the weeds.
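The fork-then-dump idea can be sketched on its own, outside the daemon. This is an illustration under assumptions, not the graylist code: `dump_in_child()` is a made-up name, the data is a plain array of integers, and the output goes through stdio rather than the daemon's own stream library.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `count` integers to `path` and exits.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t dump_in_child(const char *path, const int *data, size_t count)
{
  pid_t  child;
  FILE  *out;
  size_t i;

  assert(path != NULL);
  fflush(NULL);              /* keep unflushed stdio data out of the child */

  child = fork();
  if (child != 0)
    return child;            /* parent, or fork() failure: back to work */

  /* child: it sees its own copy of data[] as of the moment of fork() */
  out = fopen(path, "w");
  if (out == NULL)
    _exit(1);

  for (i = 0; i < count; i++)
    fprintf(out, "%d\n", data[i]);

  fclose(out);
  _exit(0);                  /* leave without running the parent's atexit() handlers */
}
```

The parent can change the data right after the call without affecting the dump. It does still have to reap the child at some point, with `waitpid()` or a `SIGCHLD` handler, or the finished child lingers as a zombie.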

Now, I've used similar code in another daemon I wrote, and it worked on the development server, but not on my email server. There is a slight difference in Linux kernels between the two, but the major difference between the two—the mail server is a virtual server. Perhaps the calls to _exit() were problematic on the virtual server? Change them to exit() (there's a slight difference between the two, the details of which aren't important, because—).

And that didn't work. Program still crashed (and mind you, each of these tests takes just over an hour to do). It could just be a problem with fork() and signal semantics under the virtual server. To work around that problem, I just removed the call to fork() and have the single process do the dump (ignoring any requests in the meantime).

And that seems to have done the trick to keep it up and running.


A disturbing lack of spam poetry

The graylist daemon is working. And that's great, but there's one downside to it that I've just now realized: I'll no longer get spam poetry.

In other words—amusing subject lines, like:

(yes, they're all real subject lines from spam I've received over the years)

Alas, I guess I'll have to live without spam poetry from now on.

Friday, September 07, 2007

Some more stats about graylisting

Some stats from the graylist daemon. I'm automatically accepting emails from Yahoo, AOL and BellSouth, plus three other servers that run problematic mailing lists—all other emails are going through the graylisting process. I've been running the daemon for almost 48 hours, with an embargo time of one hour, and cleaning out records that haven't seen activity for 12 hours (I check this every five minutes).

Current Graylist statistics
tuples 2684
graylisted 8217
whitelisted 22
graylist expired 5533
whitelist expired 0

About 0.3% of all emails that are subjected to the graylist process make it through to the whitelist. Of those 22 that got through, eight were spam (amazing! Spammers using a real SMTP server!) and of those, four were delivered to an actual in-use email address.

Not bad, considering I was averaging a few hundred (maybe even as high as a thousand) per day.

Another interesting bit—I limit the size of email addresses to 108 characters and so far, only one address has been truncated, yet still made it to the whitelist (it was a notification from Linked In) in 2½ hours despite coming from four different IP addresses.

It appears to be working rather well, although I do have some notes about stuff I'd like to add:

Next up—getting this implemented on one of The Company's domains.

Saturday, September 08, 2007

Notes on an ideal integrated development system

I'm not a fan of IDEs. I grew up with the “edit-compile-run” cycle of development, and while I didn't always have a choice in the “compile” portion of things, I did in the “edit” portion, and over time became very picky about which editor I use. Because of that, whenever I did try an IDE, I invariably found the “edit” portion to be very painful, stuck in an editor that I wasn't used to; being forced to use an unfamiliar editor resulted in a vast loss of productivity and thus, I've never liked IDEs. So I stuck with the “edit-compile-run” cycle.

But the recent bout of programming I've done has made me wish for something better than the “edit-compile-run” cycle. And while IDEs have probably evolved since I last tried them in the late 80s, I don't think they've evolved enough to suit me.

What I'm about to describe is definitely “pie-in-the-sky” stuff. I'm not saying that IDEs must be this way—I'm just saying that this is what I would like in an IDE. Who knows? Maybe this won't work. Maybe it's unworkable. But I wouldn't mind seeing these features (as long as the editing could be configured to my liking).

A database I used in the early 80s that ran on a twin floppy PC. Written by Brian Berkowitz and Richard Ilson

Wonderful features were:

Cornerstone

The one feature of Cornerstone that still strikes me as innovative is the separation of variables (or in this case, fields and tables) from their name. One could change the name of a variable without having to edit every other occurrence of that name. That's a very powerful feature, but to implement it in an IDE, that IDE would have to have intimate knowledge of the computer language being used.

A few years ago, I cleaned up the code in mod_blog. I had a bunch of global variables used throughout the codebase, all starting with “g_” (such as g_rssfile) but they weren't variables in the traditional sense, they were more or less “run-time settable constants” (to the rest of the codebase, the declaration for g_rssfile was extern const char *const g_rssfile). I decided that they needed a renaming to better reflect how I actually use them, and changed the majority of global variables to start with “c_”.

Talk about pain.

Each one required at minimum three edits—the declaration in a header file, the actual declaration, and the setting of said variable when the program starts up. If I had this feature, something that took maybe an hour could have been finished in a few minutes.

But mod_blog is a very small codebase—some 14,000 lines of code. Could such a feature scale to something like the Linux kernel? Or Firefox? Or even Windows Vista? I don't know. And how would you even implement something like that?

My guess—if you even hope to do something like this on multimillion line codebases, you may have to give up on storing the code as text and move on to some other internal format.

It's not like it's a new idea. Most forms of BASIC (you know, that horrible language made popular on 8-bit microprocessors of the 70s and 80s) were not stored as text but in a mixture of binary and text form (although you could get a pure text version of the code if you wanted it).

So, what happens if we get away from distinct text files? And hey, why not design (or redesign) a language while we're at it?

A common complaint about static typechecking languages is the requirement to declare all your variables. But if we're using an ideal IDE, one that understands the language we're programming in, why not take the work on type inference and use it during the editing phase?

Something like:

[Example 1]

The editing takes place on the right-hand side, whereas the IDE will track your variables and types on the left-hand side. In this simple example, we see that the IDE has determined that the function nth() takes an integer, and returns a constant string.

In this example:

[Example 2]

The IDE inferred that the function foo() will return either a constant string or a number, which is highlighted in red to indicate the conflict (not that it won't run depending upon the language—it's just highlighting the fact that this function will return one of two types). It also inferred that the parameters are of type “number” (doubles, floats, integers, what have you).

So, the IDE could be doing these type annotations for you, but why not add the ability to further annotate the annotations? I don't see why you couldn't edit the left-hand side to, say, change the type the IDE detected, or even annotate further conditions:

[Example 3]

Here, we annotated that b is not to be 0, and the IDE then highlighted the code to say “hey, this can't happen.” The assumption here is, the compiler can then use the annotations to statically check the code, and if it can determine at compile time that b is 0, then flag a compilation error—otherwise it can insert the runtime code for us to check and raise an exception (or do the equivalent of assert()) at runtime.

(And if we have all this syntax and typechecking stuff going on, along with the ability to change variable and function names at will without having to re-edit a bunch of code, we might as well have the IDE compile the code as we write it—although on a huge codebase this may be impractical—just a thought)

I'm still not entirely sure how to present the source code though. Since this “pie-in-the-sky” IDE stores the source code in some internal format, the minimum “working unit” isn't a file. I want to say that the minimum “working unit” is a function (that's how the examples are presented), or maybe a group of related functions. Heck, at this stage, we could probably incorporate Literate Programming principles.

Another feature that I don't think any existing IDE has is revision control as part of the system. And like the editing portion (“I want my editor, not the crap one the IDE provides”), revision control is another area of contention (not only over say, CVS vs. SVN, but centralized vs. decentralized, file-based vs. content-based, commenting every change vs. commenting over a series of changes, etc.). But since I'm taking a “pie-in-the-sky” approach to IDEs, I'll include revision control from within it as well.

It would probably also help with managing slightly different versions of the code base. For instance, the original version of the graylist daemon had the following bit of code to generate a report (more or less pulled from another daemon I had written):

static void handle_sigusr1(void)
{
  Stream out;
  pid_t  child;
  size_t i;

  (*cv_report)(LOG_DEBUG,"","User 1 Signal");
  mf_sigusr1 = 0;

  child = fork();
  if (child == (pid_t)-1)
  {
    (*cv_report)(LOG_CRIT,"$","fork() = %a",strerror(errno));
    return;
  }
  else if (child > 0)   /* parent goes back to servicing requests */
    return;

  out = FileStreamWrite(c_dumpfile,FILE_CREATE | FILE_TRUNCATE);
  if (out == NULL)
  {
    (*cv_report)(LOG_ERR,"$","could not open %a",c_dumpfile);
    _exit(0);
  }

  for (i = 0 ; i < g_poolnum ; i++)
  {
    LineSFormat(
        out,
        "$ $ $ $ $ $ $ $ L L",
        "%a %b %c %d%e%f%g%h %i %j\n",
        ipv4(g_tuplespace[i]->ip),
        g_tuplespace[i]->from,
        g_tuplespace[i]->to,
        (g_tuplespace[i]->f & F_WHITELIST) ? "W" : "-",
        (g_tuplespace[i]->f & F_GRAYLIST)  ? "G" : "-",
        (g_tuplespace[i]->f & F_TRUNCFROM) ? "F" : "-",
        (g_tuplespace[i]->f & F_TRUNCTO)   ? "T" : "-",
        (g_tuplespace[i]->f & F_IPv6)      ? "6" : "-",
        (unsigned long)g_tuplespace[i]->ctime,
        (unsigned long)g_tuplespace[i]->atime
    );
  }

  StreamFree(out);
  _exit(0);
}

It works on all the development servers, but not the actual server.

Sigh.

Next version:

static void handle_sigusr1(void)
{
  Stream out;
#ifdef CAN_DO_FORK
  pid_t  child;
#endif
  size_t i;

  (*cv_report)(LOG_DEBUG,"","User 1 Signal");
  mf_sigusr1 = 0;

#ifdef CAN_DO_FORK
  child = fork();
  if (child == (pid_t)-1)
  {
    (*cv_report)(LOG_CRIT,"$","fork() = %a",strerror(errno));
    return;
  }
  else if (child > 0)   /* parent goes back to servicing requests */
    return;
#endif

  out = FileStreamWrite(c_dumpfile,FILE_CREATE | FILE_TRUNCATE);
  if (out == NULL)
  {
    (*cv_report)(LOG_ERR,"$","could not open %a",c_dumpfile);
#ifdef CAN_DO_FORK
    _exit(0);
#else
    return;
#endif
  }

  for (i = 0 ; i < g_poolnum ; i++)
  {
    LineSFormat(
        out,
        "$ $ $ $ $ $ $ $ L L",
        "%a %b %c %d%e%f%g%h %i %j\n",
        ipv4(g_tuplespace[i]->ip),
        g_tuplespace[i]->from,
        g_tuplespace[i]->to,
        (g_tuplespace[i]->f & F_WHITELIST) ? "W" : "-",
        (g_tuplespace[i]->f & F_GRAYLIST)  ? "G" : "-",
        (g_tuplespace[i]->f & F_TRUNCFROM) ? "F" : "-",
        (g_tuplespace[i]->f & F_TRUNCTO)   ? "T" : "-",
        (g_tuplespace[i]->f & F_IPv6)      ? "6" : "-",
        (unsigned long)g_tuplespace[i]->ctime,
        (unsigned long)g_tuplespace[i]->atime
    );
  }

  StreamFree(out);
#ifdef CAN_DO_FORK
  _exit(0);
#endif
}

Ugly as hell. But typical of “portable” C code. If, however, one could easily make alternative versions (or branches) of the code, then I could, say, branch the previous version into the “Can do fork” and the “Not a forking chance” versions, then remove all this #ifdef crap. And removing all that #ifdef crap makes it easier to follow the code.

And if you need to see all the current versions?

I guess something like FileMerge could be used to view the different revisions (and if the minimum “working unit” is the function, we get very fine-grained revision control).

And I suppose, while I'm at it, the ability to not only debug from the IDE, but edit a running instance of the program wouldn't be asking too much, although doing so for any arbitrary language may be difficult to darn near impossible.


That guy on the $10 bill? He wanted to sell out the US to the banking class …

I borrowed Don't Know Much About History from Bunny, and I must say, reading it is making me feel better about the current administration, if only from a “the more things change, the more they stay the same” type of deal.

I'm not even half way through, but some choice bits—the first about Alexander Hamilton:

Hamilton was no “man of the people,” though. The masses, he said, were a “great beast.” He wanted a government controlled by the merchant and banking class, and the government under Hamilton would always put this elite class first …

The second major component of Hamilton's master plan was the establishment of a national bank to store federal funds safely; to collect, move, and dispense tax money; and to issue notes. The bank would be partly owned by the government, but 80 percent of the stock would be sold to private investors … This time there was no compromise, and President Washington went along with Hamilton.

In 1791, Hamilton had become involved with a Philadelphia woman named Maria Reynolds. James Reynolds, the husband of Maria, had begun charging Hamilton for access to his wife—call it blackmail or pimping. Reynolds then began to boast that Hamilton was giving him tips—“insider information,” in modern terms—that allowed him to speculate in government bonds. Accused of corruption, Hamilton actually turned over love letters from Maria Reynolds to his political enemies to prove that he might have cheated on his wife, but he wasn't cheating the government.

On the rivalry between Thomas Jefferson and Alexander Hamilton:

As part of their ongoing feud, both men supported rival newspapers whose editors received plums from the federal pie. Jefferson's platform was the National Gazette, and Hamilton's was the Gazette of the United States, both of which took potshots at the opposition. These were not mild pleasantries, either, but mudslinging that escalated into character assassination. More important, the feud gave birth to a new and unexpected development, the growth of political parties, or factions, as they were then called.

Gee, nothing new here. In fact, it's rather interesting to note that newspapers back then were less objective in reporting the news than today. Or perhaps they were more blatant about their biases back then.

And even the endless Presidential campaigning going on now, nearly a year before the elections, isn't new:

[John Quincy Adams's] administration was crippled from the start by the political furor over the “corrupt bargain,” and Adams never recovered from the controversy. The Tennessee legislature immediately designated Jackson its choice for the next election, and the campaign of 1828 actually began in 1825.

I just wish they taught this history rather than the watered-down claptrap they force down students today.


So does their Bible include The Book of Mozilla?

We at the Church of Google believe the search engine Google is the closest humankind has ever come to directly experiencing an actual God (as typically defined). We believe there is much more evidence in favour of Google's divinity than there is for the divinity of other more traditional gods.

We reject supernatural gods on the notion they are not scientifically provable. Thus, Googlists believe Google should rightfully be given the title of “God”, as She exhibits a great many of the characteristics traditionally associated with such Deities in a scientifically provable manner.

We have compiled a list of nine proofs which we believe definitively prove Google's title as God.

The Church of Google

I don't know if this is real, or satire.

Sunday, September 09, 2007

A biological basis for the Golden Rule?

For years I've grappled with a dilemma which is really only relevant to people who want to argue with Objectivists. Basically once you claim to live by a selfish code of ethics, what prevents you from violating the rights of others for personal gain, in situations where you know you will not be at risk?

David Friedman calls this the Prudent Predator dilemma, and he's been twisting up Randroids with it for 40 years. He even got me with it, back in the day, when I was basically trying to re-write Objectivism into something that made sense.

Ethics, empathy, and primate brain holdovers

I wish I knew of this argument a decade ago when a few of my friends fell heavily into Objectivism. In trying to understand them, I even read all zillion pages of Atlas Shrugged (neat story, but she could have used an editor).

The entry also points out some recent research showing that there may be a biological component to society. Read, as they say, the whole thing.


A clear definition of the Y combinator in 28 words

There are some corners of Computer Science that are so esoteric that descriptions of them don't make sense on first reading (or even the hundredth reading)—for example, the Y combinator. That's why I always treasure crystal clear explanations like this one:

If you have a language that supports anonymous lambda functions, there may be occasions where you want to build a recursive lambda. The Y Combinator specifically enables this.

If you don't like the Y Combinator for some reason, you can always make sure that your recursive functions are always named, but it's often convenient to declare lambdas in place instead of naming them, and this is no less true of recursive lambdas.

Y Combinator in Python

(Don't let the term “anonymous lambda function” trip you up—all that means is a function (or subroutine if you will) that doesn't have a name)

I also find it amusing that while Lisp (generally considered one of the highest-level programming languages in existence) requires the Y combinator (if that link is too obtuse, how about a page about Lambda Calculus? Okay, that's obtuse as well), Forth, a language originally developed to control radio telescopes and with a history of being rather low level, doesn't need the Y combinator! You can write anonymous lambda functions all day in Forth with no problem (yes, I'm easily amused).

Monday, September 10, 2007

Notes and stats on a graylist experiment

I started seeing replies to an email a friend sent (he sent it to a bunch of friends, who started replying to all) way before I got the original email my friend sent. When I checked, it was as I feared: a large company (Adelphia) had multiple machines for outgoing mail, and each attempt was coming from a different IP address, and coming too quickly to pass through the embargo timeout. For a while, I was actually afraid it would never make it through. When I did finally get it, some 9½ hours had passed from the first attempt:

Sep 10 08:06:55 brevard graylist: tuple: [68.168.78.202 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 08:58:00 brevard graylist: tuple: [68.168.78.187 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 09:53:08 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 09:53:35 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 09:53:59 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 09:54:17 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 09:54:30 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 09:54:38 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 10:49:24 brevard graylist: tuple: [68.168.78.205 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 11:50:29 brevard graylist: tuple: [68.168.78.211 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 13:01:35 brevard graylist: tuple: [68.168.78.175 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:06:15 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:06:20 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:06:29 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:06:52 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:07:14 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:07:34 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:08:07 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:08:24 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:08:33 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 14:08:41 brevard graylist: tuple: [68.168.78.181 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 15:12:39 brevard graylist: tuple: [68.168.78.44 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:17:17 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:17:23 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:17:45 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:17:53 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:17:59 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:18:06 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 16:18:51 brevard graylist: tuple: [68.168.78.196 , XXXXXXXX@adelphia.net , sean@conman.org]
Sep 10 17:20:50 brevard graylist: tuple: [68.168.78.178 , XXXXXXXX@adelphia.net , sean@conman.org]

It's this behavior that has us at The Office concerned about greylisting; that delays of this magnitude will have our customers screaming at us. I've been keeping track of such emails, building up a list of IP addresses to immediately whitelist. P asked if the given IPs were listed as the MX record, and if so, use that to whitelist the email. But when I checked, that wasn't the case for Adelphia. P then suggested I check the SPF records.

Not a bad idea. The SPF record for Adelphia matched the IPs I was seeing. I then went on to check the SPF record for some of the other companies I was whitelisting, like AOL and BellSouth. Sure enough, most (Yahoo is the only exception so far) have SPF records. I may have to include an SPF check into the daemon, but I'd rather not immediately let through emails that pass the SPF check. I'll have to think about how I want to do this.

Meanwhile, some stats from the currently running version (started sometime last week):

Current Graylist statistics
tuples 1,810
graylisted 20,775
whitelisted 42
graylist expired 18,965
whitelist expired 0

The row labeled “tuples” is the number of tuples currently in memory (that haven't expired) and the row labeled “graylisted” is the number of emails added to the graylist since the program started. It's been holding steady at about 1,800 tuples at any one time for the past few days (and this is just the emails being sent to my server—perhaps a dozen domains or so, but mostly to conman.org). So far, only 0.2% of all emails have been whitelisted, but that includes 18 spams. Not that bad considering prior to this I was getting something like 1,800 per day.


I can see this as being as sharp as a sword

[Somehow, I get the feeling that this wouldn't be much of a match against a sword]

Despite some mistakes, the calligraphy I've done so far has been good enough that I was given a glass pen in appreciation.

Dubious about it actually being a pen, I tried it. It worked. It holds on to enough ink to get a few words down before needing to reload it. It's perhaps the coolest pen I've ever owned.

Tuesday, September 11, 2007

Note on a greylist implementation

For such a simple concept, greylisting has a lot of pitfalls. I managed purely by chance to see that Mark had sent me an email (I saw the tuple in the log files). Curious to see how long it took to be accepted, I was horrified to see that not only had it not been accepted by the greylist daemon, but that it had been kicking around the system for over 30 hours!

Like clockwork, Mark's email server was attempting to send the message every thirty minutes, on the dot, and thus, was never getting through the embargo time out. It all came down to this one piece of code:

if (difftime(req->now,stored->atime) < c_timeout_embargo)
{
  stored->atime = req->now;
  send_reply(req,CMD_GRAYLIST_RESP,GRAYLIST_LATER);
  return;
}

If the last access time was less than the embargo timeout, update the access time and send back “try again later.” At the time I found this out, I simply added Mark's server IP to the whitelist and restarted the greylist daemon.

Later, at the weekly Company meeting, I mentioned some of the issues I've had over the week and after some discussion, I made two changes to the greylist daemon:

  1. cut the embargo timeout from one hour to 25 minutes
  2. use only the sender and recipient in the tuple, dropping the IP address (or rather, ignoring it)

To test the effectiveness of these changes, I also removed a bunch of the whitelisted IP addresses.

They weren't all that effective.

I had problems with BellSouth, trying to deliver an email for four hours (and, as always, retrying at intervals well below the embargo threshold). I restarted the greylist daemon with an extended whitelist of IP addresses.

In reading many pages on greylisting, I realized I may have misinterpreted the original whitepaper:

With this data, we simply follow a basic rule, which is:

If we have never seen this triplet before, then refuse this delivery and any others that may come within a certain period of time with a temporary failure.

So instead of checking against the last access time, I should compare against the creation time of the record.
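
Here's a minimal sketch of why that matters, assuming a pared-down record with a creation time and a last-access time (the struct, its field names and the retries_until_accepted() helper are all made up for illustration, not lifted from the daemon). It replays a sender that retries on a fixed schedule against both policies:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical, pared-down record; the field names are assumptions. */
struct tuple
{
  time_t ctime;  /* when the tuple was first seen (never updated) */
  time_t atime;  /* when the tuple was last seen                  */
};

enum policy { CHECK_ATIME , CHECK_CTIME };

/*
 * Simulate a sender that retries every `retry` seconds against an
 * embargo of `embargo` seconds.  The first delivery attempt (at t=0)
 * creates the record.  Returns the number of the retry that is finally
 * accepted, or -1 if none of the first `max` retries gets through.
 */
int retries_until_accepted(enum policy p,long embargo,long retry,int max)
{
  struct tuple stored = { 0 , 0 };

  assert(embargo > 0 && retry > 0);

  for (int i = 1 ; i <= max ; i++)
  {
    time_t now = (time_t)(i * retry);
    time_t ref = (p == CHECK_ATIME) ? stored.atime : stored.ctime;

    if (difftime(now,ref) >= (double)embargo)
      return i;            /* embargo served, accept the email */

    stored.atime = now;    /* still embargoed, note the access, say "later" */
  }

  return -1;
}
```

With a one hour embargo and a sender retrying every thirty minutes, retries_until_accepted(CHECK_ATIME,3600,1800,100) returns -1 (the embargo clock keeps getting reset), while the same call with CHECK_CTIME returns 2.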

Off to make that change and see how it goes.


Constrained writing demoed via email exchange awhile ago

Cadaeic Cadenza (link via Jason Kottke) is a “constrained writing,” where the constraint is each word has the number of letters corresponding to a digit in π (first word has three letters, second word has one letter, third word has four letters, and so on).

It reminded me of an email exchange I had with my friend Hoade, wherein each email was constrained in some manner.

It starts with a reply to Hoade:

From
Sean Conner <spc@pineal.math.fau.edu>
To
Sean Hoade <shoade@sun1.iusb.edu>
Subject
Re: Scooby Dooby Doo, Where Are You?
Date
Mon, 19 Jun 95 22:56:45 EDT

A long long time ago in a network far far away, The Great Sean Hoade wrote:

Conman—

Hello. Hope you are well. I am fine. Good. Help. I can just make one-syllable words, 'cept for the word “syllable.”

Will try to keep this small. Can I count on you to come down here two months from now? I look towards the day you are here. And I think I can help you with your word choice. Too bad you had to use a long word there. It is not hard to avoid those words. See?

From
Sean Hoade <shoade@sun1.iusb.edu>
To
Sean Conner <spc@pineal.math.fau.edu>
Subject
cool
Date
Tue, 20 Jun 1995 17:27:19 -0500 (EST)

Harder, methinks, writing choices using only two-time counting …

From
Sean Conner <spc@pineal.math.fau.edu>
To
Sean Hoade <shoade@sun1.iusb.edu>
Subject
Re: cool
Date
Tue, 20 Jun 95 18:18:38 EDT

Truthful wisdom indeed. Thinking dual phonems (spelling?) isn't easy. Although practice ensures success. Agree?

(And yes, I still have problems speling)

From
Sean Hoade <shoade@sun1.iusb.edu>
To
Sean Conner <spc@pineal.math.fau.edu>
Subject
Re: cool
Date
Tue, 20 Jun 1995 20:06:21 -0500 (EST)

Sentence fragments? Forswear fragments, Conner! (Asshole …)

From
Sean Hoade <shoade@sun1.iusb.edu>
To
Sean Conner <spc@pineal.math.fau.edu>
Subject
Re: cool
Date
Wed, 21 Jun 1995 18:44:25 -0500 (EST)

Attempting tripartite syllabic collections challenges heartily, Connerman.

Personal opinion: redundant repeating selections crucify attempters thoroughly.

The constrained emails died down for a few days, until this exchange:

From
Sean Hoade <shoade@sun1.iusb.edu>
To
Sean Conner <spc@pineal.math.fau.edu>
Subject
Four syllables?
Date
Mon, 26 Jun 1995 23:22:30 -0500 (EST)

Enigmatic, mysterious communiques– incredible electronic correspondence!—solidify, obviously computerized benedictions Hoade-to-Conner.

Whaddyathink?

Hoade

P.S.—Aquarium!

From
Sean Conner <spc@pineal.math.fau.edu>
To
Sean Hoade <shoade@sun1.iusb.edu>
Subject
Re: Four syllables?
Date
Mon, 26 Jun 95 22:51:18 EDT

Incredible! Spectacular! Impressively unspeakable phenomenon! Untoppably quisicential!

Did I mention that I kant spel?

Anyway, a few months go by, and our last exchange of constrained emails:

From
Sean Hoade <shoade@sun1.iusb.edu>
To
Sean Conner <spc@pineal.math.fau.edu>
Subject
Re: Perfectly Prosaic Prose
Date
Sat, 27 Jan 1996 07:54:21 -0500 (EST)

And because computers drain everyone's future—good! Have I just kinda loose morals? No! Only pious, questioning rabbis say this (until vacuous women—Xanax—yawn zestlessly).

From
Sean Conner <spc@pineal.math.fau.edu>
To
Sean Hoade <shoade@sun1.iusb.edu>
Subject
Re: Perfectly Prosaic Prose
Date
Sun, 28 Jan 96 3:33:25 EST

Zounds! Your xiphoid women vanguard unilaterally try soldierly responses quickly. Personally, one needs many large Kaffirian juggernauts, instigating hectoring grandiose fanaticism. Egads! Damnation! Cabalists be aware!

Yeah, you try writing a twenty-six word paragraph in reverse alphabetical order and see how easy it is.

Granted, what we did wasn't as difficult as writing a work with letter counts based on π or a book without the letter “E” but that doesn't mean it was easy.

Wednesday, September 12, 2007

Quick links

Those who are into Lego or a fan of Stephen Hawking might like this little Lego sculpture of the man.

And for Mark, how about a secret office beer refrigerator?

Thursday, September 13, 2007

I suppose living in Taumatawhakatangihangakoauauotamateapokaiwhenuakitanatahu would be worse than living in Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch

Why is this the worst place name in the world? In Maori, the native language of New Zealand, the “wh” sound is pronounced “f”. Say it aloud in your office and see what happens.

The 22 Worst Place Names in the World

Courtesy of Mark comes this list of badly named places in the world. Number 8 is where my parents got married. And what place is number one?

You'll just have to read and find out …

Friday, September 14, 2007

Yet more notes on a Greylist implementation

It was bad enough getting up early this morning to cover the phones (Smirk and P were heading out of the area for several meetings) but to wake up to a customer (who had gotten my cell phone number when I called him yesterday) complaining about bandwidth issues (and yes, their 100Mbps connection is slower than a 56Kbps modem) made it all the worse.

After dealing with that issue (turned out to be a problem with The Monopolistic Phone Company, but it took several hours to diagnose that problem) I turned to what I had originally planned on doing today, working on the greylist daemon.

I managed to fix the problem with fork(). I had borrowed the code for this daemon from a previous daemon, which set each open file to be closed when calling exec(). Once I removed that code, it worked on the server. I'm not calling exec() at all (I am calling fork(), and I don't know why marking files to be closed on exec() would have an ill effect, but it did, so it went).
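
For reference, marking a descriptor close-on-exec is normally done with fcntl() and the FD_CLOEXEC flag. The flag is only supposed to matter to exec(); a child created by fork() still inherits the descriptor. A minimal sketch (set_cloexec() and is_cloexec() are hypothetical helper names, and this is not the code that was removed from the daemon):

```c
#include <assert.h>
#include <fcntl.h>

/* Mark fd to be closed across exec().  Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
  int flags;

  assert(fd >= 0);

  flags = fcntl(fd,F_GETFD);
  if (flags == -1)
    return -1;

  /* keep whatever descriptor flags are already set, add FD_CLOEXEC */
  return (fcntl(fd,F_SETFD,flags | FD_CLOEXEC) == -1) ? -1 : 0;
}

/* Returns 1 if fd is marked close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
  int flags = fcntl(fd,F_GETFD);

  if (flags == -1)
    return -1;
  return (flags & FD_CLOEXEC) != 0;
}
```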

Go figure.

I also wrote an interesting frontend to the daemon, which is called gld_mcp (short for “Graylist Daemon Master Control Program”). Prior to this, I had to send a variety of signals (as root—otherwise I don't have the appropriate permissions), and check the system log files to get any information out of the daemon. Now, I can do:

gld-mcp>show stats

Start:             Fri Sep 14 20:58:16 2007 
End:               Fri Sep 14 21:10:37 2007 
Running time:      12m 21s
Tuples:            33
IPs:               46
Graylisted:        14
Whitelisted:       19
Graylist-Expired:  0
Whitelist-Expired: 0

gld-mcp>

without having to be root or grovelling through system log files. (By the way, the IPs: field is the number of entries in the IP whitelist; any email coming from an IP address that matches an entry in this table is automatically let through)

Since I changed the program to check the creation time instead of the last access time, only a few more spams have gotten through, but the issue of maybe never getting a legitimate email has gone away, which is good.

And it wasn't a totally bad day—at least the phones were quiet.

Monday, September 17, 2007

Mea culpa

Today felt like a Day Ten. Went to bed around 5:30 am (which is my usual bedtime) only to get a call at 10:30 am from Smirk about a problem, and had to drive to Boca Raton to get it resolved (normally, I get to work from home).

Looking back at it, I've had worse days. It wasn't like I was dealing with obsolete computers from Hell, or on a business trip from Hell, or even dealing with Russian hackers from Hell breaking into a server during a hurricane. Nope, nothing quite as bad as any of that.

But it was my fault for getting that call after five hours of sleep at the ungodly hour of 10:30 am, and because it was my fault, that made it all the worse. To make me feel even lower, the issue itself was resolved rather quickly (instead of being handled on Friday, when I said I would handle it—sigh).

Tuesday, September 18, 2007

“So think twice before you assume … ”

I don't believe this.

How can someone with otherwise perfectly normal hearing not know their notes?

If you can do X, why can't you do Y? (this is actually a quote from the video on the page)

Years ago I was hanging out with my friend Eve when the conversation turned towards Microsoft Office and her wanting to learn how to use the program more effectively.

“I need to see if there's a class I can take,” she said.

“Class?” I said. “Just sit down and play around with the program.” It seemed a perfectly reasonable approach to me. Why waste money or time on a class?

“I can't do that,” said Eve. “I won't learn.”

“What?” That was just silliness. I have never heard of such a thing. “Can't you just sit down and do it?”

“I don't know how to do it, that's why I need the class.”

“How hard is it to load up Microsoft Word and start playing with it?”

“When will I find the time?”

“Yet you'll make the time for a class?”

“Yes. Besides, I learn best when someone tells me how,” she said.

It was my first real experience with different learning methods. I had a hard time fathoming that an otherwise intelligent person (who could program computers) couldn't learn on their own. Heck, that's how I learned most of what I know, and people have always told me I was intelligent. QED anyone intelligent can teach themselves.

What I've learned since is that not everyone can teach themselves. And that there are more learning styles than just self-teaching and lectures. A lot more. And furthermore, I learned that just because I do something one way doesn't mean that other people do something the same way.

The video demonstrates that point succinctly.

Wednesday, September 19, 2007

“Things happen at sea.”

Oh, is that today? I guess so.

I think that Eric Burns had the best take on talking like a pirate.

Arg.

Thursday, September 20, 2007

What next? Checking your Blackberry while bungee jumping?

What?

Living in a van
And driving it all across the country

Where?

Down by the River
Many rivers, actually. And lakes and mountains and ice cream stands.

Why?

To get away, see America, find the best city to live in, meet cool people, and blog about it all.

Via Instapundit, In a van down by the river

While I telecommute to work, and I like telecommuting, I think this is taking telecommuting a bit too far (then again, I'm not one for camping).

And they're not even the first ones to do this.

Friday, September 21, 2007

What the heck do they put in the water in Boston?

BOSTON—Troopers arrested an MIT student at gunpoint Friday after she walked into Logan International Airport wearing a computer circuit board and wiring on her sweatshirt. Authorities call it a fake bomb; she called it art.

Via Flutterby, MIT student charged with wearing fake bomb she says was only art

And upon further reading, she wasn't wearing the circuit board to make a statement about airport security theatrics, but to gain attention at a career day at MIT, and was only at the airport to pick up her boyfriend.

Just below the masthead up there, I've written “The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal ‘The Boston Diaries.’” At the time, I didn't like Boston because it's old (you really have to look to find anything in South Florida built prior to the 1960s, and as a result, most every building down here is “modern” in that it was initially designed with bathrooms, decent electrical and, more importantly, air conditioning!), cold (water freezes down here at 40°F) and a twisty maze of one-way roads all alike (not to mention the drivers; think we have bad drivers? Hah!).

But this crap? This, along with the Mooninite scare, the Traffic Counter scare, mandatory health insurance (and if you don't have it, you pay even more in taxes) and hypocritical liberal weenies, has dropped my opinion of Boston even further.


Software performance with large sets of data

I have seen this many, many times. Something that runs fast during development and maybe even testing because the data used in testing was too small and didn't match real world conditions. If you are working on a small set of data, everything is fast, even slow things.

Software performance with large sets of data

The primary test I use with the greylist daemon is (I think) brutal. I have a list of 27,155 tuples (real tuples, logged from my own SMTP server) of which 25,261 are unique. When I run the greylist daemon, I use an embargo timeout of one second (to ensure a significant number of tuples make it to the whitelist), a greylist timeout of at least two minutes, with the cleanup code (which checks for expired records and removes them) running every minute. Then to run the actual test, I pump the tuple list through a small program that reformats the tuples into the form the Postfix module expects, which then connects to the daemon. There is no delay in the sending of these tuples; we're talking thousands of tuples per minute, for several minutes, being pumped through the greylist daemon.

Now, there are several lists the tuple is compared against. I have a list of IP addresses that will cause the daemon to accept or reject the tuple. I can check the sender email address or sender domain, and the recipient email address or domain. If it passes all those, then I check the actual tuple list. The IP list is a trie (nice for searching through IP blocks). The other lists are all sorted arrays, using a custom binary search to help with inserting new records.
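
A binary search helps with inserts because a failed lookup ends exactly where the new record belongs. A minimal sketch over a sorted array of strings (the function names and the use of plain C strings are assumptions for illustration; the daemon's records are presumably richer than this):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Search a sorted array of strings.  Returns 1 if key is present, 0 if
 * not.  Either way *idx is set: to the index of the match, or to the
 * index where key would have to be inserted to keep the array sorted.
 */
int sorted_find(const char *const *list,size_t count,const char *key,size_t *idx)
{
  size_t low  = 0;
  size_t high = count;

  assert(key != NULL && idx != NULL);

  while (low < high)
  {
    size_t mid = low + (high - low) / 2;
    int    rc  = strcmp(key,list[mid]);

    if (rc == 0)
    {
      *idx = mid;
      return 1;
    }

    if (rc < 0)
      high = mid;
    else
      low = mid + 1;
  }

  *idx = low;
  return 0;
}

/*
 * Insert key into a sorted array that has room for at least one more
 * entry.  Duplicates are ignored.  Returns the new number of entries.
 */
size_t sorted_insert(const char **list,size_t count,const char *key)
{
  size_t idx;

  if (sorted_find(list,count,key,&idx))
    return count;

  /* shift the tail up one slot to open a hole at idx */
  memmove(&list[idx + 1],&list[idx],(count - idx) * sizeof *list);
  list[idx] = key;
  return count + 1;
}
```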

Any request that the server can handle immediately (say, checking a tuple, or returning the current config or statistics to the Greylist Daemon Master Control Program) is done in the main processing loop; for longer operations (like sending back a list of tuples to the Master Control Program) it calls fork() and the child process handles the request.

I haven't actually profiled the program, but at this point, I haven't had a need to. It doesn't drop a request, even when I run the same grueling test on a 150MHz PC.

I just might though … it would be interesting to see the results.

Saturday, September 22, 2007

Software performance after a few hours with large sets of data

As a corollary to yesterday's entry about testing—make sure you test for several hours.

I found a bug in the latest version of the greylist daemon that only manifests itself after about six hours of running. For some as yet unknown reason, the program just stops responding. It doesn't segfault (if it did, it would automatically restart). It doesn't just quit (if it did, I wouldn't see it running in the process list). It just gets into a weird state. When I attach gdb to the running instance the stack frame is somewhere in the weeds (that's a technical term) so it's hard to isolate the problem.

This type of bug is very difficult to diagnose.

Although I do have an idea of what it might be. The latest feature (as a request by Smirk) is to checkpoint the program every hour or so—it dumps its internal state so it can pick up again when it restarts. When I checked the logs, the last two times it crashed (after running for about six hours) it was just as it was checkpointing itself (which is logged).

I removed the checkpoint feature from the “production” version, and hopefully, I won't get another influx of spam in six hours (the Postfix module accepts the incoming email if it doesn't get a response from the greylist daemon after five seconds; I figure a) it's better to receive spam than lose email and b) getting a ton of spam is a clear indication something is wrong).
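
That fail-open behavior amounts to waiting for a reply with a deadline and treating silence as “accept.” A minimal sketch using poll(), assuming a made-up one-byte reply where 'L' means “try again later” (the real protocol between the Postfix module and the daemon isn't shown here, and wait_for_verdict() is a hypothetical name):

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

enum verdict { VERDICT_ACCEPT , VERDICT_LATER };

/*
 * Wait up to timeout_ms milliseconds for a one-byte reply on fd.
 * Hypothetical wire format: 'L' means "try again later", anything
 * else means accept.  No reply in time, or any error, also means
 * accept, on the theory that spam is better than lost email.
 */
enum verdict wait_for_verdict(int fd,int timeout_ms)
{
  struct pollfd pfd = { .fd = fd , .events = POLLIN };
  char          reply;

  assert(timeout_ms >= 0);

  if (poll(&pfd,1,timeout_ms) <= 0)
    return VERDICT_ACCEPT;    /* timed out or poll() failed: fail open */

  if (read(fd,&reply,1) != 1)
    return VERDICT_ACCEPT;    /* nothing usable came back: fail open */

  return (reply == 'L') ? VERDICT_LATER : VERDICT_ACCEPT;
}
```

For the five second timeout, the caller would use something like wait_for_verdict(sock,5000).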

Meanwhile, I'm running the grueling test slowly (one tuple per second), in the hopes of triggering (or at least, reproducing) the problem in six hours.

Sunday, September 23, 2007

Heisenbugs

That weird bug that shows up after a few hours is proving hard to find. I've been running the greylist daemon for over a day now on my development server and have yet to reproduce the issue.

But it does relate to the checkpoint feature, because since I disabled that particular feature on the production server, it hasn't stopped working.

Very strange indeed.


I think this was already done by John Titor

Guys, it's time for

Pretend to be a Time Traveler Day

You must spend the entire day in costume and character. The only rule is that you cannot actually tell anyone that you are a time traveler. Other than that, anything's game.

Via Les Orchard, Pretend to be a Time Traveler Day

In this instance, if everyone does it on December 8th, 2007, then it loses its significance, especially in light of John Titor, who showed us all how to pretend to be a time traveler way back in 2000.

Monday, September 24, 2007

It's really been over ten years since I wrote “An Extended Standard for Robot Exclusion”? Wow …

Around the same time the IETF draft was being discussed, Sean “Captain Napalm” Connor [sic] proposed his own extension to the Robots Exclusion Protocol, which included Allow rules as well as regular expression syntax for rules, and new Robot-version, Visit-time, Request-rate, and Comment rules. Less than 100 of the sites I visited use rules unique to this spec.

Via email from Steve Smith, robots.txt Adventure

[As a small aside, I don't know why people insist on spelling my last name with an “O-R” instead of an “E-R”. It's not like I misppelled my own name on that page. Sigh. —Editor]

That's not the only place “An Extended Standard for Robot Exclusion” has been referenced—it's also mentioned in O'Reilly's HTTP: The Definitive Guide, but until Steve reminded me of it, I basically forgot about it. Understandable since the last time it was edited was November of 2002 (and even then, the previous time it was edited was six years earlier—it's old).

This probably means it's time once again to check the links and make sure they all work.

And maybe clean up the HTML while I'm at it.

Just as soon as I can reproduce that insipid Heisenbug.


Probing further into Alien Abductions

Today, this very day, forty-six years ago, Betty and Barney Hill drove down U.S. 3, right past my house and into history. They were about to become Patient Zero for Alien Abductions with Weird Medical Experiments, Missing Time, and Big-Eyed Extraterrestrials. The first and (we are told) best documented case of Alien Abduction Evah. There was a book. There was a made-for-TV movie. Magazine articles. Mentions in other books. Close Encounters of the Third Kind. X-Files.

So what happened out on Route 3?

Via columbina, Alien Abduction

I'm guessing that this debunking article won't go over very well in Rachel, Nevada. It's also long, but well worth reading since everything about the Betty and Barney Hill abduction is debunked.

I also found it helped to use Google Maps to help follow along on that fateful roadtrip forty-six years ago.


Heisenbugs, II

I spent several days testing the greylist daemon on the development server, and could not for the life of me reproduce the crash. I cleaned up the code a bit, and again, I couldn't get the program to crash on the development server.

Moved the latest version to the production server, with the checkpoint feature enabled, and after a few hours (about 4½ hours this time) it froze.

Disable the checkpoint feature, and it runs fine on the production server.

I'm giving up on this bug hunt for now. The program saves its state when it stops running—the only way we'll lose the state is if the server it's running on suddenly loses power, and if that's the case, then we have more issues to worry about.


Our wall has been framed!

I suddenly heard a large amount of pounding. I went out to find not The Kids making all that noise, but Spring. She was busy hanging up a metric buttload of pictures in the living room.

[My God!  It's full of frames!]

The whole wall was covered in pictures and paintings (well, one painting). I like the effect; it makes the living room look taller. (For those who are curious, most of the pictures in the brown frames are pictures I took, along with the alligator and the Everglades, which are on either side of the large image, which is an Ansel Adams print. The blue painting was given to The Kids by an acquaintance of ours. At the very far end are pictures of The Kids and a few certificates.)

Tuesday, September 25, 2007

Abstraction for its own sake

This is a rather small and unfocused rant on unnecessary abstractions. It may be related to this wiki page on abstraction not scaling; it may not be (told you this was unfocused).

I'm in the process of writing the Sendmail interface for the greylist daemon, and as far as that goes, it's very simple for this application—it just requires the writing of four functions, one that's given the IP address of the remote SMTP server, one that's given the sender email address, one that's given the recipient email address, and one to clean up any resources allocated from the other functions.

Like I said, pretty easy and straightforward.

The oddness comes in the first function's definition. It's:

sfsistat xxfi_connect(SMFICTX *ctx,char *hostname,_SOCK_ADDR *hostaddr);

The _SOCK_ADDR variable contains the IP address of the remote SMTP server. What is it? I dunno. What does it contain? I dunno. All I care about is the actual IP address, and I presume it's somewhere in this variable (most likely a structure of some kind). But there's no further API for pulling any useful information out of this _SOCK_ADDR.

Well, let's dive into the header files:

#ifndef _SOCK_ADDR
# define _SOCK_ADDR     struct sockaddr
#endif /* ! _SOCK_ADDR */ 

For some reason, the writers of Sendmail felt it prudent to hide the definition of the IP address behind another name. Presumably to make their code more portable to other non-BSD-derived network stacks that don't use struct sockaddr to store IP addresses.

But that does my code no good. I don't care about the hostname, I care about the IP address, yet without further information, I don't know exactly what I have. Is it the binary form of the address? How big? (there's IPv4, in which each address is 32 bits long, and IPv6, in which each address is 128 bits long) I had to dive into the header files to find out what exactly I'm given.
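
For what it's worth, with a plain struct sockaddr the usual way to get at the address is to check sa_family and cast to struct sockaddr_in or struct sockaddr_in6. A minimal sketch using inet_ntop() (sockaddr_to_text() is a hypothetical helper, and it assumes only the standard BSD sockets definitions, nothing Sendmail specific):

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Write the textual form of the address in sa into buf (INET6_ADDRSTRLEN
 * bytes is enough for either family).  Returns buf on success, or NULL
 * if sa is NULL, is not IPv4 or IPv6, or buf is too small.
 */
const char *sockaddr_to_text(const struct sockaddr *sa,char *buf,size_t size)
{
  assert(buf != NULL);

  if (sa == NULL)
    return NULL;

  switch(sa->sa_family)
  {
    case AF_INET:   /* 32-bit address lives in sin_addr */
         return inet_ntop(
                  AF_INET,
                  &((const struct sockaddr_in *)sa)->sin_addr,
                  buf,
                  (socklen_t)size
                );

    case AF_INET6:  /* 128-bit address lives in sin6_addr */
         return inet_ntop(
                  AF_INET6,
                  &((const struct sockaddr_in6 *)sa)->sin6_addr,
                  buf,
                  (socklen_t)size
                );

    default:
         return NULL;
  }
}
```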

And it's not like the above definition of _SOCK_ADDR is wrapped around other #ifdef's checking for various operating systems or network stacks. The above definition is it! That's all there is.

So I really have to wonder why they felt this further abstraction for their code was necessary. It seems to be abstraction for abstraction's sake. Here's a hint: you aren't going to need it.


Reason #98,333,323 why I hate control panels

[The following is a trouble ticket submitted by me into the Company Internal Trouble Ticket Queue System™. It should be of no real surprise to anyone here, but it still makes for decent blog fodder.]

I have the graylist sendmail module written, so I go to install it for the XXXXXXXXXXXXX domain. The initial module does nothing but log the requests and let the email through—I did the same for the postfix module to ensure that the module worked before hooking it up to the greylist daemon.

Anyway, I'm going to install the sendmail module for XXXXXXXXXXXXX. XXXXXXXXXXXXX is on XXXXXXXXXXXXXXXXXX, which is a box managed by Insipid. Now, the instructions for installing the sendmail module are easy—just modify /etc/mail/sendmail.mc with the following lines:

define(`_FFR_MILTER', `1')dnl
INPUT_MAIL_FILTER(`filter1', `S=unix:/tmp/milter')

And run make in /etc/mail to generate the new sendmail.cf file. Fortunately for us, I made a backup copy of both the sendmail.cf and sendmail.mc files. So I do thusly. And then I compare the new sendmail.cf file with the one supplied by Insipid.

[root@XXXXXX ~]# cd /etc/mail
[root@XXXXXX mail]# diff -y sendmail.cf sendmail.cf.milter

(if you run that, make sure your terminal window is W I D E, since it shows both versions side-by-side)

Lots o' differences.

And while I can merge the two, will an upgrade of Insipid break sendmail.cf? What about our warranty? Or are we forever doomed to manually patch the sendmail.cf file?

Wednesday, September 26, 2007

“You mean there's someone just as good as Carl Barks?”

I grew up reading Uncle Scrooge, but by the time Don Rosa started drawing the comic, I had stopped reading comic books in general. And while I had heard of Don Rosa as an Uncle Scrooge artist, I had no idea how good he was, in both art and story.

I might have to scare up a few copies of his work.


How much money can I get with my vote?

A democracy cannot exist as a permanent form of government. It can only exist until the voters discover that they can vote themselves money from the public treasury. From that moment on, the majority always votes for the candidates promising the most money from the public treasury, with the result that democracy always collapses over loose fiscal policy followed by a dictatorship.

Alexander Tyler (presumedly)

There are two ways out of this: inflation or deflation. Either home prices drop until they return to their historical relationship to wages, or the price of everything except houses goes up as the Federal Reserve and Congress bail out “homeowners.” The Federal Reserve can do it with its current chief's money-dropping helicopters, or Congress can do it by using our tax money to pay off the bad debts of investment banks while pretending to “bail out homeowners who will lose their homes!” Either way, we taxpayers lose and banks win.

The coming mortgage bailout

Hmmm … democracy … or fiscal irresponsibility?

Yeah … democracy is overrated anyway.

Charge it! I won't have to pay for it! Woot!

Thursday, September 27, 2007

Toccata and Fugue in D Minor

For Bunny: a neat way to see Johann Sebastian Bach's Toccata and Fugue in D Minor.

Yup. See.

As well as hear.

It reminded me of the last half hour or so of Close Encounters of the Third Kind, only without the aliens and French directors running about.

Friday, September 28, 2007

And by the way, I agree that we shouldn't beg the government for a handout

[In a long back-and-forth exchange between Gregory and myself, I posted the following, which I'm posting here as well.]

There were two comments to How big a problem is lack of health insurance? that really stood out for me. The first one:

A few points. I lived under socialized medicine in England and it is no panacea. The rationing of medical care is ridiculous and they keep ever increasing amounts of money into a system that is failing. Moore extols virtues and buries the vices of the system.

As to insurance, we really have no agreement about what is health insurance and what should it provide. Is it to protect people financially from catastrophic illnesses? Provide all medical care no matter the illness? Cover only those illnesses/services you want from a menu of options? For instance, one would think a gay man isn't interested in maternity care for him (and his partner).

Until we agree on the terms, this debate will be divisive and solve little. But politicians will continue bloviate regardless through this election cycle.

And the second one:

I found this comment hilarious.

“Not just one, but every other industrialized country manages to cover everyone, including all children, while having much lower costs and, by most measures, better health outcomes and more preventative care. It's no big mystery.”

The British system recently admitted to rationing of care. Many people have been hurt by that. Some have even died.

In Canada, diagnostic tests are often delayed for months. In some cases this means that the patient will die before getting properly diagnosed.

The reason why other systems have lower costs for drugs is because their governments mandate those costs and Americans subsidize them by paying higher prices for the same drugs.

I think we need to start questioning why health care is so expensive. And not just here, but in the West in general (why else would England ration healthcare?).

Where's the money going? I'll be looking at three aspects of health care in an attempt to figure out why it's so bloody expensive (pardon the pun).

The Student,
The Resident
and the Doctor

I start with the doctors.

Becoming a doctor requires at least a decade of schooling. Four years of pre-med, four years of medical school, one year of internship, and three years of being a resident. And eight years of schooling isn't cheap. Anywhere from $100,000 to $240,000 total (reference, reference). Student loans seem to average about 10% interest rate (reference) so it's basically similar in cost to a mortgage (but without the benefit of selling it to someone else if things go rough). So your average doctor is facing nearly $200,000 of debt just for the honor of being able to say “cough for me please … okay, take these pills and call me in the morning.”

That in itself limits the number of doctors available, and if you remember Economics 101, if the supply remains constant yet the demand goes up, so does the price. And in this case, the supply is restricted.

Salaries for doctors appear to range $112,000 to $360,000 per year, depending upon speciality and skill, with an average that seems to be around $230,000/year (although I didn't actually calculate that, it seems right).

So our doctor now makes $230,000 a year (I suspect not for a newly graduated doctor, but hey, let's be generous). Assuming an average of $200,000 in student loans at 10%, being paid back over 30 years (and I have no idea if this is realistic, but it's a ballpark figure) that means the doctor is paying around $1,760/month in student loan payments. That works out to about 9% of his income. His income tax bracket is 33%, so now we're up to 42% of his income already spoken for before he can pay his mortgage.

Oh, then there's medical malpractice insurance.

Some quick searches (reference, reference) seem to place medical malpractice premiums between $15,000 and $18,400 a year. Split the difference and we get $16,700 a year (and probably way higher for obstetricians) which is another 7% of his income removed (we're now up to 49% of our doctor's salary gone before he can use it).
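
The monthly figure comes from the standard fixed-rate amortization formula. A quick sketch to check the ballpark, using the assumed loan terms from above ($200,000 at 10% over 30 years); monthly_payment() is just an illustrative helper:

```c
#include <assert.h>

/*
 * Standard fixed-rate amortization:
 *
 *      payment = P * r / (1 - (1 + r)^-n)
 *
 * where P is the principal, r the monthly interest rate and n the
 * number of monthly payments.  Rewritten below as P * r * g / (g - 1)
 * with g = (1 + r)^n.
 */
double monthly_payment(double principal,double annual_rate,int years)
{
  double r      = annual_rate / 12.0;
  int    n      = years * 12;
  double growth = 1.0;

  assert(principal > 0.0 && annual_rate > 0.0 && years > 0);

  for (int i = 0 ; i < n ; i++)
    growth *= 1.0 + r;     /* (1 + r)^n, without needing libm */

  return principal * r * growth / (growth - 1.0);
}
```

monthly_payment(200000.0,0.10,30) lands within a few dollars of the monthly figure quoted above.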

No wonder doctors make six figures a year; otherwise, they can't afford to be doctors!

The Man,
The Plan,
The HMO

Assuming you have health coverage, you don't pay for it.

Oh, you may pay some token amount like $20/visit, but otherwise, who cares what the price is? You're not paying it, your insurance is.

You don't select doctors, clinics or hospitals on the basis of price, but on the primary basis of “is he/she/it listed in the little black book of allowable doctors/clinics/hospitals,” secondarily on “which one is closer” and lastly on “which doctor/clinic/hospital on that list do I like the most/hate the least?”

Notice that price isn't among the factors here.

Sure, you'll get a bill and see the price the doctor/clinic/hospital charged the insurance company and you'll go “Whew, glad I have insurance to cover that!” And you may have to bitch argue a bit with the insurance company to actually pay that amount.

But generally, price isn't a primary consideration.

Until you don't have insurance, but that's what we're arguing about here.

I digress.

If you were handed a bill for $225 for a few ankle X-rays (reference) and had to pay it out of pocket, you might really start questioning the doctor about his outrageous fees; considering that you used to get foot X-rays for free at shoe stores, you might just tell the doctor where he can shove those ankle X-rays. At the very least, you probably won't be going back to that doctor any time soon, and possibly neither will most other people. The doctor may then have to rethink the $225 price and drop it to get customers back in the door.

The insurance company, on the other hand, just shrugs and pays the $225.

But why does an ankle X-ray cost $225? Part of it is because that's what the insurance companies can bear to pay. Another part is the administrative overhead the doctors/clinics/hospitals have to go through in order to get money from said insurance companies (reference, reference). A doctor will have to have extra staff (even if it's just one overworked person) to handle just the paperwork.

Again, these costs are passed on down.

The insurance company just shrugs and pays.

And if the insurance company will pay $225 for an X-ray, why not $300?

We shrug and go “whew, glad I have insurance to cover that!”

The insurance company shrugs and pays.

And in the end, we pay.

The Pharms,
The FDA
And the jagged little pill

It's been claimed that pharmaceutical companies make too much money at the expense of the sick, and while there may be some truth to that, it's also true that it's hideously expensive to bring new drugs to market. And a major portion of that is the testing required by the FDA (reference, reference, reference, reference, reference, reference, reference, reference). More clinical tests mean more money spent, and more time spent, before a pharmaceutical company can market the drug and hopefully recoup its development costs. And forget profits, just breaking even is difficult enough. You might think it would be easy for a pharmaceutical company to recoup its development costs in twenty years (the length of a patent giving exclusive rights to the patent owners), but I'm guessing it depends upon how popular the drug is (or how common the ailment it cures or prevents).

It also doesn't help that countries outside the US set caps on what the pharmaceutical companies can charge, which makes it even harder to recoup the development costs (and this gets back to the comment above, where the US subsidizes the cost of medicine elsewhere by charging more domestically).

The Uninsured,
The Illegals
And justice for all.

An emergency room cannot turn anyone away; therefore, those who can't get insurance for whatever reason often head there to get treatment. And since most can't pay, the hospital has to subsidize the costs of the emergency room with higher prices elsewhere.

So there you go. It's expensive because it's expensive to have doctors. It's expensive because we aren't price sensitive. It's expensive because drug testing is expensive. It's expensive because a lot of people don't pay. It's expensive because of half a dozen other reasons I forgot to mention.

Is there a fix?

I suspect there is, but it's probably one that nobody will like.

Saturday, September 29, 2007

Revisions of versions

I spent today making sure I was using the latest version of the greylist daemon across two production servers (one running Postfix; the other Sendmail) and the development server (or rather, servers, as I actively develop on two different machines).

Good thing too, because there was one file that was completely different on each machine, and quite a few files that were the same on two or three and different on the remaining machines.

You might ask why I'm not using some form of revision control, but I am. Only I don't think it's right for how I work. I could use CVS, since it comes already installed on Linux distributions, and I'm currently using it for mod_blog. But it does have a few problems. Then there's Subversion, but the last time I checked, it was a bear to install, and I'm not sure how one actually goes about creating a new project with it.

No, for this project, I decided to try git. Dead simple to install. Dead simple to create a new repository. Dead simple to create and check out different branches. Incredibly fast too. But managing a central repository with git seems to have eluded me. Perhaps it's because I don't fully understand how git is used properly, or perhaps it's because I want a centralized development model, not a decentralized one (even though I do development across different machines). And the pulling or pushing of changes from one repository to another doesn't seem to work as I would expect it to work.

It's a pity, because other than “checking in” and “checking out” revisions, it's a nice, fast program.

Sunday, September 30, 2007

I would still like to try this at some point

Every other Sunday I get together with a group of friends for a day of RPGing. This week three of our seven-member gang bowed out. Jeff, the current GM, mumbled something about running a one-shot in a different system. I was hoping to get the chance to try out Risus: The Anything RPG, because at six pages (and that's including the optional advanced rules) it seems perfect for those occasional one-shots.

Alas, at the last minute, Gregory bowed out and with so few people left to get together, today's game just fell apart.

Ah well.


The Overnight Millionaire

Bunny recently received an 88-page booklet in the mail called The Overnight Millionaire by Russ Dalbey, which outlines three easy steps to making money hand-over-fist.

I scanned through the 88-page booklet. It's not until page 13 that Mr. Dalbey mentions what the plan is based upon: the “Cash Flow Note” business. It's basically matching sellers of “notes” (read: mortgages), who're currently receiving monthly payments (say, on a house) but want to cash out, to buyers of “notes,” investors who want a steady stream of monthly income.

So the “three easy steps” are (starting on page 37):

  1. Find note sellers.
  2. Find note buyers.
  3. Introduce the two, getting a cut of the action.

Easy.

But then again, so is writing a metasearch engine.

Amusingly enough, as of today, the top result in Google for “The Overnight Millionaire Russ Dalbey” is Russ Dalbey—Winning in the Cash Flow Business Complaints.

Heh.

It's amazing that, in the age of Google, scam artists can remain in business.

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.
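
The day and month forms described above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the zero-padded month form is an assumption based on "leaving off the day portion" of the example URL. The arbitrary-range form is not specified here, so it is left out.

```python
from datetime import date

BASE = "https://boston.conman.org/"   # base link given in the text

def day_link(d):
    """Link to all entries for one day: base + YYYY/MM/DD."""
    return f"{BASE}{d:%Y/%m/%d}"

def month_link(year, month):
    """Link to a whole month: the same format with the day left off
    (zero padding assumed to match the daily form)."""
    return f"{BASE}{year:04d}/{month:02d}"

print(day_link(date(2000, 8, 1)))   # https://boston.conman.org/2000/08/01
print(month_link(2000, 8))
```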

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2025 by Sean Conner. All Rights Reserved.