The Boston Diaries

Saturday, September 08, 2007

Notes on an ideal integrated development system

I'm not a fan of IDEs. I grew up with the “edit-compile-run” cyle of development, and while I didn't always have a choice in the “compile” portion of things, I did in the “edit” portion, and over time became very picky about which editor I use. Because of that, whenever I did try an IDE, I invariably found the “edit” portion to be very painful, stuck in an editor that I wasn't used to; being forced to use an unfamiliar editor resulted in a vast loss of productivity and thus, I've never liked IDEs. So I stuck with the “edit-compile-run” cycle.

But the recent bout of programming I've done has made me wish for something better than the “edit-compile-run” cycle. And while IDEs have probably evolved since I last tried them in the late 80s, I don't think they've evolved enough to suit me.

What I'm about to describe is defintely “pie-in-the-sky” stuff. I'm not saying that IDEs must be this way—I'm just saying that this is what I would like in an IDE. Who knows? Maybe this won't work. Maybe it's unworkable. But I wouldn't mind seeing these features (at long as the editing could be configured to my liking).

A database I used in the early 80s that ran on a twin floppy PC. Written by Brian Berkowitz and Richard Ilson

Wonderful features were:

It stored user-defined names separately from internal IDs, so you could change the names of tables and fields without worrying.

Fantastic date handling—you could enter “Next Wednesday” and it would work out the date.

No Table/View separation, you could define a field on a table as being calculated on fields from a related table and that definition became part of the original table.

Cornerstone

The one feature of Corner stone that still strikes me as innovative is the separation of variables (or in this case, fields and tables) from their name. One could change the name of a variable without having to edit every other occurrence of that name. That's a very powerful feature, but to implement it in an IDE, that IDE would have to have intimate knowledge of the computer language being used.

A few years ago, I cleaned up the code in mod_blog. I had a bunch of global variables used throughout the codebase, all starting with “g_” (such as g_rssfile) but they weren't variables in the traditional sense, they were more or less “run-time settable constants” (to the rest of the codebase, the declaration for g_rssfile was extern const char *const g_rssfile). I decided that they needed a renaming to better reflect how I actually use them, and changed the majority of global variables to start with “c_”.

Talk about pain.

Each one required at minimum three edits—the declaration in a header file, the actual declaration, and the setting of said variable when the program starts up. If I had this feature, something that took maybe an hour could have been finished in a few minutes.

But mod_blog is a very small codebase—some 14,000 lines of code. Could such a feature scale to something like the Linux kernel? Or Firefox? Or even Windows Vista? I don't know. And how would you even implement something like that?

My guess—if you even hope to do something like this on multimillion line codebases, you may have to give up on storing the code as text and move on to some other internal format.

It's not like it's a new idea. Most forms of BASIC (you know, that horrible langauge made popular on 8-bit microprocessors of the 70s and 80s) were not stored as text but in a mixture of binary and text form (although you could get a pure text version of the code if you wanted it).

So, what happens if we get away from distinct text files? And hey, why not design (or redesign) a language while we're at it?

A common complaint about static typechecking languages is the requirement to declare all your variables. But if we're using an ideal IDE, one that understands the langauge we're programming in, why not take the work on type inference and use it during the editing phase?

Something like:

The editing takes place on the right-hand side, whereas the IDE will track your variables and types on the left-hand side. In this simple example, we see that the IDE has determined that the function nth() takes an integer, and returns a constant string.

In this example:

The IDE inferred that the function foo() will return either a constant string or a number, which is highlighted in red to indicate the conflict (not that it won't run depending upon the language—it's just highlighting the fact that this function will return one of two types). It also inferred that the parameters are of type “number” (doubles, floats, integers, what have you).

So, the IDE could be doing these types annotations for you, but why not the ability to further annotate the annotations? I don't see why you couldn't edit the left-hand side to, say, change the type the IDE detected, or even annotate further conditions:

Here, we annotated that b is not to be 0, and the IDE then highlighted the code to say “hey, this can't happen.” The assumption here is, the compiler can then use the annotations to statically check the code, and if it can determine at compile time that b is 0, then flag a compilation error—otherwise it can insert the runtime code for us to check and raise an exception (or do the equivilent of assert()) at runtime.

(And if we have all this syntax and typechecking stuff going on, along with the ability to change variable and function names at will without having to re-edit a bunch of code, we might as well have the IDE compile the code as we write it—although on a huge codebase this may be impractical—just a thought)

I'm still not entirely sure how to present the source code though. Since this “pie-in-the-sky” IDE stores the source code in some internal format, the minimum “working unit” isn't a file. I want to say that the minimum “working unit” is a function (that's how the examples are presented), or maybe a group of related functions. Heck, at this stage, we could probably incorporate Literate Programming principles.

Another feature that I don't think any existing IDE has is revision control as part of the system. And like the editing portion (“I want my editor, not the crap one the IDE provides”), revision control is another area of contention (not only over say, CVS vs. SVN, but centralized vs. decentralized, file-based vs. content-based, commenting every change vs. commenting over a series of changes, etc.). But since I'm taking a “pie-in-the-sky” approach to IDEs, I'll include revision control from within it as well.

It would probably also help with managing slightly different versions of the code base. For instance, the original version of the graylist daemon had the following bit of code to generate a report (more or less pulled from another daemon I had written):

static void handle_sigusr1(void)
{
  Stream out;
  pid_t  child;
  size_t i;

  (*cv_report)(LOG_DEBUG,"","User 1 Signal");
  mf_sigusr1 = 0;

  child = fork();
  if (child == (pid_t)-1)
  {
    (*cv_report)(LOG_CRIT,"$","fork() = %a",strerror(errno));
    return;
  }

  out = FileStreamWrite(c_dumpfile,FILE_CREATE | FILE_TRUNCATE);
  if (out == NULL)
  {
    (*cv_report)(LOG_ERR,"$","could not open %a",c_dumpfile);
    _exit(0);
  }

  for (i = 0 ; i < g_poolnum ; i++)
  {
    LineSFormat(
        out,
        "$ $ $ $ $ $ $ $ L L",
        "%a %b %c %d%e%f%g%h %i %j\n",
        ipv4(g_tuplespace[i]->ip),
        g_tuplespace[i]->from,
        g_tuplespace[i]->to,
        (g_tuplespace[i]->f & F_WHITELIST) ? "W" : "-",
        (g_tuplespace[i]->f & F_GRAYLIST)  ? "G" : "-",
        (g_tuplespace[i]->f & F_TRUNCFROM) ? "F" : "-",
        (g_tuplespace[i]->f & F_TRUNCTO)   ? "T" : "-",
        (g_tuplespace[i]->f & F_IPv6)      ? "6" : "-",
        (unsigned long)g_tuplespace[i]->ctime,
        (unsigned long)g_tuplespace[i]->atime
    );
  }

  StreamFree(out);
  _exit(0);
}

It works on all the development servers, but not the actual server.

Sigh.

Next version:

static void handle_sigusr1(void)
{
  Stream out;
#ifdef CAN_DO_FORK
  pid_t  child;
#endif
  size_t i;

  (*cv_report)(LOG_DEBUG,"","User 1 Signal");
  mf_sigusr1 = 0;

#ifdef CAN_DO_FORK
  child = fork();
  if (child == (pid_t)-1)
  {
    (*cv_report)(LOG_CRIT,"$","fork() = %a",strerror(errno));
    return;
  }
#endif

  out = FileStreamWrite(c_dumpfile,FILE_CREATE | FILE_TRUNCATE);
  if (out == NULL)
  {
    (*cv_report)(LOG_ERR,"$","could not open %a",c_dumpfile);
#ifdef CAN_DO_FORK
    _exit(0);
#else
    return;
#endif
  }

  for (i = 0 ; i < g_poolnum ; i++)
  {
    LineSFormat(
        out,
        "$ $ $ $ $ $ $ $ L L",
        "%a %b %c %d%e%f%g%h %i %j\n",
        ipv4(g_tuplespace[i]->ip),
        g_tuplespace[i]->from,
        g_tuplespace[i]->to,
        (g_tuplespace[i]->f & F_WHITELIST) ? "W" : "-",
        (g_tuplespace[i]->f & F_GRAYLIST)  ? "G" : "-",
        (g_tuplespace[i]->f & F_TRUNCFROM) ? "F" : "-",
        (g_tuplespace[i]->f & F_TRUNCTO)   ? "T" : "-",
        (g_tuplespace[i]->f & F_IPv6)      ? "6" : "-",
        (unsigned long)g_tuplespace[i]->ctime,
        (unsigned long)g_tuplespace[i]->atime
    );
  }

  StreamFree(out);
#ifdef CAN_DO_FORK
  _exit(0);
#endif
}

Ugly as hell. But typical of “portable” C code. If, however, one could easily make alternative versions (or branches) to the code, then I could, say, branch the previous version into the “Can do fork” and the “Not a forking chance” versions, then all this #ifdef crap. And by removing all that #ifdef crap, it makes it easier to follow the code.

And if you need to see all the current versions?

I guess something like FileMerge could be used to view the different revisions (and if the minimum “working unit” is the function, we get very fine-grained revision control).

And I suppose, while I'm at it, the ability to not only debug from the IDE, but edit a running instance of the program wouldn't be asking too much, although doing so for any arbitrary language may be difficult to darn near impossible.

That guy on the $10 bill? He wanted to sell out the US to the banking class …

I borrowed Don't Know Much About History from Bunny, and I must say, reading it is making me feel better about the current administration, if only from a “the more things change, the more they stay the same” type of deal.

I'm not even half way through, but some choice bits—the first about Alexander Hamilton:

Hamilton was no “man of the people,” though. The masses, he said, were a “great beast.” He wanted a government controlled by the merchant and banking class, and the government under Hamilton would always put this elite class first …

The second major component of Hamilton's master plan was the establishment of a national bank to store federal funds safely; to collect, move, and dispense tax money; and to issue notes. The bank would be partly owned by the government, but 80 percent of the stock would be sold to private investors … This time there was no compromise, and President Washington went along with Hamilton.

…

In 1791, Hamilton had become involved with a Philadelphia woman named Maria Reynolds. James Reynolds, the husband of Maria, had begun charging Hamilton for access to his wife—call it blackmail or pimping. Reynolds then began to boast that Hamilton was giving him tips—“insider information,” in modern terms—that allowed him to speculate in government bonds. Accused of corruption, Hamilton actually turned over love letters from Maria Reynolds to his political enemies to prove that he might have cheated on his wife, but he wasn't cheating the government.

On the rivalry between Thomas Jefferson and Alexander Hamilton:

As part of their ongoing feud, both men supported rival newspapers whose editors received plums from the federal pie. Jefferson's platform was the National Gazette, and Hamilton's was the Gazette of the United States, both of which took potshots at the opposition. These were not mild pleasantries, either, but mudslinging that escalated into character assassination. More important, the feud gave birth to a new and unexpected development, the growth of political parties, or factions, as they were then called.

Gee, nothing new here. In fact, it's rather interesting to note that newspapers back then were less objective in reporting the news than today. Or perhaps they were more blatent about their biases back then.

And even the endless Presidential campaigning going on now, nearly a year before the elections, isn't new:

[John Quincy Adam's] administration was crippled from the start by the political furor over the “corrupt bargin,” and Adams never recovered from the controversy. The Tennessee legislature immediately designated Jackson its choice for the next election, and the campaign of 1828 actually began in 1825.

I just wish they taughtthis history rather than the watered-down claptrap they force down students today.

So does their Bible include The Book of Mozilla?

We at the Church of Google believe the search engine Google is the closest humankind has ever come to directly experiencing an actual God (as typically defined). We believe there is much more evidence in favour of Google's divinity than there is for the divinity of other more traditional gods.

We reject supernatural gods on the notion they are not scientifically provable. Thus, Googlists believe Google should rightfully be given the title of “God”, as She exhibits a great many of the characteristics traditionally associated with such Deities in a scientifically provable manner.

We have compiled a list of nine proofs which we believe definitively prove Google's title as God.

The Church of Google

I don't know if this is real, or satire …