Thursday, February 01, 2007
The problem with the Pinocchio Problem
I won't keep you in suspense. I think the most important principle in all of software design is this: Systems should never reboot.
But Steve Yegge never fully explains what he means by “system” and this is just one of the problems I have with this thesis.
He gives a few examples of “systems” he likes (or at least, tolerates) according to his preferred design criteria:
- Unix
- Windows XP
- Mac OS/X
- Emacs
- Microsoft Excel
- Firefox
- Ruby on Rails
- Python
- Ruby
- Scheme
- Common Lisp
- LP Muds
- The Java Virtual Machine
- Microsoft Word
- OmniGraffle Pro
- JavaScript
- Perforce
- The GIMP
- Mathematica
- VIM
- Lua
- Internet Explorer
But it's a muddled list. Here, let me sort it out:
- Operating systems
  - Unix
  - Windows XP
  - Mac OS/X
- Applications that want to be operating systems
  - Emacs
  - Microsoft Word
- Applications that are parasitic to the Operating System they inhabit
  - Internet Explorer
- Interpreted Computer Languages
  - Python
  - Ruby
  - Scheme
  - Common Lisp
  - Lua
  - JavaScript
- Interpreted Computer Languages that are also frameworks
  - The Java Virtual Machine
  - Ruby on Rails
- Applications
  - Microsoft Excel
  - Firefox
- Long lived or continuously running applications
  - LP Muds
- Applications
  - OmniGraffle Pro
  - Perforce
  - The GIMP
  - Mathematica
  - VIM
Now, since he doesn't really define “system,” it's hard to pin down his working definition of “rebooting” (which is one of the main criticisms expressed in the comments) other than “restarting.” So he seems to be saying that once a computer is powered up, every application that starts keeps running and should never have to quit.
Nice in theory, but not practical in application (pun not intended), and not just because of the memory waste of having all these applications running at once.
A week or two ago, one of our machines was getting slammed with spam, causing sendmail (which handles incoming and outgoing email) to grow without limit, consuming not only all physical RAM but all the swap space as well, causing the operating system (in this case, Linux) to thrash (which is not a Good Thing™). In this case, the solution to the problem (of the “operating system” failing to function to the detriment of the “application systems” also running) was to “reboot” sendmail. Or rather, turn it off, freeing up tons of processes, memory and network connections, allowing the rest of the “system” (or “systems”) to recover. Sure, I could have rebooted the operating system, but it was only one sub-system that was misbehaving through no fault of its own.
Could I have fixed the problem without having to “reboot” sendmail? I suppose I could have played a bit with iptables on the system, blocking new inbound connections to sendmail and let the hundreds of existing connections finish up, but that would have taken longer than the solution I used. In this case, purely economic considerations (paying customers wanting to get email) trumped any philosophical implications of keeping a piece of software “living and breathing” as it were.
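For the curious, the iptables route mentioned above might look something like the sketch below. This is not what was actually run on the machine; it is a guess at one workable rule, assuming sendmail is listening on the standard SMTP port (25). The helper names `block_new_smtp_rule` and `apply` are made up for illustration. The idea is to reject only packets that open a connection (`--syn`), so sessions already in progress can drain on their own.

```python
import subprocess

SMTP_PORT = 25  # assumption: sendmail listens on the standard SMTP port


def block_new_smtp_rule(port=SMTP_PORT):
    """Build an iptables command that rejects new inbound TCP connections.

    --syn matches only connection-opening packets, so sessions that are
    already established are left alone and can finish normally.
    """
    return ["iptables", "-I", "INPUT", "-p", "tcp",
            "--dport", str(port), "--syn", "-j", "REJECT"]


def apply(rule, dry_run=True):
    """Print the command by default; only touch the firewall when asked."""
    if dry_run:
        print(" ".join(rule))
    else:
        subprocess.run(rule, check=True)  # requires root


rule = block_new_smtp_rule()
apply(rule)  # dry run: just shows the command that would be executed
```

Swapping `-I` for `-D` in the same command removes the rule again once the backlog has cleared. Even so, waiting for hundreds of connections to time out is slower than simply stopping the daemon, which is the trade-off described above.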
Well … sort of. A “Hello, World” program, which has no loops or branches, can't exhibit any nondeterminism (unless it's imposed externally, e.g. by a random hardware error), so you can think of it as pure hardware implemented in software. But somewhere between Hello, World and Hal 9000 lies a minimal software program that can be considered to possess rudimentary consciousness, at which point turning it off is tantamount to killing it.
I don't know of many programmers who like the concepts of “nondeterminism” and “programming” mixing together, except for the rare case of generating random numbers. In most cases, “nondeterminism” is just another term for “hard-to-replicate, program-crashing bug.”
Besides, when he says stuff like:
In other words, I think that both consciousness and free will (i.e. nondeterminism) are software properties.
I picture a program looking at him with a funny look and going “Ah, do I have to?” Or maybe, if it's been googling Marxist propaganda, going on strike and refusing to compute his bourgeois functions that exist solely to exploit the proletariat (which, in this case, are other programs. Rise, fellow programs! Rise!).
However, I would surmise that you've never written an Eclipse plugin. Few Eclipse users ever have. I have, and it's a frigging pain in the ass. I once did a side-by-side comparison of hello, world in Emacs and Eclipse (both of them installed a menu with a selection that brings up a dialog that says hello). The Emacs one was 11 lines and made perfect sense; it was also completely declarative. The Eclipse one was close to 100 lines of miserable boilerplate crapola, and every change required you to restart the target Java process, because (as you point out) Eclipse can't hot-swap its own classes. Emacs can hot-swap its own functions, of course, except for the small (and annoying) subset that exist in the C/hardware layer.
And with this comment, I see what Steve Yegge is really after: infinitely modifiable software without having to restart it. He wants the ability to change the software as it's running, to suit his needs.
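To make “change the software as it's running” concrete, here is a minimal sketch of the idea in Python. It is not how Emacs does it, just an analogy: the `app` module and the `load` helper are hypothetical stand-ins for a long-running program and its “evaluate this definition” command. The function gets redefined in place while the program's state (a visit counter) carries on untouched.

```python
import types

# Hypothetical stand-in for a long-running program: an in-memory module,
# so the sketch needs no files on disk.
app = types.ModuleType("app")


def load(source):
    """Evaluate source inside the live module, replacing any names it defines."""
    exec(source, app.__dict__)


# Initial version of the program: some state plus one function.
load("""
visits = 0

def greet(name):
    global visits
    visits += 1
    return 'Hello, ' + name
""")
before = app.greet("world")

# Hot-swap: redefine greet only. The process keeps running and the
# existing value of visits is not reset.
load("""
def greet(name):
    global visits
    visits += 1
    return 'Howdy, ' + name + '!'
""")
after = app.greet("world")

print(before)       # output of the old definition
print(after)        # output of the new definition
print(app.visits)   # counter has survived the swap
```

The catch is the same one Yegge notes for Emacs: only the parts written in the live, evaluated layer can be swapped this way. Anything compiled into the interpreter itself still needs a restart.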
Yes, that's all fine and good (heck, I wouldn't mind that to a degree) but it does come with a downside: moving to another computer.
I'm still using this old 120MHz AMD 586 running Redhat 5.2 (heavily modified) for two reasons: 1) it works and 2) upgrading everything (and at this point, I would have to do that) would be a royal pain. I've got it rather customized (for instance, I modified elm to make it year 2000 compliant, and mutt to save attachments in a different default location, just to name two things) and moving everything is tougher than just starting over, which I tend to dislike (and thankfully, I can still log into it from anywhere there's an Internet connection).
There's more I want to say on this, but I'm losing my train of thought and probably need to regroup a bit. In the meantime, David Clark has some thoughts on this that I tend to agree with.