Saturday, February 17, 2007
Old papers
I enjoy reading old software manuals. Part of that is to marvel at what was done back when CPUs were slow, memory was small, and disks were the size of washing machines (or later, shoe boxes). It's also amazing to see what was attempted that may have failed at the time due to marketing or machine limitations (for instance, Cornerstone's separation of user names from internal names was innovative, but it probably didn't hit real mainstream use until the early 2000s in IDEs as refactoring code became popular).
So I have with me The Bell System Technical Journal, Volume 57, Number 6, Part 2 (July–August 1978), which is all about the Unix Time Sharing System™ (which I chanced upon about ten years ago at Booksmart, a used book store in Boca Raton, Florida—at the time it was off 20th just west of FAU but since has moved to Dixie and Spanish River) and I'm reading “A Retrospective” by Dennis M. Ritchie and man, is it quote-worthy.
A pipe is, in effect, an open file connecting two processes; information written into one end of the pipe may be read from the other end, with synchronization, scheduling, and buffering handled automatically by the system. A linear array of processes (a “pipeline”) thus becomes a set of coroutines simultaneously processing an I/O stream.
So I'm not the first to notice that coroutines look a lot like Unix pipes. Also interesting is this bit further down:
The shell syntax for pipelines forces them to be linear, although the operating system permits processes to be connected by pipes in a general graph. There are several reasons for this restriction. The most important is the lack of a notation as perspicuous as that of the simple, linear pipeline; also, processes connected in a general graph can become deadlocked as a result of the finite amount of buffering in each pipe. Finally, although an acceptable (if complicated) notation has been proposed that creates only deadlock-free graphs, the need has never been felt keenly enough to impel anyone to implement it.
Really? Back in college, I implemented a Unix shell and noticed
that it would be rather trivial (except somewhat limited by a decent
syntactical notation) to implement not only bi-directional pipes, but
creating pipes for stderr
as well as stdout
. It
never crossed my mind that deadlocks were possible. I would also be curious
to see the notation that was proposed, since it could very be applicable to
new languages with builtin concurrency.
Of course, there're always the very amusing bits:
Both input and output of UNIX programs tend to be very terse. This can be disconcerting, especially to the beginner. The editor [
ed
most likely; a line oriented editor —Editor], for example, has essentially only one diagnostic, namely “?”, which means “you have done something wrong.” Once one knows the editor, the error or difficulty is usually obvious, and the terseness is appreciated after a period of acclimation, but certainly people can be confused at first. However, even if some fuller diagnostics might be appreciated on occasion, there is much noise that we are happy to be rid of.
I have to wonder if this is the genesis of the following story:
Ken Thompson has an automobile which he helped design. Unlike most automobiles, it has neither speedometer, nor gas gage, nor any of the numerous idiot lights which plague the modern driver. Rather, if the driver makes any mistake, a giant “?” lights up in the center of the dashboard. “The experienced driver,” he says, “will usually know what's wrong.”
which is funny, since the paper I'm quoting was written solely by Dennis Ritchie. Anyway, Mark and Wlofie will appreciate this bit:
Two events—running out of swap space, and an unrecoverable I/O error during swapping—cause the system to crash “voluntarily,” that is, not as a result of a bug causing a fault. It turns out to be rather inconvenient to arrange a more graceful exit for a process that cannot be swapped. Occurrence of swap-space exhaustion can be made arbitrarily rare by providing enough space, and the current system refuses to create a new process unless there is enough room for it to grow to maximum size. UNrecoverable I/O errors in swapping are usually a signal that the hardware is badly impaired, so in neither of these cases do we feel strongly motivated to alleviate the theoretical problems.
… It must be admitted, however, that the system is not very tolerant of malfunctioning hardware, nor does it produce particularly informative diagnostics when trouble occurs.
It appears that the lack of hardware stability in Linux has a long pedigree, leading back to the original implementation of Unix (another amusing bit—“the typical period between software crashes is well over a fortnight of continuous operation.”—heh).
The entire paper is currently online for your perusal (and it's very interesting to note that the book I have, The Bell System Technical Journal, Volume 57, Number 6, Part 2 is no longer available from anyone, I guess making the copy I have a collectable. Hmmmm … )