The Boston Diaries

Monday, January 26, 2004

Technobabble technobabble technobabble

It took a while, but I finally finished revamping my homepage. Perhaps the major reason it took so long was that my enthusiasm for dealing with XSLT waxed and waned over the past year. I got most of the way there by January of last year, but the resulting XML files of my site were not that well organized, and the XSLT file was a huge mess I could barely understand a few hours after writing it; it doesn't help that XSLT is quite verbose.

Just how verbose?

Before I can get there, I have to be a bit verbose myself and explain that the XML format I created looks a bit like:

<site directory="/">
  <section directory="writings/">
    <subsection directory="murphy/"> … </subsection>
    <subsection directory="hypertext/"> … </subsection>
    …
  </section>

  <section directory="photos/">
    <subsection directory="top10/"> … </subsection>
    …
  </section>
  …
</site>

The “site” (which is considered a “node”) is composed of several “sections” (again a “node”), each of which is composed of “subsections” (yet another type of “node”). Each node has a “directory” attribute, where the resulting HTML files will reside. There's a bit more (like individual pages) but that's enough to hopefully explain this wonderful bit of XSLT verbosity:

<li>
  <a href="../{preceding-sibling::subsection[@listindex != 'no'][position()=1]/attribute::directory}"
     title="{preceding-sibling::subsection[@listindex != 'no'][position()=1]/child::title}">
    Previous
  </a>
</li>

That's one line of XSLT code there (broken up over several so you won't have to scroll all the way to the right). The nasty bit:

{preceding-sibling::subsection[@listindex != 'no'][position()=1]/attribute::directory}

comes into play when we're processing a template for a <subsection>, and in English (as best as I can translate it) is:

Of the list of subsections that come prior to you in the current section, select those that do not have an attribute of listindex equal to “no” then select the first one in that list, then retrieve the value of the directory attribute.

Because if you don't specify the position(), you get the last one (which in this case would be the first subsection in the section that does not have the listindex attribute set to “no”), not the first node (even though technically it's the last node in the list of preceding nodes, and following-sibling works as expected, which makes a perverse type of sense in a Zen-like way). Got it? Good. Because I barely grok it myself.

What it generates is something like:

<li> <a href="../murphy/" title="Murphy's Law"> Previous </a> </li>

Which is a link (within an HTML list) to the previous subsection.
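For comparison, here is a rough Python sketch of the same selection, using only the standard library. It is not the XSLT from this site: the sample section, the explicit listindex values, and the function name are all made up for illustration. It shows the difference between the nearest visible preceding sibling (what [position()=1] picks) and the first one in document order (what you get without it).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the listindex values are invented for illustration.
SECTION_XML = """
<section directory="writings/">
  <subsection directory="intro/" listindex="yes"/>
  <subsection directory="murphy/" listindex="yes"/>
  <subsection directory="drafts/" listindex="no"/>
  <subsection directory="hypertext/" listindex="yes"/>
</section>
"""

def visible_preceding(section, current):
    """Return the visible subsections before `current`, in document order."""
    siblings = section.findall("subsection")
    before = siblings[:siblings.index(current)]
    # XPath's @listindex != 'no' is also false when the attribute is absent,
    # so a missing attribute is treated as "not selected" here too.
    return [s for s in before if s.get("listindex") not in (None, "no")]

section = ET.fromstring(SECTION_XML)
current = section.findall("subsection")[3]  # the hypertext/ subsection
candidates = visible_preceding(section, current)

nearest = candidates[-1]  # what [position()=1] selects on the reverse axis
farthest = candidates[0]  # what you get without it: first in document order

print(nearest.get("directory"))
print(farthest.get("directory"))
```

The hidden drafts/ subsection is skipped, so the nearest candidate is murphy/ and the farthest is intro/.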

That line comes in the middle of a section of XSLT code that, loosely translated into pseudocode, reads:

when in a subsection
  choose
    when listing nodes in order
      if there exists a following node that is not hidden
        print "... Next ... "
      end-if
      if there exists a preceding node that is not hidden
        print "... Previous ... "
      end-if
    end-when
    ...
  end-choose
end-when

Only not as succinctly (I'm viewing the code in a window 144 characters wide, and each line still wraps around). COBOL is terse compared to XSLT. And imagine writing about a thousand lines like that.
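To make the pseudocode above concrete, here is a minimal Python sketch of the Next/Previous logic. It is an illustration only, not a translation of the actual stylesheet: the sample data, the helper names, and the assumption that “hidden” means listindex="no" are mine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration.
SECTION_XML = """
<section directory="photos/">
  <subsection directory="a/" listindex="yes"/>
  <subsection directory="b/" listindex="no"/>
  <subsection directory="c/" listindex="yes"/>
  <subsection directory="d/" listindex="yes"/>
</section>
"""

def is_listed(node):
    # Mirrors the XPath test @listindex != 'no', which is also false when
    # the attribute is missing, so the sample sets it explicitly.
    return node.get("listindex") not in (None, "no")

def link(node, label):
    """Format one navigation list item pointing at `node`."""
    return '<li> <a href="../{}"> {} </a> </li>'.format(node.get("directory"), label)

def nav_items(section, current):
    """Return the Next and Previous items for `current`, when they exist."""
    nodes = section.findall("subsection")
    i = nodes.index(current)
    following = [n for n in nodes[i + 1:] if is_listed(n)]
    preceding = [n for n in nodes[:i] if is_listed(n)]
    items = []
    if following:
        items.append(link(following[0], "Next"))      # nearest one after
    if preceding:
        items.append(link(preceding[-1], "Previous"))  # nearest one before
    return items

section = ET.fromstring(SECTION_XML)
subsections = section.findall("subsection")
for item in nav_items(section, subsections[2]):  # the c/ subsection
    print(item)
```

For the c/ subsection this prints a Next link to d/ and a Previous link to a/, skipping the hidden b/.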

I did give serious consideration to using something other than XSLT to convert my site from XML to HTML, but the alternatives weren't much better; I could have used Perl and XML::Parser, but then I would have to explicitly crawl the resulting tree for appropriate nodes (the addressing methods in XSLT, while verbose and sometimes inexplicably odd, do make it easy to grab nodes) and write not only the logic for generating the pages, but also code to dump out nodes verbatim. For instance, I have sections like:

<body>
... HTML formatted as XML ... 
</body>

and to avoid having to write endless templates for things like <P> and <BLOCKQUOTE>, in XSLT, I just dump such sections out like:

<xsl:copy-of select="./node()"/>

Which does a literal copy of all the child nodes of the current node. If I were to use Perl, I would have to code this myself (the same consideration applies to any other programming language with an XML parser, really). Kind of six of one, half-dozen the other.
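As a rough idea of what “coding this myself” might look like, here is a small sketch in Python (rather than Perl) using the standard library's ElementTree. The function name and the sample markup are hypothetical, and the output is a re-serialization of the parsed tree rather than a byte-for-byte copy of the source text.

```python
import xml.etree.ElementTree as ET

def copy_children(node):
    """Serialize everything inside `node`, leaving out the node's own tags."""
    parts = [node.text or ""]  # any text before the first child element
    for child in node:
        # tostring() includes the child's trailing text (its "tail").
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

# Hypothetical sample standing in for a <body> section of HTML formatted as XML.
body = ET.fromstring(
    "<body><p>Some <em>marked up</em> text.</p>"
    "<blockquote>A quote.</blockquote></body>"
)
print(copy_children(body))
```

That is short enough here, but it is still code the XSLT version gets for free with a single copy-of.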

And seeing how I already had written a few thousand lines of XSLT (previous versions, revisions, etc., etc.) I decided to stick with what I started and see it through.

But now that I have this massive XSLT file, I don't really have to mess with it anymore. I can now just add content to the XML file that represents my site (I was able to add a photo gallery in about fifteen minutes of work, mostly spent typing the descriptions, without having to worry about adding navigation and images), then regenerate the site.

And speaking of navigation: when I last overhauled my site, it was to add navigation links (thanks to Eve, who convinced me to add them), and about half the XSLT I wrote was to support them (as you can see from the examples above). I have an extensive array of navigation links mostly hidden behind the <LINK> tags; if you have Mozilla, you can see them by enabling the “Site Navigation Bar” (View → Show/Hide → Site Navigation Bar). Quite a bit of work for something of perhaps dubious value? We'll see …

Swapping disks

Tonight Mark and I replaced a bad disk on swift, the colocated server currently serving up our sites. The bad disk is the system disk; the websites themselves (along with some other services we have) all reside on another disk.

There was much discussion before heading over there as to the best way to approach the problem of copying the data off the bad drive. The first method would be to install the new disk into the machine and do a disk-to-disk copy. The downside is that swift is a 1U system with no room for a third drive (no matter how temporary). Also, the unit is designed to run with the cover on, and we were unsure how it would deal with running uncovered. The other option would be a network-based copy, from swift to another machine with the new drive in it. The problem here was speed: even though we could hook the second machine directly to swift (on the secondary Ethernet port) at 100Mbps, it would still take a while to copy over several gigs worth of files. We decided to take a second computer (the Windows box Spring and I share), as we had decided to decide when we got to the colocation facility.
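A back-of-the-envelope estimate of the network option, as a small Python sketch. The 5 GB size and the 50% efficiency figure are hypothetical stand-ins (the actual amount of data was just “several gigs”), and real transfers of many small files off a failing disk would likely be slower than the raw link speed suggests.

```python
def copy_seconds(gigabytes, link_mbps=100.0, efficiency=1.0):
    """Estimate transfer time in seconds over a link of `link_mbps` megabits/s.

    `efficiency` is the assumed fraction of the raw link rate actually achieved.
    """
    bits = gigabytes * 8e9  # 1 GB taken as 10**9 bytes
    return bits / (link_mbps * 1e6 * efficiency)

# Hypothetical sizes, for illustration only.
print(copy_seconds(1))                  # one gigabyte at the full link rate
print(copy_seconds(5))                  # five gigabytes at the full link rate
print(copy_seconds(5, efficiency=0.5))  # five gigabytes at half the link rate
```

At the full 100Mbps that works out to 80 seconds per gigabyte, so the copy itself is measured in minutes at best, and longer once protocol overhead and disk errors get involved.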

When we got there and examined swift, we decided to use the temporary computer and do a network copy. We had some difficulty in getting the Windows box to recognize the new SCSI disk (Mark had some extra SCSI controllers and disks); it was certainly news to me that the BIOS setup was on the hard drive instead of in ROM (much like the very old days of PCs). Once we straightened that out, it was pretty straightforward to boot Gentoo from a live CD, then partition and format the new drive.

Then it was time to copy the files. It took some work to figure out how to run rsync over the rsync protocol, and it still took us two attempts to get everything (the first time, rsync ran without root privileges, which limited the number of files copied). Once that finished (and still on the temporary machine) we recompiled the kernel to support SCSI, then set about making the drive bootable.

The problem here was that Gentoo was a bit too aggressive in identifying hardware, and since the Linux kernel sticks USB storage devices under the SCSI layer, the hard drive ended up with an ID that it wouldn't have in swift. We ended up having to reboot the Gentoo CD, remove the loaded USB drivers, then mount the SCSI drive, then make the drive bootable. Once that was done, the temporary system booted up without a problem.

We then removed the drive and controller, cleaned the area (so we could have room to move about) and spent a few minutes making a game plan of swapping the bad drive for the new one. The physical swap went fairly smoothly. It was reconfiguring the BIOS that proved to be rather difficult. We couldn't get into the BIOS configuration. A search of possible key sequences to get into the BIOS configuration revealed:

DEL
F1
F2
F10
Ctrl-Alt-Esc
Ctrl-Esc
Alt-Esc
INS
Esc
Ctrl-Alt-Ins

We ran down the entire list, and not one worked. Mark then had the brainstorm to hold down the keys as the machine was powered up. The first key he tried, DEL, got us into the BIOS.

Talk about having plenty of time to get into the BIOS configuration.

Once the BIOS was configured with the new drive, it rebooted without a problem.

All told, we spent maybe five hours doing the drive swap, with the websites unavailable for maybe fifteen minutes tops. It was a bit scary at times though, watching the copying go with numerous disk errors. But so far, nothing important seems to have been corrupted, unlike most of the files in Mark's home directory (but he had current backups of that data anyway).
