Thursday, May 18, 2000

Connan the Domain

I want to test my network monitor on a rather busy server colocated at Atlantic Internet. There's interesting traffic there and I want to make sure I'm properly decoding the traffic to monitor it.

So I try to transfer the monitor from my home box via FTP. Only I get a connection refused. “What the?” I think. Then I look carefully at where I'm trying to transfer the file from: Connan.

Hmmmmmm …

Check out the web page. Don't bother—it's a domain name auction site if you can believe it. Only it's not other people auctioning off domains they own—it's the auction company itself auctioning off the domains. And it looks like they've wildcarded the domain. Both “www” and “linus.slab” come up with the same web page.
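For the curious, here is a rough sketch of how one might check for a wildcarded domain without clicking around by hand: ask for a few random hostnames nobody would have set up on purpose and see if they all resolve. The function name `looks_wildcarded` and the `example.test` domain are made up for illustration, and a resolver that rewrites failed lookups would give a false positive, so treat a "yes" as a hint rather than proof.

```python
import socket
import uuid


def looks_wildcarded(domain, resolve=socket.gethostbyname, probes=3):
    """Guess whether `domain` answers for any hostname at all.

    Resolves a few random labels that nobody would have created on
    purpose. If every one of them resolves, the zone is probably
    wildcarded. `resolve` is injectable so this can be tried offline.
    """
    for _ in range(probes):
        label = uuid.uuid4().hex  # 32 random hex characters
        try:
            resolve(f"{label}.{domain}")
        except OSError:
            # socket.gaierror is a subclass of OSError: lookup failed,
            # so at least one made-up name does not exist.
            return False
    return True


if __name__ == "__main__":
    # "example.test" is a placeholder; substitute the domain in question.
    print(looks_wildcarded("example.test"))
```

Passing in a fake `resolve` function makes it possible to try the logic without touching the network.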

I wonder if I bid for “” and have a friend bid for “” who would win? Or would we both win our respective “domains?”

“Take my ticket. Please … ”

I checked snail mail again today. Yes, I don't check it often, something like a few times a month, when I remember.

Anyway, I got spammed!

Well of course everybody gets junk snail mail, but I just love it when the DMV releases information about speeding tickets. Since I last checked my mail, I received seventeen (that's 17) postcards and brochures from ticket clinics, lawyers, traffic schools and one comedy club traffic school (I kid you not).


It was 20 years ago today that Helen blew her top.

This is the 20th anniversary of Mount St. Helens' first eruption.

Twenty years. I remember watching this on the news. Man it's scary how time flies.

RAID problems

Hi. My name is Agent Conner. This is my partner, Agent Grosberg. This is our story. (insert Dragnet Theme here)

At 2130 we received the call from John the paper millionaire of a dotcom. He has a problem. A computer problem. A major computer problem and he's calling in the experts. That's us.

It seems that there is a problem with his RAID system (Mark, I need details on the RAID system). Upon investigation it seems that the hardware is fine. It's the software that is a problem. Or rather, the operating system has a problem that leads to a corrupt file system.

Rule 1. Just because you have RAID doesn't mean your data can't get lost or corrupted.

The operating system in question is Microsoft Windows NT 4.0 Service Pack 3. There's a reason he's at Service Pack 3—it works with his RAID system, and that was hard enough to get running. His entire dotcom runs under NT. All his data, his critical data, relies upon Microsoft Windows NT to be stable.

Rule 2. No Fortune 100 Company uses Microsoft Windows NT for financial or critical applications. None.

Corollary 2: Microsoft is a Fortune 100 Company.

From our investigation we were able to ascertain that Microsoft Windows NT has a problem with filesystems that contain over four million files. John the paper millionaire of a dotcom has a filesystem with over four million files. John's data is slowly being corrupted.

Rule 3. See Rule 2.

John the paper millionaire of a dotcom now knows the difficulty of using Microsoft Windows NT for a critical application. But that still doesn't help him.

Any attempt to delete, copy, move or rename a corrupted file fails with a modal dialog box popping up informing the user that the operating system cannot delete, copy, move or rename said file. You have to click “OK” to make it go away.

Rule 4. Any software that requires user intervention can't be used in a server capacity.

The backup program John uses has failed multiple times in the face of said files. Therefore it is proving difficult to get a reliable backup of the four million plus files that John needs to run his business. Microsoft does have a patch available for said bug, but the time frame required to run CHKDSK is unacceptable, possibly taking up to four days to run.

Rule 5. Any backup software that cannot run in the face of errors (even if told to ignore said file and carry on) should not be used in a server capacity.

We did manage to test the GNU tar program under Microsoft Windows NT and it carried on, ignoring the corrupt files. But there doesn't seem to be a way to actually reference the tape backup unit from the command line, and there is not enough free space to back up onto disk. And the number of corrupt files seems to be relatively few, about a hundred.
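The behavior Rule 5 asks for (skip the bad file, note it, carry on) can be sketched in a few lines. This is only an illustration using Python's standard `tarfile` module, not what GNU tar or John's backup program actually does; the names `backup_files` and `all_files` are made up here. It only catches files that fail to stat or open, so a read error halfway through a file could still leave a damaged entry in the archive.

```python
import os
import tarfile


def all_files(root):
    """Yield the path of every file under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def backup_files(paths, archive_path):
    """Archive each path, skipping the ones that raise an OS error.

    Returns a list of (path, error message) pairs for the skipped
    files so someone can look at them afterwards. No dialog boxes.
    """
    skipped = []
    with tarfile.open(archive_path, "w") as tar:
        for path in paths:
            try:
                tar.add(path, recursive=False)
            except OSError as err:
                # Record the failure and carry on with the next file.
                skipped.append((path, str(err)))
    return skipped
```

Something like `backup_files(all_files(root), archive_path)` would cover a whole tree, provided the archive itself lives outside `root`, and the returned list is the short pile of files to deal with by hand.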

But since you can't delete, move, copy or rename the files, it's hard to work around them. Another method would be to put the RAID system into read-only mode and make a backup of it: swap drives in and out of the hot-swappable RAID system to build a backup set of drives with the data on it, set up a separate system with said RAID backup, and go from there. But we have to see what John's bosses say to that (John became a paper millionaire of a dotcom by having his dotcom bought out).

The case is still open …

