The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, May 18, 2000

RAID problems

Hi. My name is Agent Conner. This is my partner, Agent Grosberg. This is our story. (insert Dragnet Theme here)

At 2130 we received the call from John the paper millionaire of a dotcom. He has a problem. A computer problem. A major computer problem and he's calling in the experts. That's us.

It seems that there is a problem with his RAID system (Mark, I need details on the RAID system). Upon investigation it seems that the hardware is fine. It's the software that is a problem. Or rather, the operating system has a problem that leads to a corrupt file system.

Rule 1. Just because you have RAID doesn't mean your data can't get lost or corrupted.

The operating system in question is Microsoft Windows NT 4.0 Service Pack 3. There's a reason he's at Service Pack 3—it works with his RAID system, and that was hard enough to get running. His entire dotcom runs under NT. All his data, his critical data, relies upon Microsoft Windows NT to be stable.

Rule 2. No Fortune 100 Company uses Microsoft Windows NT for financial or critical applications. None.

Corollary 2: Microsoft is a Fortune 100 Company.

From our investigation we were able to ascertain that Microsoft Windows NT has a problem with filesystems that contain over four million files. John the paper millionaire of a dotcom has a filesystem with over four million files. John's data is slowly being corrupted.

Rule 3. See Rule 2.

John the paper millionaire of a dotcom now knows the difficulty of using Microsoft Windows NT for a critical application. But that still doesn't help him.

Any attempt to delete, copy, move or rename a corrupted file fails with a modal dialog box popping up informing the user that the operating system cannot delete, copy, move or rename said file. You have to click “OK” to make it go away.

Rule 4. Any software that requires user intervention can't be used in a server capacity.

The backup program John uses has failed multiple times in the face of said files. Therefore it is proving difficult to get a reliable backup of the four million plus files that John needs to run his business. Microsoft does have a patch available for said bug, but the time frame required to run CHKDSK is unacceptable, possibly taking up to four days to run.

Rule 5. Any backup software that cannot run in the face of errors (even if told to ignore said file and carry on) should not be used in a server capacity.

We did manage to test the GNU tar program under Microsoft Windows NT and it carried on, ignoring the corrupt files. But there doesn't seem to be a way to actually reference the tape backup unit from the command line, and there is not enough free space to back up onto disk. And the number of corrupt files seems to be relatively few, about a hundred.
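The carry-on-past-errors behavior that Rule 5 asks for can be sketched in a few lines of Python. This is only an illustration of the idea, not what GNU tar does internally: the function name `backup_tree` and its arguments are made up for this example, and it reads each file fully into memory, which is a simplification that would not suit very large files.

```python
import io
import os
import tarfile


def backup_tree(root, archive_path):
    """Archive every readable file under root; return the files that were skipped.

    Hypothetical helper for illustration. archive_path should live outside root.
    """
    skipped = []
    with tarfile.open(archive_path, "w") as archive:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # Read the whole file first, so a failure cannot leave
                    # a half-written member inside the archive.
                    with open(path, "rb") as source:
                        data = source.read()
                except OSError as err:
                    # Note the bad file and carry on with the rest.
                    skipped.append((path, str(err)))
                    continue
                member = os.path.relpath(path, root).replace(os.sep, "/")
                info = tarfile.TarInfo(member)
                info.size = len(data)
                archive.addfile(info, io.BytesIO(data))
    return skipped
```

The list that comes back is the point: the backup finishes without anyone clicking “OK,” and you still get a record of which files need attention afterwards.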

But since you can't delete, move, copy or rename the files, it's hard to work around them. Another method would be to put the RAID system into read-only mode and make a backup of it: swap drives in and out of the hot-swappable RAID system to build a backup set of drives with the data on it, set up a separate system with said RAID backup, and go from there. But we have to see what John's bosses say to that (John became a paper millionaire of a dotcom by having his dotcom bought out).

The case is still open …

Obligatory Picture

[It's the most wonderful time of the year!]

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site: http://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

http://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.
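For anyone generating these links from a script, the day and month forms described above could be built like this. It's a small sketch: the `permalink` function is hypothetical, not part of the site, and it doesn't attempt the per-entry or arbitrary-range forms, since their format isn't spelled out here.

```python
BASE_URL = "http://boston.conman.org/"


def permalink(year, month, day=None):
    """Build a link to a day's entries, or to a whole month if day is omitted."""
    path = f"{year:04d}/{month:02d}"
    if day is not None:
        path += f"/{day:02d}"
    return BASE_URL + path


print(permalink(2000, 8, 1))  # http://boston.conman.org/2000/08/01
print(permalink(2000, 8))     # http://boston.conman.org/2000/08
```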

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2019 by Sean Conner. All Rights Reserved.