The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, April 14, 2000

“That which does not kill us, hurts like hell!”

Mark and I stopped off at Atlantic Internet after a dinner meeting to find one of the techs still there fiddling with his IOpener. We fiddled around with that, then I showed Mark one of the computers a customer I'm doing work for has.

The IOpener is small. The server I showed Mark was not. This is a large machine, dual Pentium III with one gig of RAM (a gigabyte!) and some 30 gigabytes of RAID-5 storage (small these days, I know).

I'm doing some work for this customer and I had noticed that the 30G of storage wasn't mounted on the server. So, as long as I was there, might as well mount the RAID array. Mark, having a RAID array at home, was on hand to help with the consulting.



megaraid: v107 (December 22, 1999)
megaraid: found 0x101e:0x9010:idx 0:bus 0:slot 9:func 0
scsi0 : Found a MegaRAID controller at 0xd810, IRQ: 17
megaraid: [UF80:1.61] detected 1 logical drives
scsi0 : AMI MegaRAID UF80 254 commands 16 targs 1 chans 8 luns
scsi : 1 host.
scsi0: scanning channel 1 for devices.
scsi0: scanning virtual channel for logical drives.
  Vendor: MegaRAID  Model: LD0 RAID5 35000R  Rev: UF80
  Type:   Direct-Access                      ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 1, id 0, lun 0
SCSI device sda: hdwr sector= 512 bytes. Sectors= 71680000 [35000 MB] [35.0 GB]
 sda: sda1 sda2 sda3 <sda5 sda6 sda7>
(scsi1) <ADAPTEC AIC-7890/1 ULTRA2 SCSI HOST ADAPTER> found at PCI 12/0
(scsi1) Wide Channel, SCSI ID=7, 32/255 SCBs
(scsi1) Downloading sequencer code... 385 instructions downloaded
scsi1 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.23/3.2.4
       <ADAPTEC AIC-7890/1 ULTRA2 SCSI HOST ADAPTER>
scsi : 2 hosts.


From that, it looked like there were two disk controllers. The system was booting from SCSI, that much was apparent. What wasn't apparent was the location of the RAID system.
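As an aside, the size figures on the sda line of that log can be checked with a little arithmetic. The sketch below is only an illustration, not part of anything we ran that night: it copies the sda line (joined onto one line here), pulls out the sector size and sector count, and multiplies them. The helper name reported_size is made up for the example. The result suggests the "35000 MB" figure counts megabytes as 2^20 bytes.

```python
import re

# The sda size line from the boot log, joined onto a single line.
LOG_LINE = ("SCSI device sda: hdwr sector= 512 bytes. "
            "Sectors= 71680000 [35000 MB] [35.0 GB]")


def reported_size(line):
    """Return (sector_bytes, sectors, total_bytes) parsed from the log line.

    reported_size is a hypothetical helper written for this example.
    """
    match = re.search(r"sector=\s*(\d+) bytes\. Sectors=\s*(\d+)", line)
    if match is None:
        raise ValueError("no sector information found")
    sector_bytes = int(match.group(1))
    sectors = int(match.group(2))
    return sector_bytes, sectors, sector_bytes * sectors


sector_bytes, sectors, total = reported_size(LOG_LINE)
size_mib = total / 2**20  # megabytes counted as 2^20 bytes

print(total)     # 36700160000
print(size_mib)  # 35000.0
```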

The BIOS POST also gave the impression of two controllers. We went into the RAID BIOS extension, initialized the RAID controller and drives, and then rebooted the system.

Turns out that the megaraid and the Adaptec SCSI controller are one and the same, and that the system itself (it runs Linux) was booting off the RAID controller!

It is through our mistakes that we learn.

And it is through grovelling that we retain our customers.

Fortunately, the customer didn't lose any important data (the customer wasn't using it fully at the time), nor did he mind that much (“Next time, please consult with me before you do any irreparable configuration changes. Okay?”).

That, and I didn't like the way Linux was installed on the box to begin with.


“Don't Panic!”

While Mark and I were doing a fast recovery of a customer machine, we received a call from John, the paper millionaire of a dotcom company and former member of a Grateful Dead cover band, to say he couldn't get to his servers, located in the very same co-location facility we were currently at.

Mark goes over to John's machines. All servers are up, but he can't ping out. In fact, he can't get past the first hop. Mark then heads over to the core room, I remain in the co-location room, and we all get on a conference call.

Network seems okay—link light is on at both ends of the connection. No traffic. Jiggle the cord. Oh! A few packets. Then major lossage again. Repeat.

John is freaking out because he needs to be on a plane early and it's now 3:30 am or thereabouts. He finally conferences in the main sysadmin for Atlantic Internet because Mark and I can't figure out what's going on.

Neither could the sysadmin. Everything seems okay. Only there's no traffic. John, panicking, is yelling at Mark. Mark is yelling back at John not to panic. Meanwhile we can barely hear the sysadmin over the conference call. Pandemonium reigns.

I quickly grab the network analyzer they have (way too cool) and hook it to John's side of the connection. It lights up like a Christmas tree. Low utilization, high collisions and an even larger rate of errors. I then take the unit to the Atlantic Internet side. Nothing. Normal traffic from John's servers.

We then plug the network analyzer into the Cisco Catalyst 5000 which is serving as the main switch. Actually, it's more like three switched hubs than a real switch—there are 24 ports grouped into three sections. Each section is a hub, but switched between sections.
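To make that layout concrete, here is a rough sketch of which ports would share a collision domain. It assumes the ports are numbered 1 through 24 and that the three sections are equal, eight ports each; neither detail was checked on the actual unit, and the function names are made up for the example.

```python
# Assumption: ports are numbered 1-24 and split evenly, eight per section.
PORTS = 24
SECTIONS = 3
PORTS_PER_SECTION = PORTS // SECTIONS


def section_of(port):
    """Return the zero-based section (hub) a port belongs to."""
    if not 1 <= port <= PORTS:
        raise ValueError(f"port must be between 1 and {PORTS}")
    return (port - 1) // PORTS_PER_SECTION


def share_collision_domain(port_a, port_b):
    """Ports in the same section behave like ports on one hub."""
    return section_of(port_a) == section_of(port_b)


print(share_collision_domain(1, 8))  # True: same section
print(share_collision_domain(8, 9))  # False: switched between sections
```

Under those assumptions, a misbehaving port would be expected to hurt its seven neighbors in the same section more than ports in the other two.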

The network analyzer lights up like a Christmas tree.

The consensus seems to be that the Catalyst is hosed. It probably didn't survive a DoS attack a few days previously and was slowly going bad. So it was some quick work to rerun a few cables to nearby switches and remove the Catalyst from service.

Mark and I didn't leave the office until 5 am.

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, http://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

http://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.
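The day and month forms described above can be sketched in a few lines. This is only an illustration: the function name permalink is made up, and the zero-padded month and day are inferred from the 2000/08/01 example. The "arbitrary portion of time" form is left out because its format isn't spelled out here.

```python
BASE = "http://boston.conman.org/"


def permalink(year, month, day=None):
    """Build a link to one day's entries, or to a whole month if day is omitted.

    permalink is a hypothetical helper; the padding follows the
    2000/08/01 example given above.
    """
    path = f"{year:04d}/{month:02d}"
    if day is not None:
        path += f"/{day:02d}"
    return BASE + path


print(permalink(2000, 8, 1))  # http://boston.conman.org/2000/08/01
print(permalink(2000, 8))     # http://boston.conman.org/2000/08
```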

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2019 by Sean Conner. All Rights Reserved.