Thursday, February 17, 2005
Uh oh …
My other boss, R, wanted a new site added to the server down in Miami. No problem, it's not something I haven't done before. So I go through the steps needed to add the site, and in this case, the site requires its own IP address (because of the group it's in). Not a problem: just pick one of the currently unused addresses, add it to the network interface and there we go. Done. I log out to do other things (like post pictures of Casa New Jersey).
Well, there's a slight problem. The Miami server is also the one that my sites are stored on, and when I went to log back in when trying to post the previous entry, I got an error I've never seen before:
ssh_exchange_identification: Connection closed by remote host
Uh oh.
I can pull up websites, so that's fine. But ssh? FTP?
Nope. Can't log in.
I make multiple attempts from several different locations across the Internet, but the server is not letting me log in.
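For what it's worth, that error generally means the TCP connection was accepted and then closed before the server sent its SSH identification line (the "SSH-2.0-…" banner the server normally sends as soon as you connect). Here is a minimal Python sketch for checking that from any machine without an ssh client. The function name `read_ssh_banner` is made up for illustration, and reading only the first line the server sends is a simplification.

```python
import socket


def read_ssh_banner(host, port=22, timeout=5.0):
    """Return the first line the server sends, or None if the remote
    side closes the connection before sending anything.

    A refused connection or a timeout raises an OSError instead, which
    distinguishes "nothing is listening" from "closed by remote host".
    """
    data = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            # The server speaks first, so just read until end of line.
            while not data.endswith(b"\n") and len(data) < 255:
                chunk = sock.recv(255 - len(data))
                if not chunk:  # EOF: the remote side closed the connection
                    break
                data += chunk
        except ConnectionResetError:
            # An abrupt close counts as "closed by remote host" too.
            pass
    banner = data.decode("ascii", errors="replace").strip()
    return banner or None
```

Calling `read_ssh_banner("server.example.net")` (a placeholder hostname) against a healthy server should return its identification line. A return value of `None` matches the symptom above: the port is open, but the daemon hangs up before saying anything.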
I do a Google search on the error, and it's heartening to see that this isn't a rare problem at all. But further reading doesn't reveal a solution to my particular problem. No, I didn't change /etc/hosts.allow or /etc/hosts.deny, and it was working earlier when I added the site and IP address …
IP address. I added an IP address to the network interface. And I'm guessing that was enough to throw sshd into suspicious mode and refuse logins.
Well, fine. I knew the solution: restart sshd (and the FTP server; apparently that was having fits too). But how? I can't log in.
Ah, but there is the power pole. Log in to the power pole (through a <cough> <cough>control panel<cough> <cough>) and power cycle the outlet the server is on. Not a great solution, but hey, it'll save me a trip down to Miami.
So I call C, who is responsible for the power pole setup. But well … there's a snag. Since the last debacle the IP address of the power pole was supposed to be updated, but he wasn't sure if it was.
Great.
But C needs to get with our network engineer to actually determine that, so until tomorrow (well, later today) no one will be able to log in to the server.
Sigh.
So that didn't work
Yes, the powerbar was reconfigured with a new IP address.
No, we're not sure which outlet the server is plugged into.
Well, not 100% sure.
The first time I used the power pole to remotely power cycle a server, I brought down the entire cabinet in Miami. The power pole <cough> <cough>control panel<cough> <cough> gives you several options: immediately shut off an outlet, immediately turn on an outlet, or immediately reboot an outlet. I was told the server in question was in outlet 1, so I immediately shut off that outlet.
And turned off the network switch.
The network switch that the power pole was plugged into.
No way to turn on outlet 1.
Well …
Live and learn (always use the “reboot” option).
C was “pretty sure” that the server was in either outlet 2 or 3, but not 1. Nor 4. After half an hour of going round and round, my final instructions were to “reboot” outlet 2, and if that didn't reboot the server, then try outlet 3. If that didn't work, then I was facing a long drive to Miami.
So I started pinging the server.
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=1 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=2 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=3 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=4 ttl=51 time=36.0 ms
And rebooted outlet 2. The <cough> <cough>control panel<cough> <cough> showed the outlet as “Off” and about ten seconds later, the outlet was back on. But during the “reboot” I kept getting pings from the server.
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=15 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=16 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=17 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=18 ttl=51 time=36.0 ms
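Eyeballing the sequence numbers works, but the same check can be sketched in a few lines of Python: pull every `icmp_seq` value out of the captured ping output and report any numbers missing between the first and last reply. The function name `missing_sequences` and the sample address are made up for illustration.

```python
import re

# Matches the sequence number in lines such as
# "64 bytes from 192.0.2.1: icmp_seq=15 ttl=51 time=36.0 ms"
_SEQ = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(ping_output):
    """Return the icmp_seq numbers missing between the first and last reply."""
    seen = {int(m.group(1)) for m in _SEQ.finditer(ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


sample = (
    "64 bytes from 192.0.2.1: icmp_seq=15 ttl=51 time=36.0 ms\n"
    "64 bytes from 192.0.2.1: icmp_seq=16 ttl=51 time=36.0 ms\n"
    "64 bytes from 192.0.2.1: icmp_seq=17 ttl=51 time=36.0 ms\n"
    "64 bytes from 192.0.2.1: icmp_seq=18 ttl=51 time=36.0 ms\n"
)
print(missing_sequences(sample))  # prints [] (no replies were dropped)
```

An empty list means the box never stopped answering, so the outlet that was cycled wasn't the one feeding it. One caveat: a server that goes down and never comes back leaves no gap at all; the replies simply stop.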
Okay, that wasn't it. Try outlet 3.
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=22 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=23 ttl=51 time=36.0 ms
Okay, that seems to have been the right outlet. Okay, power restored to the outlet. Wait about a minute for the server to come back up.
Okay, I don't recall it taking this long …
Three minutes?
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=22 ttl=51 time=36.0 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=23 ttl=51 time=36.0 ms
Not good.
Look at the time.
4:52 pm.
Sigh.
I get to drive to Miami, during rush hour traffic.
Drive. Punch. Drive.
What's there to say about the drive to Miami? An hour and a half in heavy traffic to the NAP of the Americas, getting there just as the sun was setting (in an “economically challenged neighborhood” as the politically correct would say), ten minutes to get to the cabinet, press the power switch. Ten minutes to get back to the car, and an hour in (still) heavy traffic to get home.