Tuesday, February 08, 2005
My life in a bad Star Trek episode
Ah, “Project White Elephant.” Haven't talked about that in a while, but then, I haven't had much involvement with it for a while.
Until today.
The current sub-project is pretty straightforward. We have two machines, let's call them chip and dale (again, the names have been changed to protect the guilty me, but thematically, the pseudonyms are close enough). chip is the primary machine, which handles email, the websites, DNS, what have you, for “Project White Elephant.” dale is there solely to pick up if (or when) chip dies.
Now, chip has an IP address of C, and dale has an IP address of D. The services, DNS, SMTP, POP, IMAP and HTTP (among others), are not bound to address C or D, but to V. chip's network card is programmed to listen to V, but in the event that chip goes down, dale will then configure its card to listen on V and take over DNS, SMTP, POP, IMAP and HTTP (among others).
Not trivial, but doable and the details can be a bit tedious.
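To make the takeover logic concrete, here is a minimal sketch of the decision dale has to make. It is not what we actually run; the class name, the three-strikes threshold and the action strings are all made up for illustration, and I'm assuming dale never hands V back on its own (a human does that).

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Runs on the standby (dale) and decides when to claim the shared address V."""

    max_failures: int = 3   # hypothetical: consecutive missed checks before takeover
    failures: int = 0       # consecutive failed health checks seen so far
    holds_v: bool = False   # True once the standby has claimed V

    def observe(self, primary_alive: bool) -> str:
        """Record one health check of the primary (chip) and return an action."""
        if primary_alive:
            self.failures = 0
            # Assumption: no automatic failback; once claimed, V stays put.
            return "keep V" if self.holds_v else "stay idle"
        self.failures += 1
        if not self.holds_v and self.failures >= self.max_failures:
            self.holds_v = True
            return "claim V and start services"
        return "keep V" if self.holds_v else "stay idle"
```

Requiring several missed checks in a row keeps a single dropped packet from triggering a takeover. How the health check itself is done, and how V actually gets bound to the card, is left out on purpose.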
Simplifying things a bit, you need to make sure that the configuration of all the services on chip is copied over to dale and that you can start the services on dale without error. Oh, you also need to replicate any data files (websites, email, etc.) from chip to dale.
But …
Sadly, we're using Blech, a … <shudder> …
control panel (yes, yes, I know … I said we weren't going to use a control panel for “Project
White Elephant”—that order has since been rescinded—sigh).
Sure, a control panel makes simple yet tedious operations, such as configuring a new site on the system, easy and relatively painless. But attempt anything outside the prescribed set of procedures and it ends up being either too difficult or outright impossible, even though it would be perfectly possible without the constraints of a control panel.
Today's particular problem had to do with site replication from chip to dale. We could configure the site on chip, and it even got pushed out to dale, but the configuration of the webserver (Apache) wasn't replicated properly and sites pushed out to dale would end up with an IP address of D and not V.
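A check for this kind of mistake can be sketched in a few lines. This is only an illustration, assuming the sites are defined with ordinary `<VirtualHost addr:port>` blocks using IPv4 addresses; the function name is hypothetical and the addresses in the usage example are documentation placeholders, not the real C, D or V.

```python
import re

# Matches the address part of an Apache "<VirtualHost addr:port>" line.
# Assumption: plain IPv4 addresses or names, no bracketed IPv6 literals.
VHOST_RE = re.compile(r"<VirtualHost\s+([^:>\s]+)(?::\d+)?\s*>", re.IGNORECASE)


def misbound_vhosts(config_text: str, shared_addr: str) -> list[str]:
    """Return the address of every virtual host not bound to the shared address."""
    return [addr for addr in VHOST_RE.findall(config_text) if addr != shared_addr]


if __name__ == "__main__":
    sample = """
    <VirtualHost 192.0.2.20:80>
        ServerName www.example.com
    </VirtualHost>
    <VirtualHost 192.0.2.100:80>
        ServerName mail.example.com
    </VirtualHost>
    """
    # Stand-ins: 192.0.2.20 plays the role of D, 192.0.2.100 the role of V.
    print(misbound_vhosts(sample, "192.0.2.100"))
```

Run against the configuration that landed on dale, anything it reports is a site that would not answer on V after a switch.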
I was on the phone with one of our contacts in “Project White Elephant” and I swear, the phone conversation was straight out of a bad Star Trek episode:
“Okay, we can set up the routers to preferentially route V to chip and then if it goes down, switch over to dale.”
“Wouldn't that require the configuration of RIP on the servers to initiate the router switch from chip to dale?”
“Yes, you're right. We can't do that. How about creating a special instance of DNS on dale such that if chip goes down, dale
picks up, and—”
“Terrible idea. Each DNS change on chip requires tracking that and updating the private copy on dale.”
“Sure, but it can be scripted.”
“But Doctor, you miss the point. Zone A has a serial number of N. chip goes down. dale takes over, making sure to update the serial number of A to N plus 1. chip then comes back up and starts serving out zone A with a serial number of N.”
“And of course, other DNS servers would ignore that since it's an older serial number, meaning—”
“We'd have to update the zones back into chip such that Blech will accept it, and remember, Blech stores everything in a database and overwrites the DNS configuration
files—”
“Meaning we'd have to script the changes back into the database or manually update the information through Blech.”
“Exactly.”
“Okay, what about not running Blech on dale? Then just copy the sites, email and zones over?”
“Then you would have to either translate the configuration from chip to the non-Blech configuration we set up on dale, which means that K [being the
admin who is responsible for running these boxes and can't use the command
line—don't ask] won't be able to handle that box. Or we
set it up just like Blech.”
“Dash it all! Okay, what about reversing the polarity of the flux capacitor and letting the backwash flow into the Jefferies tubes?”
“Nice in theory, but you know what they say about theory and practice, right?”
“ ‘In theory, there is no difference between theory and practice, but in practice, there is.’ ”
“Exactly. Do that, and you run the risk of the back pressure rupturing the Jefferies tubes, and let's not even get into the problem of stuck bits on the condenser plate if you reverse the polarity of the flux capacitor.”
“You're right! I forgot about that!”
Only the conversation was much longer. And not as interesting.
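For the record, the serial number trap from that exchange can be sketched like this. The comparison follows the serial number arithmetic of RFC 1982, which is how zone serials are meant to be compared; the function name and the sample serial are made up for illustration.

```python
def serial_newer(a: int, b: int) -> bool:
    """True if 32-bit zone serial a is newer than b (RFC 1982 serial arithmetic)."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    # a is newer when the wrapped difference is non-zero and under 2**31.
    # A difference of exactly 2**31 is undefined in the RFC; treated as "not newer".
    return a != b and ((a - b) & 0xFFFFFFFF) < 0x80000000


if __name__ == "__main__":
    n = 2005020801                    # hypothetical serial N for zone A
    held = n                          # what the other DNS servers hold
    if serial_newer(n + 1, held):     # dale takes over and serves N plus 1
        held = n + 1
    # chip comes back and offers N again: older than what is held, so ignored.
    print(serial_newer(n, held))
```

Which is the whole problem: once dale has bumped the serial, anything chip serves under the old number gets ignored until the zone on chip is brought past N plus 1.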
In the end, the only real sticking point was the websites. DNS isn't a problem if we can assign the services to address V. SMTP isn't a problem since you can set the MX records for incoming email to do the right thing. And since we did find a way to replicate the existing mailboxes between the two machines (which doesn't impinge on any configurations), and assuming we can get the IP address switchover working, POP and IMAP aren't real issues (and in that regard, Blech does seem to replicate most of the site data between chip and dale, stuff like users and whatnot). That leaves HTTP. And after a quick test of simply copying the web server configuration, plus the ability to start and stop the webserver via the command line (which amazingly enough is a standard command!), I think we can pull this off.
Now if only we could do something about those stuck bits on the condenser plate …