Monday, Debtember 08, 2003
The Joys of taking over an existing server
“This will be a great job once I get these servers configured correctly.”
“This will be a great job once I get these servers configured correctly.”
“This will be a great job once I get these servers configured correctly.”
We still don't know why the one server is crashing. It went down three times today (well, technically Sunday, as it's now 1:30 am Monday morning as I type this), and nothing was visible on the screen, because Linux probably has some setting deep in the kernel to blank the screen after umpteen minutes of inactivity, so the cause of the problem is never seen. That is, if anything at all is written to the console when the machine crashes (or, actually, just prior).
So I was tasked with moving the websites (some 1,000) off the dying server onto a backup server, but I couldn't start until I got home at around 10:00 pm. I didn't think it would be all that bad; rsync is your friend and all that. I was hoping this wouldn't take more than an hour since I have to be up and ready to go by 9:00 am Monday morning (this morning).
So why am I still up at 1:30 am?
Because the backup server is not configured exactly like the primary server. You see, there are over 1,000 accounts (one for each website) on the primary machine, and only about 150 on the secondary machine. To make matters worse, some accounts exist on both, but their numeric IDs don't match (with the upshot that copied files won't be assigned their correct owners)!
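For what it's worth, here is a minimal sketch of how one might spot the problem accounts ahead of time. It assumes you have a copy of each machine's /etc/passwd as text; the function names and the sample entries are made up for illustration and are not anything from the actual servers.

```python
def parse_passwd(text):
    """Map account name -> numeric uid from /etc/passwd-style text."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # not a well-formed passwd entry
        uids[fields[0]] = int(fields[2])
    return uids


def compare_accounts(primary, secondary):
    """Return (accounts missing on secondary, accounts whose uids differ)."""
    missing = sorted(set(primary) - set(secondary))
    mismatched = {
        name: (primary[name], secondary[name])
        for name in primary
        if name in secondary and primary[name] != secondary[name]
    }
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two passwd files.
    primary_text = "site1:x:501:501::/home/site1:/bin/sh\nsite2:x:502:502::/home/site2:/bin/sh\n"
    secondary_text = "site1:x:640:640::/home/site1:/bin/sh\n"
    missing, mismatched = compare_accounts(
        parse_passwd(primary_text), parse_passwd(secondary_text)
    )
    print("missing on secondary:", missing)
    print("uid mismatches (primary, secondary):", mismatched)
```

The accounts in the first list need to be created on the backup machine, and the ones in the second need their IDs reconciled (or the files re-owned by name) before the copied files end up with the right owners.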
Lovely!
“This will be a great job once I get these servers configured correctly.”
“This will be a great job once I get these servers configured correctly.”
“This will be a great job once I get these servers configured correctly.”
Installation party
Well, I did not go to Miami today because of server problems last night (or technically, early this morning). The purpose of the trip to Miami was to retrieve two (of the four) servers I admin in order to install Gentoo and get rid of this silliness called RedHat.
The other admin ended up going down to Miami anyway and delivered the servers to my doorstep. Later on in the evening, Mark came over to help me with my first Gentoo installation. Gentoo is pretty neat. A “stage 1” installation (which we did) took several hours to perform, as it installs a base configuration, then downloads and recompiles everything (using compiler options specific to the particular architecture for best performance). You can also specify what you want and, more importantly, what you don't want.
Another interesting feature is that the base system allows you to log in via ssh, so Mark and I spent most of the time outside in the courtyard watching the installation via the wireless network here and discussing various issues.
One of which was the constantly crashing server. Mark mentioned that he had encountered a similar problem on a friend's webserver, due to the web log files never being rotated. Well, the server from hell has that problem in spades: over 1,000 sites and none of the logs have ever been trimmed. And we're talking both the access_log and the error_log files.
Now, I had discussed the error_log situation with the client: namely, does each site really require its own error_log? The client agreed with me that no, each site did not need said file. So after Mark left, I proceeded to nuke all the error_log files (since really, it's only used to debug CGI scripts and even then, that's not a common thing). That alone cleared up some 12 gigs of disk space. I then rotated all the access_log files so now hopefully that server won't crash.
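A rough sketch of the sort of thing involved, not what I actually ran: walk a directory tree, total up how much space the access_log and error_log files are eating, and rotate a log by renaming it and leaving an empty file behind. The directory layout and the function names here are assumptions for illustration only.

```python
import os

LOG_NAMES = ("access_log", "error_log")


def log_sizes(root):
    """Total bytes used by each kind of log file found anywhere under root."""
    totals = {name: 0 for name in LOG_NAMES}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name in totals:
                totals[name] += os.path.getsize(os.path.join(dirpath, name))
    return totals


def rotate(path):
    """Rename path to path.1 and leave a fresh, empty file in its place."""
    os.replace(path, path + ".1")
    open(path, "w").close()


if __name__ == "__main__":
    # "." is a stand-in; point this at wherever the sites actually live.
    for name, size in log_sizes(".").items():
        print(f"{name}: {size / 1e9:.2f} GB")
```

One caveat with a rename like this: a running web server keeps its log files open, so it will carry on writing to the renamed file until it is restarted or told to reopen its logs.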