Tuesday, July 10, 2007
They certainly know how to time power outtages around here
Gotta love Lake Worthless Utilities.
Sometime this morning, I awoke to the squealing of a hungry UPS—it apparently had been squealing for sometime as it was both loud and frequent. I rolled out of bed and began the process of blindly shutting down my various computers. And if I do this in the wrong order, they might not shut down in time due to intercomputer dependencies (gotta love networked drives).
Five minutes after shutting everything down, the power is restored.
Sigh.
I then began the process of turning on my various computers …
There's a reason why a feature that takes five minutes to implement takes ten hours to code …
It seemed like an easy feature to add to the Quick and Dirty B-Movie Plot Generator. All I wanted was the ability to bookmark a particular plot.
What I didn't realize was that the method I used, saving the random number seed, was completely and utterly the wrong thing to do, and that the feature isn't quite as easy to implement as I first thought.
Easy to implement features, aren't.
My first clue? I “bookmarked” a particular plot on my development server:
Sheffie Les Mastenbrook is a mute world-famous card sharp from a doomed world. Susette Marie-Jeanne Sanford-Wright is a virginal green-skinned werewolf with nothing left to lose. And together, they must defuse a bomb in less than a day that's a thousand miles away.
(the bold parts? I'll get to that—and by the way, this is plot #1134)
Later on, I decided to check on the “bookmarked” plot on my deployment server, and got:
Sheffie Les Mastenbrook is a fast talking hunchbacked ad-exec on the run. Susette Marie-Jeanne Sanford-Wright is an orphaned short-sighted boxer with the power to bend men's minds. And together, they must fight crime.
Oops. (Bold text denotes differences between the two versions)
It seems that the random number generator between the two servers isn't quite the same; enough to get the names the same (which are the first things to be generated) but diverge right after that. So, if I wish to keep the URLs valid, I can't change servers, or even update the current server!
From this, I realized another problem—I can't change the data files the program uses! This stems from how I use the random numbers I get, and for this, I need a slight digression.
In C (which is what the program is written in), I call
rand()
to get a random number. rand()
takes no
parameters, and returns an integer between 0 and some upper bound
RAND_MAX
(which is a system and compiler dependant value). The
man page (on my development server, which is an older Linux distribution)
has this to say on using rand()
(sad to say, but the man page
on current Linux distributions excludes this excellent bit of advice):
The versions of
rand()
andsrand()
in the Linux C Library use the same random number generator asrandom()
andsrandom()
, so the lower-order bits should be as random as the higher-order bits. However, on olderrand()
implementations, the lower-order bits are much less random than the higher-order bits.In Numerical Recipes in C: The Art of Scientific Computing (William H. Press, Brian P. Flannery, Saul A. Teukolsky, William T. Vetterling; New York: Cambridge University Press, 1990 (1st ed, p. 207)), the following comments are made:
If you want to generate a random integer between 1 and 10, you should always do it by
j=1+(int) (10.0*rand()/(RAND_MAX+1.0));and never by anything resembling
j=1+((int) (1000000.0*rand()) % 10);(which uses lower-order bits).
Random-number generation is a complex topic. The Numerical Recipes in C book (see reference above) provides an excellent discussion of practical random-number generation issues in Chapter 7 (Random Numbers).
For a more theoretical discussion which also covers many practical issues in depth, please see Chapter 3 (Random Numbers) in Donald E. Knuth's The Art of Computer Programming, volume 2 (Seminumerical Algorithms), 2nd ed.; Reading, Massachusetts: Addison-Wesley Publishing Company, 1981.
Since I'm picking a random entry from N items, I basically do
n = (int)(((double)rand() / (double)RAND_MAX) * (double)N);
The upshot?
I add an element to the list, the value of N
changes, and
for a given output of rand()
you get a diffent answer than with
the previous, smaller N
.
And it doesn't matter where I add an item. Add a new occupation for our characters at the end of the list (on the development server) and for plot #1134 you now get:
Sheffie Les Mastenbrook is a mute world-famous inventor from a doomed world. Susette Marie-Jeanne Sanford-Wright is a virginal green-skinned vampire hunter with nothing left to lose. And together, they must defuse a bomb in less than a day that's a thousand miles away.
This “easy feature” is turning out not to be so easy. And this on a simple program. It's stuff like this that makes programming take longer than expected.
The proper solution is to record the fifteen random numbers generated and make that part of the URL, and make sure that additions to the various lists appear at the end, but we finally get “bookmarkable” plots.
Oh, I just realized—only to lose the “bookmarked plots” if the template (and thus, the number of random numbers selected) changes.
Oh, bother.