Sunday, August 04, 2013
Programs from the past, Part II
I could have sworn I wrote a bit more about a program I wrote while at FAU; specifically, just how long it took to run back then vs now (or rather, a few years ago), but having failed to find any using Google or locally searching the entries on the server, I ended up spending all day yesterday reading through ten years of archives and still not finding anything, I guess I haven't.
Well. [Deep subject. —Editor] [Shut up you! —Sean]
So I wrote a series of programs to search through the domain of the following pair of equations:
xi+1 = (Ayi + B) xi (1 - xi)
yi+1 = (Cxi + D) yi (1 - yi)
The intent was to generate a large number of (x y)
pairs and
plot the results for a given set of values for A
,
B
, C
and D
. Something like:
The top window shows the resulting chaotic attractor when
A
is 2.4375, B
is 1.5624, C
is
-0.8659 and D
is 4.0 as you loop through the above two
equations 15,000 times to generate 15,000 points (all between 0 and 1 for
both the x
and y
axis). And the thing about a
chaotic attractor is, the initial starting point (x0
and y0
) is largely immaterial (but not quite, as the
starting values must be between somewhere between 0 and 1). Any given value
for x0
and y0
will result in
the same figure being drawn for the given values of A
,
B
, C
and D
.
The program shown allows you to change the settings of the four
controlling variables A
, B
, C
and
D
by sliding the named boxes horizontally (the botton one
allows you to slide multiple variables at the same time—in this case,
A
and C
will slide to the left if you slide the
bottom box to the left).
That was fine, but my bosses at the time wanted another view generated—a “map” of the chaotic attractors, if you will. To generate this “map,” first you calculate how many points are generated before you settle into an attractor:
int mainloop(const double A,const double B,const double C,const double D) { double xn,xn1; double yn,yn1; int ix; int iy; int count; bool pix[DIM][DIM]; /* have we seen this (xy) pair yet? */ memset(pix,0,sizeof(pix)); for (count = 0 , xn = yn = 0.5 ; count < MAXLOOP ; count++) { xn1 = ((A * yn) + B) * xn * (1.0 - xn); yn1 = ((C * xn) + D) * yn * (1.0 - yn); xn = xn1; yn = yn1; /* exit if we're outside the bounds */ if (xn < 0.0) return MAXLOOP; if (xn >= 1.0) return MAXLOOP; if (yn < 0.0) return MAXLOOP; if (yn >= 1.0) return MAXLOOP; ix = (int)(xn * DIM); iy = (int)(yn * DIM); if (pix[ix][iy]) break; pix[ix][iy] = true; } return(count); }
The assumption—if this returns MAXLOOP
, there is no
attractor for the given set of values for A
, B
,
C
and D
; othersise it's an indication of how
quickly (or how “showey”) we settle into the pattern.
Then it's a matter of going through a range of values and for the map images, since they're two dimensional, I only sweep through two of the four governing variables over their range:
for (A = -4.0 ; A <= 4.0 ; A += 0.016) for (B = -4.0 ; B <= 4.0 ; B += 0.016) plot(A,B,count_as_color(mainloop(A,B,-0.8659,4.0));
(count_as_color()
just maps a count to a color)
And we end up with an image something like:
In this image, A
is the horizontal axis, going left to right
from -4.0 to 4.0, and B
is the veritcal axis, going, bottom to
top, from -4.0 to 4.0; C
and D
are fixed, at
-0.8659 and 4.0 respectively. The red plus in the upper right hand quadrant
is the location of the “French horn” attractor above. The black area
represents areas that have no chaotic attractor.
But my bosses didn't just want one map—no, they wanted a
larger set. So, I generated a series of images as this, each image having a
different value for C
. And while the map may not change
drastically—for instance, if we bump C up a notch to -0.8499:
The resulting chaotic attractor is completely different:
And I generated a ton of maps by looping through values of C
from -4.0 to 4.0—500 images in total.
Now, when I did this the first time, twenty years (or a bit more—maybe up to twenty-two years ago) I was doing this on a state of the art Unix workstation, an SGI workstation (perhaps you've seen their commerical?). It cost $30,000 and it took a year to generate the 500 images. Each individual image took some ten hours to generate.
Fast forward to today. On a laptop that is maybe two years old now. I reran the code and was able to generate an entire image (in fact, both of the map images above) in only 8 seconds.
With four cores, that means, on a two year old laptop that might have cost $1,500 (tops) would take 17 minutes what it took me a year to generate twenty years ago.
Oh, and about those eight seconds …
Update on August 5th, 2013
Oops, I made a slight mistake …
Update on Tuesday, August 6th, 2013
About those eight seconds
In my previous post, I mentioned a program that took a year to run twenty-some years ago on a then state-of-the-art workstation could now be done in 17 minutes on a two year old laptop.
What I didn't mention is that the program twenty-some years ago was written in C and the program I ran today was written in Lua.
Yes, computers are so fast these days that a scripting language can out-perform computers from two decades ago.
“Okay,” you say. “But I won't want to wait 17 minutes for my data.”
Okay, fine. I see two options, and let's try the first option and one that most people would do—drop down to C. And yes, that does give us an improvement, an impressive improvement—only 2.5 seconds per frame, and across four cores that means you'll have the results in a little over five minutes. Not that bad.
The other option, and hear me out—is to take our Lua code and run it via LuaJIT, a (pretty much) drop in replacement for Lua that compiles down to native code. Even if it's a bit slower than C, it should still be faster than Lua with no code changes.
So how does LuaJIT fare?
Personally, I was expecting the C version (which I actually wrote first) to be faster, if only buy a little bit, but …
So, here's the C version (which generates a single image):
[spc]saltmine:~/source/play>time ./a.out >/dev/null real 0m2.483s user 0m2.470s sys 0m0.000s
And now the LuaJIT version:
[spc]saltmine:~/source/play>time luajit amap.lua >/dev/null real 0m0.849s user 0m0.840s sys 0m0.000s
And no, that's not a mistake (belive me, I checked and rechecked)—here:
[spc]saltmine:~/source/play>time lua amap.lua >/dev/null real 0m8.091s user 0m8.060s sys 0m0.000s [spc]saltmine:~/source/play>time luajit amap.lua >/dev/null real 0m0.856s user 0m0.850s sys 0m0.000s
So … um … I can have that data to you in two minutes? Is that fast enough?
On reflection, it makes sense that LuaJIT will outperform C in this case. It's heavily CPU bound and the fact that the main function:
function mainloop(A,B,C,D) local pix = {} local xn = 0.5 local yn = 0.5 for count = 1 , MAX do local xn1 = ((A * yn) + B) * xn * (1.0 - xn) local yn1 = ((C * xn) + D) * yn * (1.0 - yn) xn = xn1 yn = yn1 if xn < 0 then return MAX-1 end if xn >= 1 then return MAX-1 end if yn < 0 then return MAX-1 end if yn >= 1 then return MAX-1 end local ix = math.floor(xn * DIM) local iy = math.floor(yn * DIM) local f = iy * DIM + ix -- Lua doesn't really do N-dimensional arrays if pix[f] then return count-1 end pix[f] = true end end
can be recompiled per call to take advantage of the paramters. For
instance, when A
and B
are both 0, the first
expression then becomes:
local xn1 = xn * (1.0 - xn)
and given that I'm doing 32,768 interations of this (oh, did I fail to mention that? Yes, I'm doing 27,768 more interations than the code did twenty-some years ago) this does save quite a bit of time.
Update on August 5th, 2013
Oops, I made a slight mistake …
Update on Tuesday, August 6th, 2013
Monday, August 05, 2013
About those eight seconds, part II: mea culpa
The images presented in the previous two posts have been mostly reconstructions. The only piece of software I have from that project twenty-some years ago is the piece that drew the chaotic attractors, and that was specific to the SGI and had to be rewritten (extensively) for a modern system.
The other program—the one that generated the maps? I have no clue as to where that code went. Or rather, why I neglected to keep a copy of it. So I had to basically recreate that program from scratch. The original reconstruction started out as:
int mainloop(const double A,const double B,const double C,const double D) { double xn,xn1; double yn,yn1; int ix; int iy; int count; bool pix[DIM][DIM]; memset(pix,0,sizeof(pix)); for (count = 0 , xn = yn = 0.5 ; count < MAXLOOP ; count++) { xn1 = ((A * yn) + B) * xn * (1.0 - xn); yn1 = ((C * xn) + D) * yn * (1.0 - yn); xn = xn1; yn = yn1;
But since I was setting memory and not plotting points, I needed
to be very careful about the values of xn
and yn
.
I remembered that the values needed to be between 0 and 1, so attempt one I
had:
assert(xn >= 0.0); assert(xn < 1.0); assert(yn >= 0.0); assert(yn < 1.0);
But those assumptions were proven false rather quickly. Attempt two I
decided to print the values of xn
and yn
and as it
happened, very quickly (like on the first pass) the values hit
+inf
. I interpreted that to mean I could bail early,
and thus finished the reconstruction with attempt three:
if (xn < 0.0) return MAXLOOP; if (xn >= 1.0) return MAXLOOP; if (yn < 0.0) return MAXLOOP; if (yn >= 1.0) return MAXLOOP; ix = (int)(xn * DIM); iy = (int)(yn * DIM); if (pix[ix][iy]) break; pix[ix][iy] = true; } return(count); }
But the sharp edges in the maps were bothering me. So much so that I
rewrote the code to continue the loop if the values of xn
and
yn
were out of range, thinking that maybe they might come back
into range.
And yes, they did:
The downside—this revised code now takes a bit longer to run. The LuaJIT version is still faster than the C version, but the difference (on average 1 minute for the LuaJIT version; 1 minute, 30 seconds for the C version) isn't as jaw-droppingly great.
Sigh—it's not quite as fast as I made it out to be.
Update on Tuesday, August 6th, 2013
Yes, it is as fast as I think it was …
Tuesday, August 06, 2013
Programs from the past, Part III: It's a numbers game
I'm still not done yet. When last we left off, the recreation of a project written twenty-some years ago wasn't as fast as I thought. But as I shall reveal, there's a bit more lurking here than I realized.
For a fair comparison, I actually have two versions of a program that generates those map files:
One version in C and one in Lua (previously, the recreated programs just spit out numbers that were then converted to an image with a second program). And here's how long it took to generate the results:
Version | Timings |
---|---|
Original program | ~36,000s |
Lua | 1,096s |
C | 79s |
LuaJIT | 69s |
(remember, the “Original program” was written twenty-some years ago and ran on a 32-bit 33MHz machine; the programs I just wrote are running on a 64-bit 2.8GHz machine)
My initial mistake in recreating this program was to toss out values that exceeded a range and from that, I got fantastic timings, but I was throwing out too much, as can be seen here:
The “too-trimmed” version came about because I looked at some of the intermediate results and saw infinities (well, the IEEE 754's concept of infinity) and at that point, any operations done on “infinity” is “infinity.”
But my mistake was thinking that all values that fell out of range would end up being “infinite.” Not all did, and some even fell back into range.
But I was still troubled by the infinite results. So, why not explicitely check for “infinity?” In Lua/LuaJIT:
if xn == math.huge or xn == -math.huge then return MAX end if yn == math.huge or yn == -math.huge then return MAX end
And in C:
if (xn == HUGE_VAL) return MAX; if (xn == -HUGE_VAL) return MAX; if (yn == HUGE_VAL) return MAX; if (yn == -HUGE_VAL) return MAx;
(for those curious, HUGE_VAL
is defined in
math.h
.)
Adding those lines makes all the difference:
Version | Timings |
---|---|
Original program | N/A |
Lua | 16.25s |
C | 2.60s |
LuaJIT | 1.25s |
It's nice to see LuaJIT still beating the snot out of C, and yes, I was able to generate a full set of maps, just like I did twenty-some years ago, in under three minutes (remember, I'm running the code across four CPUs).
But now I'm upset, because checking for “infinity” was something I didn't do twenty-some years ago, and now I'm thinking, what if I had? Could that simple check, had I known about it, cut the run time of a year down to a month?
I can't blame the university for not offering a class on floating point arithmetic, because it did! And worse, I took the class! (when I entered FAU as a freshman, I knew that the first class for the Computer Science degree involved Fortran. I found the first class that had “Fortran” as part of the title and signed up for it, only to spend find the lectures spending more time on the horrors of floating point arithmetic and very little time on programming. It turned out I took the wrong class; what I signed up for was a Fortran class taught out of the Math Department (the very one I worked for when I wrote these programs twenty-some years ago) for mathematicians. When I discovered the mistake, I was able to get out of the class without issue, but only becuase I was actually doing quite well. In my defense, I was a freshman and didn't have my act together; but FAU didn't exactly have a Computer Science Department at the time, so it didn't have its act together either!). It never dawned on me to even check the intermediate results and bail early.
Sigh.
Thursday, August 08, 2013
World War II according to Hollywood
A wonderful explanation of World War II as done by Hollywood. Indiana Jones keeping the Ark of the Covenant away from the Nazis, or the Japanese forcing Obi-Wan to build a bridge—it's all there.
Programs from the past, Part IV: There Ain't No Such Thing As the Fastest Code
The point I want to make, though, is that the biggest optimization barrier that Terje faced was that he thought he had the fastest code possible. Once he opened up the possibility that there were faster approaches, and looked beyond the specific approach that he had so carefully optimized, he was able to come up with code that was a lot faster. Consider the incongruity of Terje's willingness to consider a 5 percent speedup significant in the light of his later near-doubling of performance.
Michael Abrash, Zen of Code Optimization
I'm probably belaboring the inanimate equus pleonastically, but this isn't the first time I've done so, so why stop now?
As was pointed out to me, I missed an obvious optimization in the map-generation program. That small change drastically changed the runtime of the C version of the program:
Version | New Time | Previous Time |
---|---|---|
Lua | 15.50s | 16.25s |
LuaJIT | 0.65s | 1.25s |
C | 0.22s | 2.60s |
Some other testing revealed that I can now generate in 37 seconds (remember, four cores running a stupidly parallel problem) that twenty-some years ago took a year to do—generate 500 map files.
And speaking of 37 seconds … thanks to the Classic Computers Talk Mailing List, I was able to run this latest code on a nearly identical SGI machine that I used twenty-some years ago—a 33MHz R3000 running IRIX 4.0.5.
Back then, the generation of one map took around ten hours, but I had missed a few optimizations that could have help speed up the program. And yes, it's true—the optimizations I've since added, plus using a high optimization setting on the C compiler, resulted in a map file being generated in 37 seconds.
On a twenty-some year old 33MHz machine.
Head. Desk.
The entire program run, which took a year, could have been done in less than 5¼ hours!
I think it's time to stop belaboring this inanimate equus.
Thursday, August 15, 2013
Nobody expects the Sleep Paralysis Inquisition!
I had fallen asleep at the hotel (oh yes, Bunny and I had to abandon Chez Boca for the night because of plumbing issues—no damage, but no usage of either bathroom) when it felt like Bunny slid into bed behind me. I cracked open an eye, which was tough, as I was very tired, and yet, there she was, in the other bed!
But I swore I could have felt someone—
Nay! What Devilry is this? That was when I felt the bed moving unnaturally behind me, and lo, above my head, jutting out just far enough for me to see, was the metal leg of the bed looming. Further more, said Devilry had paralyzed me.
I attempted to scream. “BUNNY! BUNNY!” But all that was coming out was “Bunny. Bunny.”. But she was unmoving, still consumed by her book. From what I could see, all she was hearing was “Zzzzzzzzzzz.”
I just knew the bed was gyring and gimbling behind me, waiting, plotting, to suck me into the Abyss. I needed to flee. Struggling against paralysis, I was able to stand up, but the bed covers were smothering me, entangling me as I attempted to move. I fell flat, face first onto the floor, which surprisingly, did not hurt. But it did knock the breath out of me and I passed out.
When I came to, I was still in the bed. Bunny was still reading her book, having known not of my plight with the Hellmouth forming behind me, nor of my failed attempt to escape it.
It was then I realized the Devilry for what it was—sleep paralysis. I may have to rethink what triggers my sleep paralysis attacks, since I was not napping when this happened.
Update on Saturday, August 17th, 2013
In case you are curious as to the plumbing issue at Chez Boca, the main drainage pipe under the house was blocked up. The services of a plumber were required to clear the clog, and even with the expense of a hotel room for the night, it was still cheaper to have the plumber do the work the next day than to pay the “Emergency Rate” Wednesday night.
As Bunny says, “the joys of being a land baron!”
Apparently, I have products!
- From
- "Dr. Glidden Kennedy" <cucdtk.tp@hoabinh.gov.vn>
- To
- (Obviously not me) cucdtk.tp@hoabinh.gov.vn
- Subject
- Having gone through your listed products
- Date
- Thu, 15 Aug 2013 20:48:31 +0700 (ICT)
Good Day ,
Having gone through your listed products, we offer great interest to do a purchase agreement with your company. Please get back to us with the following information Please Quote.
1. Prices FOB
2. Payment terms
3. Delivery Period
[There is no 4. Spoons! –Sean]
5. Specified delivery date assuming from the Date of Order.Please your quick reply will be highly appreciated to our private Email (ptamex412@yahoo.com)
Regards,
Mr. Woody Allen
Purchasing Operations Manager.
PT AMEX Corporation
I have listed products to offer?
Really? [Yes, your Amazon affiliate links. –Editor]
Oh, yeah.
But that aside, what's the angle here? I mean, random spamming of companies looking to buy unspecified items? Um … perhaps inserting themselves as middle men, advertising your products as theirs, passing on the orders they get to you to fulfill, and charging their “customers” more money?
Perhaps?
Either that, or Mr. Allen is really desperate for money.
And assuming he is, Mr. Allen, here are my answers:
- FOB EXW Boca Raton (Incoterms 2010).
- Cash up front, no credit. Payable as US dollars or an equivalent amount in gold or silver.
- Please allow four to six weeks for delivery.
- We don't talk about Flight Club (think happy thoughts—I'm expecting only four people to get this).
- Um, what part of “Please allow four to six weeks for delivery” did you not understand?
That should work.