## Sunday, August 04, 2013

### Programs from the past, Part II

I could have sworn I wrote a bit more about a program I wrote while at FAU;
specifically, just how long it took to run back then vs now (or rather, a
few years ago), but having failed to find any using Google or locally searching the entries on
the server, I ended up spending all day yesterday *reading* through
ten years of archives and *still* not finding anything, I guess I
haven't.

Well. [Deep subject. —Editor] [Shut up you! —Sean]

So I wrote a series of programs to search through the domain of the following pair of equations:

x_{i+1} = (Ay_{i} + B) x_{i} (1 - x_{i})

y_{i+1} = (Cx_{i} + D) y_{i} (1 - y_{i})

The intent was to generate a large number of `(x y)`

pairs and
plot the results for a given set of values for `A`

,
`B`

, `C`

and `D`

. Something like:

The top window shows the resulting chaotic attractor when
`A`

is 2.4375, `B`

is 1.5624, `C`

is
-0.8659 and `D`

is 4.0 as you loop through the above two
equations 15,000 times to generate 15,000 points (all between 0 and 1 for
both the `x`

and `y`

axis). And the thing about a
chaotic attractor is, the initial starting point (`x`

and _{0}`y`

) is largely immaterial (but not quite, as the
starting values must be between somewhere between 0 and 1). Any given value
for _{0}`x`

and _{0}`y`

will result in
the same figure being drawn for the given values of _{0}`A`

,
`B`

, `C`

and `D`

.

The program shown allows you to change the settings of the four
controlling variables `A`

, `B`

, `C`

and
`D`

by sliding the named boxes horizontally (the botton one
allows you to slide multiple variables at the same time—in this case,
`A`

and `C`

will slide to the left if you slide the
bottom box to the left).

That was fine, but my bosses at the time wanted another view generated—a “map” of the chaotic attractors, if you will. To generate this “map,” first you calculate how many points are generated before you settle into an attractor:

int mainloop(const double A,const double B,const double C,const double D) { double xn,xn1; double yn,yn1; int ix; int iy; int count; bool pix[DIM][DIM]; /* have we seen this (xy) pair yet? */ memset(pix,0,sizeof(pix)); for (count = 0 , xn = yn = 0.5 ; count < MAXLOOP ; count++) { xn1 = ((A * yn) + B) * xn * (1.0 - xn); yn1 = ((C * xn) + D) * yn * (1.0 - yn); xn = xn1; yn = yn1; /* exit if we're outside the bounds */ if (xn < 0.0) return MAXLOOP; if (xn >= 1.0) return MAXLOOP; if (yn < 0.0) return MAXLOOP; if (yn >= 1.0) return MAXLOOP; ix = (int)(xn * DIM); iy = (int)(yn * DIM); if (pix[ix][iy]) break; pix[ix][iy] = true; } return(count); }

The assumption—if this returns `MAXLOOP`

, there is no
attractor for the given set of values for `A`

, `B`

,
`C`

and `D`

; othersise it's an indication of how
quickly (or how “showey”) we settle into the pattern.

Then it's a matter of going through a range of values and for the map images, since they're two dimensional, I only sweep through two of the four governing variables over their range:

for (A = -4.0 ; A <= 4.0 ; A += 0.016) for (B = -4.0 ; B <= 4.0 ; B += 0.016) plot(A,B,count_as_color(mainloop(A,B,-0.8659,4.0));

(`count_as_color()`

just maps a count to a color)

And we end up with an image something like:

In this image, `A`

is the horizontal axis, going left to right
from -4.0 to 4.0, and `B`

is the veritcal axis, going, bottom to
top, from -4.0 to 4.0; `C`

and `D`

are fixed, at
-0.8659 and 4.0 respectively. The red plus in the upper right hand quadrant
is the location of the “French horn” attractor above. The black area
represents areas that have no chaotic attractor.

But my bosses didn't just want *one* map—no, they wanted a
larger set. So, I generated a series of images as this, each image having a
different value for `C`

. And while the map may not change
drastically—for instance, if we bump C up a notch to -0.8499:

The resulting chaotic attractor is completely different:

And I generated a ton of maps by looping through values of `C`

from -4.0 to 4.0—500 images in total.

Now, when I did this the first time, twenty years (or a bit more—maybe
up to twenty-two years ago) I was doing this on a state of the art Unix
workstation, an SGI workstation
(perhaps you've seen their commerical?). It cost
$30,000 and it took a *year* to generate the 500 images. Each
individual image took some *ten hours* to generate.

Fast forward to today. On a laptop that is maybe two years old now. I reran the code and was able to generate an entire image (in fact, both of the map images above) in only 8 seconds.

With four cores, that means, on a two year old laptop that might have
cost $1,500 (tops) would take *17 minutes* what it took me a
*year* to generate twenty years ago.

Oh, and about those eight seconds …

#### Update on August 5^{th}, 2013

Oops, I made a slight mistake …

#### Update on Tuesday, August 6^{th}, 2013

### About those eight seconds

In my previous post, I mentioned a program that took a year to run twenty-some years ago on a then state-of-the-art workstation could now be done in 17 minutes on a two year old laptop.

What I didn't mention is that the program twenty-some years ago was written in C and the program I ran today was written in Lua.

Yes, computers are *so fast* these days that a *scripting language* can
out-perform computers from two decades ago.

“Okay,” you say. “But I won't *want* to wait 17 minutes for my
data.”

Okay, fine. I see two options, and let's try the first option and one that most people would do—drop down to C. And yes, that does give us an improvement, an impressive improvement—only 2.5 seconds per frame, and across four cores that means you'll have the results in a little over five minutes. Not that bad.

The other option, and hear me out—is to take our Lua code and run it via LuaJIT, a (pretty much) drop in replacement for Lua that compiles down to native code. Even if it's a bit slower than C, it should still be faster than Lua with no code changes.

So how does LuaJIT fare?

Personally, I was expecting the C version (which I actually wrote first) to be faster, if only buy a little bit, but …

So, here's the C version (which generates a single image):

[spc]saltmine:~/source/play>time ./a.out >/dev/null real 0m2.483s user 0m2.470s sys 0m0.000s

And now the LuaJIT version:

[spc]saltmine:~/source/play>time luajit amap.lua >/dev/null real 0m0.849s user 0m0.840s sys 0m0.000s

And no, that's *not* a mistake (belive me, I checked and
rechecked)—here:

[spc]saltmine:~/source/play>time lua amap.lua >/dev/null real 0m8.091s user 0m8.060s sys 0m0.000s [spc]saltmine:~/source/play>time luajit amap.lua >/dev/null real 0m0.856s user 0m0.850s sys 0m0.000s

So … um … I can have that data to you in two minutes? Is that fast enough?

On reflection, it makes sense that LuaJIT will outperform C in this case. It's heavily CPU bound and the fact that the main function:

function mainloop(A,B,C,D) local pix = {} local xn = 0.5 local yn = 0.5 for count = 1 , MAX do local xn1 = ((A * yn) + B) * xn * (1.0 - xn) local yn1 = ((C * xn) + D) * yn * (1.0 - yn) xn = xn1 yn = yn1 if xn < 0 then return MAX-1 end if xn >= 1 then return MAX-1 end if yn < 0 then return MAX-1 end if yn >= 1 then return MAX-1 end local ix = math.floor(xn * DIM) local iy = math.floor(yn * DIM) local f = iy * DIM + ix -- Lua doesn't really do N-dimensional arrays if pix[f] then return count-1 end pix[f] = true end end

can be recompiled per call to take advantage of the paramters. For
instance, when `A`

and `B`

are both 0, the first
expression then becomes:

local xn1 = xn * (1.0 - xn)

and given that I'm doing 32,768 interations of this (oh, did I fail to mention that? Yes, I'm doing 27,768 more interations than the code did twenty-some years ago) this does save quite a bit of time.

#### Update on August 5^{th}, 2013

Oops, I made a slight mistake …