Sunday, August 04, 2013
About those eight seconds
In my previous post, I mentioned that a program that took a year to run twenty-some years ago on a then state-of-the-art workstation could now be run in 17 minutes on a two-year-old laptop.
What I didn't mention is that the program twenty-some years ago was written in C and the program I ran today was written in Lua.
Yes, computers are so fast these days that a scripting language can outperform C running on a computer from two decades ago.
“Okay,” you say. “But I won't want to wait 17 minutes for my data.”
Okay, fine. I see two options. Let's try the first, the one most people would reach for: drop down to C. And yes, that does give us an improvement, an impressive one: only 2.5 seconds per frame, and across four cores that means you'll have the results in a little over five minutes. Not that bad.
The other option, and hear me out, is to take our Lua code and run it via LuaJIT, a (pretty much) drop-in replacement for Lua that compiles down to native code. Even if it's a bit slower than C, it should still be faster than Lua with no code changes.
So how does LuaJIT fare?
Personally, I was expecting the C version (which I actually wrote first) to be faster, if only by a little bit, but …
So, here's the C version (which generates a single image):
[spc]saltmine:~/source/play>time ./a.out >/dev/null

real    0m2.483s
user    0m2.470s
sys     0m0.000s
And now the LuaJIT version:
[spc]saltmine:~/source/play>time luajit amap.lua >/dev/null

real    0m0.849s
user    0m0.840s
sys     0m0.000s
![Yeah, that's what I did when I saw these results](/2013/08/04/jawdrop.jpg)
And no, that's not a mistake (believe me, I checked and rechecked). Here:
[spc]saltmine:~/source/play>time lua amap.lua >/dev/null

real    0m8.091s
user    0m8.060s
sys     0m0.000s

[spc]saltmine:~/source/play>time luajit amap.lua >/dev/null

real    0m0.856s
user    0m0.850s
sys     0m0.000s
So … um … I can have that data to you in two minutes? Is that fast enough?
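For the curious, here is a rough sketch of where "a little over five minutes" and "two minutes" come from. It assumes (and this is only an assumption) that the 17-minute total corresponds to the plain Lua timing and that the total scales linearly with the time per frame. The names `per_frame`, `BASELINE_MINUTES`, and `estimate_minutes` are made up for this illustration.

```lua
-- Per-frame wall-clock times (seconds), taken from the runs in this post.
local per_frame = {
  lua    = 8.091,
  c      = 2.483,
  luajit = 0.849,
}

-- Assumption: the 17-minute total was measured with plain Lua, and the
-- total run time scales linearly with the per-frame time.
local BASELINE_MINUTES = 17

-- Scale the baseline total by the ratio of per-frame times.
local function estimate_minutes(seconds_per_frame)
  return BASELINE_MINUTES * seconds_per_frame / per_frame.lua
end

for _, name in ipairs({ "lua", "c", "luajit" }) do
  print(string.format("%-6s %5.1f minutes", name, estimate_minutes(per_frame[name])))
end
```

Under those assumptions the C version lands at roughly 5.2 minutes and the LuaJIT version at roughly 1.8 minutes.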
On reflection, it makes sense that LuaJIT will outperform C in this case. It's heavily CPU bound, and the main function:
function mainloop(A,B,C,D)
  local pix = {}
  local xn = 0.5
  local yn = 0.5

  for count = 1 , MAX do
    local xn1 = ((A * yn) + B) * xn * (1.0 - xn)
    local yn1 = ((C * xn) + D) * yn * (1.0 - yn)
    xn = xn1
    yn = yn1

    if xn < 0 then return MAX-1 end
    if xn >= 1 then return MAX-1 end
    if yn < 0 then return MAX-1 end
    if yn >= 1 then return MAX-1 end

    local ix = math.floor(xn * DIM)
    local iy = math.floor(yn * DIM)
    local f = iy * DIM + ix -- Lua doesn't really do N-dimensional arrays

    if pix[f] then return count-1 end
    pix[f] = true
  end
end
can be recompiled per call to take advantage of the parameters. For instance, when A is 0 and B is 1, the first expression then becomes:
local xn1 = xn * (1.0 - xn)
and given that I'm doing 32,768 iterations of this (oh, did I fail to mention that? Yes, I'm doing 27,768 more iterations than the code did twenty-some years ago), this does save quite a bit of time.
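To make the simplification concrete, here is a small sketch that writes the reduced expression out by hand and checks it against the general one. It says nothing about what LuaJIT actually generates; it only shows that with A = 0 and B = 1 the factor `((A * yn) + B)` is exactly 1, so the first update reduces to `xn * (1.0 - xn)`. The function names and the sample points are hypothetical, picked only for illustration.

```lua
-- General form of the first update, as written in mainloop.
local function step_general(A, B, xn, yn)
  return ((A * yn) + B) * xn * (1.0 - xn)
end

-- Hand-specialized form for A = 0 and B = 1: the factor ((A * yn) + B)
-- is exactly 1, so the multiply and add involving yn disappear.
local function step_specialized(xn)
  return xn * (1.0 - xn)
end

-- Hypothetical sample points inside the [0, 1) range used by mainloop.
local samples = { { 0.5, 0.5 }, { 0.25, 0.75 }, { 0.9, 0.1 } }

local all_equal = true
for _, p in ipairs(samples) do
  local xn, yn = p[1], p[2]
  if step_general(0, 1, xn, yn) ~= step_specialized(xn) then
    all_equal = false
  end
end

print(all_equal)
```

Two floating-point operations saved per iteration is not much on its own, but it is multiplied by every iteration of every call.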
Update on August 5th, 2013
Oops, I made a slight mistake …