Wednesday, October 14, 2009
A mass of stupid benchmarks. You have been warned.
A few weeks ago Wlofie and I were talking about computer languages and I mentioned the language I wrote in college (it was more than the simple DSL shown there), so I cleaned up the code so it would compile (most of it was written in 1993; last date of modification was March 5th, 1995) and just because I was curious, I decided to run a stupid benchmark on it.
What's more stupid than just adding up the first 1,000,000 integers?
: main 0 0 begin swap over + swap 1 + dup 1000000 < while repeat drop "%d#n" string print ; main 0 exit
(Oh, did I mention the language was based off Forth?)
Running that code on my modern 2.6GHz machine takes a full 2.1 seconds (and that's after I removed a huge bottle neck in the runtime engine). The equivalent C code:
#include <stdio.h> #include <stdlib.h> int main(int argc,char *argv[]) { int i; int total; for (i = 0 , total = 0 ; i < 1000000 ; i++) total += i; printf("%d\n",total); return EXIT_SUCCESS; }
takes a mere 0.004 seconds to run.
Okay, that's expected—it's C. But what about Perl? 0.47 seconds.
Heh. A hundred times slower.
Hey! What about Python? It's not as fast as Perl, but it comes in at around 0.58 seconds.
I'm not even going to attempt a comparrison between my language and Ruby—it'd be too depressing to lose out to a known slow language.
But Wlofie did mention one other language he was playing around with—Lua. Okay, how fast is Lua? Same benchmark as above: 0.15 seconds.
Wow.
It blew Perl right out of the water.
This required a bit more investigation. And a better test than just summing up the first 1,000,000 integers. I settled upon the Jumble program. It involves I/O and has one non-trivial operation (sorting the letters in a word). And on my current system, the C version can check 483,523 words in 0.18 seconds (the timings I gave back in 2008 were for my older, 120MHz development machine).
The Lua code is straightforward:
#!/usr/local/bin/lua WORDS = "/usr/share/dict/words" -- ****************************************************** function sortword(word) local t = {} for i=1,string.len(word) do t[i] = string.byte(word,i) end table.sort(t) return string.char(unpack(t)) end -- ******************************************************** if #arg < 1 then io.stderr:write("usage: %s word\n",arg[0]) os.exit(1) end word = sortword(string.upper(arg[1])) dict = io.open(WORDS,"r") if dict == nil then io.stderr:write("Houston, we have a problem ... \n") os.exit(1) end for line in dict:lines() do dictsort = sortword(string.upper(line)) if dictsort == word then print(line) end end dict:close() os.exit(0)
sortword()
has quite a bit of work to do—it has to go
through the string, placing each letter into an array, sort the array, and
then convert the array back into a string. Unlike C, which treats strings
as an array of characters (and thus can be sorted without the data
gymnastics), Lua (and pretty much all other modern scripting languages)
treat strings at a single unit.
So it takes 7.9 seconds for Lua to run through the same 483,523 words.
Ouch. And I think it's pretty obvious where the time is being spent—in
sortword()
, pulling apart strings, sorting arrays, making new
strings. But Lua can be easily extended using C. I rewrite
sortword()
in C:
#include <stdlib.h> #include <string.h> #include <lua.h> #include <lauxlib.h> #include <lualib.h> /*******************************************/ static int cmp(const void *l,const void *r) { int cl; int cr; cl = *(const char *)l; cr = *(const char *)r; return cl - cr; } /************************************/ static int sortword(lua_State *L) { const char *word; size_t len; word = luaL_checkstring(L,1); len = strlen(word); char buffer[len]; memcpy(buffer,word,len); buffer[len] = '\0'; qsort(buffer,len,1,cmp); lua_pushlstring(L,buffer,len); return 1; } /***********************************/ int luaopen_sortword(lua_State *L) { lua_register(L,"sortword",sortword); return 0; }
And now it only takes 2.2 seconds. Not a bad return upon investment.
Now we turn to Perl:
#!/usr/bin/perl use strict; #*************************** sub sortword { my $str = shift; my @t = unpack("(C)*",$str); @t = sort(@t); return pack("(C)*",@t); } #******************************* if (scalar(@ARGV) == 0) { printf(STDERR "usage: %s word\n",$0); exit(1); } my $word = sortword(uc($ARGV[0])); open(DICT,"/usr/share/dict/words") or die("Houston, we have a problem ... \n"); while(my $line = <DICT>) { chop $line; my $dictsort = sortword(uc($line)); if ($dictsort eq $word) { print $line . "\n"; } } close(DICT); exit(0);
Again, we need to pull apart the string into an array, sort that, and recreate the string. And when we run this lovely bit of Perl code it only takes 10.5 seconds.
Wow.
And here I thought Perl was the text crunching Master of the Universe.
Guess not.
Well, the Perl version of sortword()
could be tightened up a
bit …
sub sortword { return pack("(C)*",sort(unpack("(C)*",shift))); }
That takes us down to 8.6 seconds—nearly two full seconds to shuffle
data between variables. Perl isn't looking all that good. Well, there is
the function overhead, and given that we got sortword()
down to
one line, maybe if we inline it we'll get an improvement.
And we do—it's not 7.2 seconds. We finally beat unoptimized Lua code, but now with a more cryptic Perl version.
Way to go, Perl!
But given that Lua was easy to extend with C, can we do the same with Perl? A bit of searching lead me to the Inline::C Perl Module. Ten minutes to download and install (much better than I expected) and if anything, it was easier to embed C in Perl than in Lua:
#!/usr/bin/perl use Inline C; use strict; # same code as above, minus the sortword() subroutine exit(0); __END__ __C__ static int cmp(const void *l,const void *r) { int cl; int cr; cl = *(const char *)l; cr = *(const char *)r; return cl - cr; } char *sortword(char *word) { size_t len; len = strlen(word); qsort(word,len,1,cmp); return word; }
And we're down to 2.1 seconds. Just a bit quicker than the Lua/C version.
Well, that's after the 4.1 second initial run where it dumped a bunch of directories and files (lovely that).
Lua appears to be a bit faster than Perl, or at the very least, as fast as Perl.
And everything (even, quite possibly, Ruby) is faster than my own homebrewed language … sigh.