Thursday, November 29, 2007
More stupid multithreaded benchmarks
The other interesting thing to think about: at what point do we beat the same algorithm in C, and how hard would it be to parallelise the algorithm in C with
pthreads
…
Use those extra cores and beat C today! (Parallel Haskell redux)
Since I'm doing stupid benchmarks
anyway, why not this? I didn't use pthreads
though—I haven't
used that particular API in about a decade, and even then, I wasn't thrilled
with it.
Nope. Instead I used a Linux specific call—clone()
, wrote a
small function to wait for all the threads that were created, and basically
spent about five minutes making a
parallelized C-version of the Fibonacci sequence.
Granted, I had to do the parallelization explicitly, but it wasn't that
difficult to do.
The hard part was finding a quad-core box to test this on. Fortunately, I know we have one at The Company (shhh—don't tell Smirk) and that's where I spent most of my time on this—locating our single quad-core machine I could test this on.
But find it I did. And yes, the program does run faster the more cores that are thrown at it, but it's not a linear speed up:
# cores | Runtime |
---|---|
1 | 0m31.468s |
2 | 0m19.521s |
4 | 0m17.234s |
It's interesting that it appears to be bottoming out rather quickly (and these are average times over a few runs).