Wednesday, January 18, 2006
LaBrea ad nauseam
Just because I was curious, I ran the Perl script that parses the LaBrea logs on the 1.1G logfile from the other day. I knew it would take at least four hours or so to run.
I then created a modified version of ltpstat to parse this file (it's the output from syslogd, hence some slight parsing modifications were required) and ran it at the same time.
What I found interesting (during the actual run of both programs) is that my program had a larger virtual memory footprint than the Perl version (easily seven to eight times larger), but the resident set size (the amount of physical memory actually in use, the rest being either not physically allocated or shoved off into swap space) of my program was half that of the Perl script. In retrospect this was expected: Perl was growing the data as it was being generated whereas my program allocates the whole thing at once, but my program has less overhead in keeping track of said data.
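To illustrate the virtual-versus-resident distinction, here is a minimal Python sketch. It is not how ltpstat works internally; it assumes a Linux system (it reads /proc/self/statm), and the names vm_and_rss and demo, the 64 MiB pool size and the one-eighth usage figure are all made up for the illustration. It reserves a whole pool at once, then touches only part of it: the virtual size jumps by the full pool immediately, while the resident set grows only with the pages actually written.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")  # page size in bytes


def vm_and_rss():
    """Return (virtual size, resident set size) in bytes.

    Linux only: reads the first two fields of /proc/self/statm,
    which are counted in pages.
    """
    with open("/proc/self/statm") as f:
        size_pages, resident_pages = f.read().split()[:2]
    return int(size_pages) * PAGE, int(resident_pages) * PAGE


def demo(pool_bytes=64 * 1024 * 1024):
    """Reserve a pool up front, use one eighth of it, report the growth.

    Returns (virtual growth after reserving,
             resident growth after reserving,
             resident growth after touching one eighth of the pool).
    """
    vm0, rss0 = vm_and_rss()

    # Hypothetical stand-in for "allocate the whole thing at once":
    # an anonymous mapping of the full pool size.
    pool = mmap.mmap(-1, pool_bytes)
    vm1, rss1 = vm_and_rss()

    # Write one byte per page in the first eighth of the pool so the
    # kernel has to back those pages with physical memory.
    for offset in range(0, pool_bytes // 8, PAGE):
        pool[offset] = 1
    _, rss2 = vm_and_rss()

    pool.close()
    return vm1 - vm0, rss1 - rss0, rss2 - rss0


if __name__ == "__main__":
    vm_growth, rss_reserved, rss_touched = demo()
    print("virtual growth after reserving:", vm_growth // 1024, "KiB")
    print("resident growth after reserving:", rss_reserved // 1024, "KiB")
    print("resident growth after touching 1/8:", rss_touched // 1024, "KiB")
```

The kernel's resident counters are approximate, so the exact numbers will vary from run to run; the point is only that reserved-but-untouched memory counts toward the virtual size and not toward the resident set.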
And ltpstat is faster than the Perl script even if it isn't gathering the stats in real-time: 3 hours, 22 minutes and 24 seconds to run vs. 6 hours, 39 minutes and 49 seconds, almost half the time. I didn't see how much memory the Perl script was using just prior to finishing, but I can't see how it would be less than ltpstat.
The instance of ltpstat I started yesterday is still running:
Start: Tue Jan 17 14:55:59 2006
End: Wed Jan 18 14:55:59 2006
Running time: 1d
Pool-max: 1048576
Pool-num: 388929
Rec-max: 1048576
Rec-num: 388929
UIP-max: 1048576
UIP-num: 20298
Reported-bandwidth: 64 (Kb/sec)
Looks like I may break a million connections sooner than I expected.
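As a rough back-of-the-envelope check, the sketch below extrapolates linearly from the numbers in that output. It assumes that Rec-num is the count of connections recorded so far and that the rate stays constant, neither of which is guaranteed; the function name projected_time is made up for the illustration.

```python
from datetime import datetime, timedelta

# Values copied from the ltpstat output above.
start = datetime(2006, 1, 17, 14, 55, 59)
now = datetime(2006, 1, 18, 14, 55, 59)
rec_num = 388929      # assumed to be connections recorded so far
rec_max = 1048576     # size of the preallocated pool


def projected_time(target, count, start, now):
    """Linearly extrapolate the moment the running count reaches target."""
    rate = count / (now - start).total_seconds()  # records per second
    return start + timedelta(seconds=target / rate)


rate_per_second = rec_num / (now - start).total_seconds()
million = projected_time(1_000_000, rec_num, start, now)
pool_full = projected_time(rec_max, rec_num, start, now)

if __name__ == "__main__":
    print("rate: %.2f records/second" % rate_per_second)
    print("one million records around:", million)
    print("pool full around:", pool_full)
```

At roughly four and a half records per second, the million mark lands a bit over two and a half days after the start, with the 1048576-entry pool filling a few hours after that.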