Wednesday, August 21, 2019
“Nobody Expects the Surprising Profile Results!”
It still surprises me that the results of profiling can be so surprising.
Today I profiled Lua code as Lua. It was less work than expected and all it took was about 30 lines of Lua code. For now, I'm just recording the file name, function name (if available—not all Lua functions have names) and the line number as that's all that's really needed.
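A minimal sketch of what such a profiler might look like, assuming a count hook installed with debug.sethook() (the post does not show the actual code). The names counts, sample, start, stop and report are hypothetical, and the key format is only chosen to resemble the file/function/line column in the tables in this post:

```lua
-- Hypothetical sampling profiler; every name here is illustrative.
local counts = {}

-- Called by the Lua VM each time the instruction count elapses.
local function sample()
  -- Level 2 is the function that was running when the hook fired
  -- (level 0 is getinfo itself, level 1 is this hook).
  local info = debug.getinfo(2, "Sln")
  if info then
    -- info.name can be nil, since not every Lua function has a name.
    local key = string.format("%s:%s:%d",
      info.source, info.name or "", info.currentline)
    counts[key] = (counts[key] or 0) + 1
  end
end

-- Sample once every `interval` VM instructions.
local function start(interval)
  debug.sethook(sample, "", interval)
end

-- Remove the hook.
local function stop()
  debug.sethook()
end

-- Print the `limit` most frequently sampled locations, highest count first.
local function report(limit)
  local rows = {}
  for key, count in pairs(counts) do
    rows[#rows + 1] = { key = key, count = count }
  end
  table.sort(rows, function(a, b) return a.count > b.count end)
  limit = limit or #rows
  for i = 1, math.min(limit, #rows) do
    print(rows[i].count, rows[i].key)
  end
  return rows
end
```

Calling start() with a smaller interval samples more often, which gives a finer-grained profile at the cost of more overhead.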
But as I was writing the code to profile the code, I wasn't expecting any real results from profiling “Project: Sippy-Cup.” The code is really just:
- get packet
- parse packet
- validate SIP message
- acknowledge SIP message
- get relevant data from SIP message
- query “Project: Lumbergh” (business logic)
- wait for results
- send results in SIP message
- wait for SIP acknowledgement
- done
I was expecting a fairly uniform profile result, and if pressed, maybe a blip for awaiting results from “Project: Lumbergh” as that could take a bit. What I did not expect was this:
count | file/function/line |
---|---|
21755 | @third_party/LPeg-Parsers/ip-text.lua::44 |
6000 | @XXXXXXXXXXXXXXXXXXXXXXXXXX:send_query:339 |
2409 | @XXXXXXXXXXXXXXXXXXXX:XXXXXXXXX:128 |
After that, the results tend to flatten out. And yes, send_query() makes sense, but ip-text.lua? Three times more than the #2 spot? This line of code?
local n = tonumber(capture,16)
That's the hot spot? Wait? I'm using IPv6 for the regression test? When did that happen? Wait? I'm surprised by that as well? What is going on here?
Okay, breathe.
Okay.
I decided to do another run, this time at a finer grain (about 1/10 the previous profiling interval), to see what happens.
count | file/function/line |
---|---|
133186 | @third_party/LPeg-Parsers/ip-text.lua::44 |
29683 | @third_party/LPeg-Parsers/ip-text.lua::46 |
21910 | @third_party/LPeg-Parsers/ip-text.lua::45 |
19749 | @XXXXXXXXXXXXXXXXXXXXXXXXXXX:XXXXXXXXXXXXX:279 |
And the results flatten out after that. So the hot spot of “Project: Sippy-Cup” appears to be this bit of code:
local h16 = Cmt(HEXDIG^1,function(_,position,capture)
  local n = tonumber(capture,16)
  if n < 65536 then
    return position
  end
end)
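To make clear what that match-time capture checks, here is a rough equivalent in plain Lua with no LPeg involved. The function name is_h16 is hypothetical and this is only a sketch of the same test: a run of hex digits whose value fits in 16 bits, as required for each group of an IPv6 address.

```lua
-- Hypothetical stand-in for the h16 check, standard library only.
local function is_h16(text)
  -- Must be one or more hex digits and nothing else.
  if not text:match("^%x+$") then
    return false
  end
  -- Same numeric test as the hot line: the value must fit in 16 bits.
  return tonumber(text, 16) < 65536
end

print(is_h16("ffff"))   -- a valid 16-bit group
print(is_h16("10000"))  -- one past the 16-bit limit
```

A check like this runs for every hex group of every address parsed, which is consistent with that line collecting so many samples.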
send_query() doesn't show up until the 26th spot, but since it's finer grained, it does show up multiple times, just at different lines.
So … yeah.
I have to think on this.
Done with the profiling for now
After some more profiling work I've come to the conclusion that yes, the hot spot is in ip-text.lua, and that after that function the profile is quite flat. The difference between ip-text.lua and the number two spot isn't quite as bad as I initially thought, although it took some post-processing to lump all the function calls together to determine that (required because Lua can't always know the “name” of a function, but with the line numbers they can be reconciled). It's only called about twice as much as the next most used function instead of the nearly 4½ times it appeared to be earlier.
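The lumping step could look something like the sketch below. It assumes each sample also recorded the line where its function was defined (the linedefined field that debug.getinfo() returns with the "S" option), so that samples from a named and an unnamed view of the same function share one key. The sample records and the by_function name are made up for illustration and are not the real profile data.

```lua
-- Hypothetical per-line samples; all values are placeholders.
local samples = {
  { source = "@a.lua", name = "parse", linedefined = 10, line = 12, count = 5 },
  { source = "@a.lua", name = nil,     linedefined = 10, line = 13, count = 3 },
  { source = "@b.lua", name = "send",  linedefined = 40, line = 41, count = 4 },
}

-- Merge per-line counts into per-function totals, keyed by where the
-- function was defined rather than by its (possibly missing) name.
local function by_function(list)
  local totals = {}
  for _, s in ipairs(list) do
    local key = s.source .. ":" .. s.linedefined
    local entry = totals[key]
    if not entry then
      entry = { key = key, name = s.name, count = 0 }
      totals[key] = entry
    end
    -- Keep the first name seen for this function, if any.
    entry.name = entry.name or s.name
    entry.count = entry.count + s.count
  end
  return totals
end

local totals = by_function(samples)
for key, entry in pairs(totals) do
  print(entry.count, key, entry.name or "(unnamed)")
end
```

Keying on file plus definition line sidesteps the missing-name problem, since every Lua function has a definition line even when it has no name.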
As far as profiling “Project: Sippy-Cup” is concerned, I think I'm about as far as I can go at this time. I did improve the performance with some minor code changes and any more improvement will take significant resources. So I'm calling it good enough for now.