The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Tuesday, August 20, 2019

Profiles of Lua code

I started “Project: Sippy-Cup” six years ago as a “proof-of-concept” and it ended up in production (without my knowledge and to my dismay at the time) about a year or two afterwards. The code was written to be correct, not fast. And for the past four or five years its been running, performance has never been a real issue. But that's changing, as the projected traffic levels shoot past the “oh my” and into the “oh my God” territory.

“Project: Sippy-Cup” processes a lot of SIP messages, which are text based, so there's a lot of text processing. I use LPEG for ease of writing parsers, but it's not necessarily as fast as it could be.

There are two issues with LPEG—it has infinite look-ahead, and ordered choices. So the code that checks for the SIP method:

method = lpeg.P"ACK"
       + lpeg.P"BYE"
       + lpeg.P"CANCEL"
       + lpeg.P"INFO"
       + lpeg.P"INVITE"
       + lpeg.P"MESSAGE"
       + lpeg.P"NOTIFY"
       + lpeg.P"OPTIONS"
       + lpeg.P"PRACK"
       + lpeg.P"PUBLISH"
       + lpeg.P"REFER"
       + lpeg.P"REGISTER"
       + lpeg.P"SUBSCRIBE"
       + lpeg.P"UPDATE"
       + (lpeg.R("AZ","az","09") + lpeg.S"-.!%*_+`'~")^1

will first compare the input to “ACK”; if it doesn't match, it then backtracks and tries comparing the input to “BYE”, and so on down the list until it gets the last rule which is a “catch-all” rule. It would be easy to reorder the list so that the checks are “most-likely” to “least-likely,” but really the entire list could be removed leaving just the catch-all:

method = (lpeg.R("AZ","az","09") + lpeg.S"-.!%*_+`'~")^1

I have the same issue with SIP headers—there are 100 headers that are “parsed” (for various values of “parsed”) but I only really look at a dozen headers—the rest just slow things down and can be passed by a generic parsing rule. The full headers were added during the “proof-of-concept” stage since I wasn't sure at the time which headers would be critical and which ones wouldn't, and I've never gone back and cleaned up the code.

Another aspect is the sheer number of validity checks the code makes on the incoming SIP message. Many of the checks don't really have any effect on the processing due to managerial mandate at the time, so they could go (I wanted strict checking that bailed on any error; my manager at the time did not want such strictness—no need to guess who won, but I still track parsing irregularities).

So while I feel these are two areas where the code could be made faster, I don't know if that's where the time is spent, and so it's time to profile the code.

The issue now is that the system profiler will profile the code as C, not as Lua. I don't need to profile the code to know the Lua VM gets called all the time. What I need to know is what Lua code is called all the time. But it can't hurt to try the system profiler, right? And given that the regression test has over 12,000 test cases, we should get some good profiling information, right?

Original Profile—Each sample counts as 0.01 seconds.
% time cumulative seconds self seconds calls self ms/call total ms/call name
13.32 3.47 3.47       match
12.74 6.79 3.32       luaV_execute
9.31 9.22 2.43       luaS_newlstr
6.83 11.00 1.78       luaD_precall
5.31 12.38 1.39       luaH_getstr
3.38 13.26 0.88       luaD_poscall
2.57 13.93 0.67       index2adr
2.19 14.50 0.57       luaV_gettable

Not bad at all. The function match() is the LPEG execution engine, which matches my intial thoughts on the code. It wasn't much work to remove extraneous SIP headers I don't bother with, and to simplify the method parsing (see above). Re-profile the code and:

Modified LPEG—Each sample counts as 0.01 seconds.
% time cumulative seconds self seconds calls self ms/call total ms/call name
14.25 3.67 3.67       luaV_execute
11.22 6.56 2.89       luaS_newlstr
10.49 9.26 2.70       match
6.33 10.89 1.63       luaD_precall
5.20 12.23 1.34       luaH_getstr
2.76 12.94 0.71       index2adr
2.58 13.61 0.67       luaD_poscall
2.41 14.23 0.62       luaV_gettable

match() drops from first to third place, so that's good. And a load test done by the QA engineer showed an easy 25% increase is message processing.

But that's really as far as I can go with profiling. I did a run where I removed most of the validation checks (but only after I saw none of them were triggered over the past 30 days) and didn't see much of a speed improvement. So I really need to profile the Lua code as Lua code and not as C.

That's going to take some work.

Obligatory Picture

[“I am NOT a number, I am … a Q-CODE!”]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site:, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.