Monday, June 03, 2024
Just a simple matter of replacing a slow Lua function with a faster C function
I spent the past few days rewriting some Lua code into C. While I find LPEG to be convenient, it is not necessarily fast. Normally this isn't an issue but in this case, I was calling LPEG for each character in a blog post.
Fortunately,
it was fairly straightforward porting the code to C.
The code goes through the text one Unicode codepoint at a time.
If it's a whitespace character or a hyphen,
I mark the current position as a possible breakpoint for the text;
otherwise I ignore combining characters
(they don't count towards the line length).
Then,
when I go past the number of characters I want for a line,
I copy out the string from the beginning of the “line” to the marked breakpoint
(and if there isn't a breakpoint,
there is no good place to break the line so I will break the line at the line length—not much else to do),
then mark the beginning of the next line and continue until the end of the text.
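Here is a rough sketch of that loop in C. It is not the actual code: the names `wrap_sketch` and `line_span` and the three tiny classifier functions are hypothetical stand-ins, and it assumes the text has already been decoded into an array of codepoints. It also assumes a break at a space drops the space while a break at a hyphen keeps the hyphen on the line, which the description above does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately tiny classifiers (a real table covers far more). */
static int is_space(uint32_t c)     { return c == ' ' || c == '\t' || c == '\n'; }
static int is_hyphen(uint32_t c)    { return c == '-'; }
static int is_combining(uint32_t c) { return c >= 0x0300 && c <= 0x036F; }

/* One output line: text[start] .. text[start + len - 1]. */
typedef struct { size_t start; size_t len; } line_span;

/*
 * Split text[0..n) into lines of at most `width` counted characters.
 * Stores up to `max` spans in `out` and returns the number of lines found
 * (which may be larger than `max`).
 */
static size_t wrap_sketch(const uint32_t *text, size_t n, size_t width,
                          line_span *out, size_t max)
{
  size_t count    = 0;  /* lines found so far                        */
  size_t start    = 0;  /* first codepoint of the current line       */
  size_t cols     = 0;  /* counted characters on the current line    */
  size_t brk_end  = 0;  /* where the line ends if broken at the mark */
  size_t brk_next = 0;  /* where the next line would begin           */
  int    have_brk = 0;  /* is there a marked breakpoint?             */
  size_t i        = 0;

  if (width == 0) return 0;

  while (i < n) {
    uint32_t c = text[i];

    if (is_combining(c)) { i++; continue; }   /* not counted toward the length */
    if (is_space(c) && i == start) {          /* skip space at the start of a line */
      start = ++i;
      continue;
    }
    if (is_space(c)) {                        /* possible break before the space */
      brk_end = i; brk_next = i + 1; have_brk = 1;
    }

    if (++cols > width) {                     /* went past the line length */
      size_t end = have_brk ? brk_end : i;    /* no mark: break right here */
      if (count < max) { out[count].start = start; out[count].len = end - start; }
      count++;
      start = have_brk ? brk_next : i;
      i = start; cols = 0; have_brk = 0;      /* rescan from the new line start */
      continue;
    }

    if (is_hyphen(c)) {                       /* possible break after the hyphen */
      brk_end = brk_next = i + 1; have_brk = 1;
    }
    i++;
  }

  if (start < n) {                            /* whatever is left is the last line */
    if (count < max) { out[count].start = start; out[count].len = n - start; }
    count++;
  }
  return count;
}
```

Rescanning from the start of each new line keeps the sketch short at the cost of looking at some codepoints twice; a tighter version would carry the count over instead.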
The hardest part was figuring out how to classify each character I needed. In the end, I pull out each Unicode codepoint from UTF-8 and look through an array to classify the codepoint as whitespace, a hyphen or a combining character; if it isn't in the table, it's just a normal character.
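The classification step might look something like the following sketch. Everything here is an assumption made for illustration: the names `utf8_decode`, `classify` and `class_table` are hypothetical, the table is only a small subset of what a real one would need, and the binary search is just one way to "look through an array" (a linear scan would work too).

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { CH_NORMAL, CH_SPACE, CH_HYPHEN, CH_COMBINING } char_class;

typedef struct { uint32_t first; uint32_t last; char_class cls; } class_range;

/* Illustrative subset only, sorted by codepoint with no overlapping ranges. */
static const class_range class_table[] = {
  { 0x0009, 0x000D, CH_SPACE     },  /* tab, LF, VT, FF, CR              */
  { 0x0020, 0x0020, CH_SPACE     },  /* SPACE                            */
  { 0x002D, 0x002D, CH_HYPHEN    },  /* HYPHEN-MINUS                     */
  { 0x0300, 0x036F, CH_COMBINING },  /* Combining Diacritical Marks      */
  { 0x2000, 0x200A, CH_SPACE     },  /* EN QUAD .. HAIR SPACE            */
  { 0x2010, 0x2010, CH_HYPHEN    },  /* HYPHEN                           */
  { 0x20D0, 0x20FF, CH_COMBINING },  /* Combining Marks for Symbols      */
  { 0x3000, 0x3000, CH_SPACE     },  /* IDEOGRAPHIC SPACE                */
};

/*
 * Decode one codepoint from s[0..len). Returns the number of bytes used,
 * or 0 if the sequence is truncated or malformed. This sketch does not
 * reject overlong encodings or surrogates.
 */
static size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
  size_t   need;  /* number of continuation bytes expected */
  uint32_t c;

  if (len == 0) return 0;
  if (s[0] < 0x80)                { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { c = s[0] & 0x1F; need = 1; }
  else if ((s[0] & 0xF0) == 0xE0) { c = s[0] & 0x0F; need = 2; }
  else if ((s[0] & 0xF8) == 0xF0) { c = s[0] & 0x07; need = 3; }
  else return 0;

  if (len < need + 1) return 0;
  for (size_t i = 1; i <= need; i++) {
    if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
    c = (c << 6) | (s[i] & 0x3F);
  }
  *cp = c;
  return need + 1;
}

/* Binary search over the ranges; anything not listed is a normal character. */
static char_class classify(uint32_t cp)
{
  size_t lo = 0;
  size_t hi = sizeof class_table / sizeof class_table[0];

  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (cp < class_table[mid].first)      hi = mid;
    else if (cp > class_table[mid].last)  lo = mid + 1;
    else return class_table[mid].cls;
  }
  return CH_NORMAL;
}
```

Storing ranges rather than single codepoints keeps the table short, since combining characters in particular tend to come in contiguous blocks.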
As a sanity check, I reran the original profiling test:
gopher (original) | 457035 |
gopher (new) | 18246 |
gemini (just because) | 22661 |
Much better.
And most of the 457,035 lines of code being executed are now hidden behind C.
Now to make sure the code is actually faster,
I profiled the new wrapt() function:
    local wraptx = wrapt

    local function wrapt(...)
      local start = rdtsc()
      local res   = wraptx(...)
      local stop  = rdtsc()
      syslog('notice',"wrapt()=%d",stop-start)
      return res
    end
with the decently sized request I used before (each line is a call to wrapt()):
Lua code | C code |
43330 | 11810 |
43440 | 12000 |
45300 | 12220 |
48100 | 12020 |
48680 | 13690 |
49260 | 12650 |
54140 | 12270 |
54650 | 12460 |
58530 | 12130 |
59760 | 14180 |
61100 | 15480 |
65440 | 14970 |
67920 | 15810 |
68750 | 15310 |
69920 | 17170 |
69960 | 17780 |
70740 | 16510 |
75640 | 16750 |
78870 | 19170 |
83200 | 18190 |
87090 | 17290 |
89070 | 23360 |
91440 | 19560 |
101800 | 21520 |
102460 | 21060 |
103790 | 22180 |
106000 | 22400 |
106010 | 21870 |
112960 | 21160 |
115300 | 21870 |
115980 | 23130 |
118690 | 24980 |
122550 | 23960 |
122710 | 24550 |
127610 | 23830 |
129580 | 24670 |
130120 | 24930 |
140580 | 26570 |
141930 | 25210 |
157640 | 27050 |
168000 | 32250 |
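To put a number on the difference, each row can be turned into a speedup by dividing the Lua figure by the C figure. The sketch below does that for a handful of rows copied from the table; the names `samples` and `speedup_range` are made up for the example, and it assumes the two columns on a row come from the same call to wrapt().

```c
#include <stddef.h>

/* A handful of rows copied from the table: { Lua code, C code }. */
static const double samples[][2] = {
  {  43330.0, 11810.0 },
  {  48680.0, 13690.0 },
  {  89070.0, 23360.0 },
  { 157640.0, 27050.0 },
  { 168000.0, 32250.0 },
};

/* Find the smallest and largest Lua/C ratio among n rows (n must be >= 1). */
static void speedup_range(const double rows[][2], size_t n,
                          double *lo, double *hi)
{
  *lo = *hi = rows[0][0] / rows[0][1];
  for (size_t i = 1; i < n; i++) {
    double r = rows[i][0] / rows[i][1];
    if (r < *lo) *lo = r;
    if (r > *hi) *hi = r;
  }
}
```

For these five rows the ratios run from about 3.6 to about 5.8.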
Excellent! The new code is roughly three and a half to nearly six times faster. Now to just sit back and see how the new code fares over the next few days.