The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, June 03, 2024

Just a simple matter of replacing a slow Lua function with a faster C function

I spent the past few days rewriting some Lua code into C. While I find LPEG to be convenient, it is not necessarily fast. Normally this isn't an issue but in this case, I was calling LPEG for each character in a blog post.

Fortunately, it was fairly straightforward to port the code to C. The code goes through the text one Unicode codepoint at a time. If the codepoint is a whitespace character or a hyphen, I mark the current position as a possible breakpoint for the text; if it's a combining character, I ignore it (combining characters don't count towards the line length). Then, when I go past the number of characters I want for a line, I copy out the string from the beginning of the “line” to the marked breakpoint. If there isn't a breakpoint, there is no good place to break the line, so I break it at the line length (not much else to do). I then mark the beginning of the next line and continue until the end of the text.
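
As a rough illustration of that loop, here is a minimal C sketch. It is not the actual code: the names wrap_ascii() and emit_line() are made up for this example, it treats every byte as one character (plain ASCII only, so there are no combining characters to skip), and it assumes the whitespace at a break is dropped while a hyphen stays at the end of its line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append s[0..n) and a newline to out.
   Returns 0 if the line does not fit in the buffer. */
static int emit_line(char *out, size_t outsz, size_t *used,
                     const char *s, size_t n)
{
  if (*used + n + 2 > outsz) return 0;  /* line + '\n' + NUL */
  memcpy(out + *used, s, n);
  *used += n;
  out[(*used)++] = '\n';
  out[*used] = '\0';
  return 1;
}

/* Hypothetical sketch: wrap ASCII text to at most width characters per
   line, writing newline-terminated lines to out.  Returns 1 on success,
   0 if width is zero or the output buffer is too small. */
int wrap_ascii(const char *text, size_t width, char *out, size_t outsz)
{
  size_t len   = strlen(text);
  size_t start = 0;  /* first character of the current line       */
  size_t cols  = 0;  /* characters on the current line so far     */
  size_t bend  = 0;  /* where the line ends if we break here      */
  size_t bnext = 0;  /* where the next line begins after a break  */
  int    have  = 0;  /* has a breakpoint been marked on this line? */
  size_t used  = 0;

  if (width == 0 || outsz == 0) return 0;
  out[0] = '\0';

  for (size_t i = 0; i < len; i++) {
    char c     = text[i];
    int  space = (c == ' ' || c == '\t' || c == '\n');

    /* Assumption: whitespace at the start of a line is dropped. */
    if (space && cols == 0) { start = i + 1; continue; }

    /* Whitespace is a breakpoint; the whitespace itself is not kept. */
    if (space) { bend = i; bnext = i + 1; have = 1; }

    cols++;
    if (cols > width) {
      /* No breakpoint: nothing else to do but cut at the line length. */
      if (!have) { bend = i; bnext = i; }
      if (!emit_line(out, outsz, &used, text + start, bend - start))
        return 0;
      start = bnext;
      cols  = i + 1 - start;  /* characters already on the new line */
      have  = 0;
    }

    /* A hyphen is a breakpoint too, but it stays on its line. */
    if (c == '-') { bend = i + 1; bnext = i + 1; have = 1; }
  }

  /* Whatever is left over becomes the final line. */
  if (start < len &&
      !emit_line(out, outsz, &used, text + start, len - start))
    return 0;
  return 1;
}
```

Calling wrap_ascii("the quick brown fox", 10, buf, sizeof buf) fills buf with two lines, "the quick" and "brown fox". Only the most recent breakpoint is tracked, which is all the description above requires.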

The hardest part was figuring out how to classify each character I needed. In the end, I pull out each Unicode codepoint from the UTF-8 text and look it up in an array to classify it as whitespace, a hyphen or a combining character; if it isn't in the table, it's just a normal character.
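
To make the idea concrete, here is a small C sketch of decoding a codepoint and classifying it with a table lookup. Everything in it is an assumption made for illustration: the names (utf8_decode(), classify(), count_columns(), the cclass enum), the use of a sorted range table with a binary search, and the table contents, which are only a tiny subset of the relevant Unicode ranges. The decoder is also simplified and does not reject overlong encodings or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

enum cclass { CC_NORMAL, CC_SPACE, CC_HYPHEN, CC_COMBINING };

struct crange { uint32_t first, last; enum cclass cls; };

/* Illustrative subset only, sorted by codepoint for the binary search. */
static const struct crange ranges[] = {
  { 0x0009, 0x000D, CC_SPACE     },  /* tab, LF, VT, FF, CR          */
  { 0x0020, 0x0020, CC_SPACE     },  /* space                        */
  { 0x002D, 0x002D, CC_HYPHEN    },  /* hyphen-minus                 */
  { 0x0300, 0x036F, CC_COMBINING },  /* Combining Diacritical Marks  */
  { 0x2000, 0x200A, CC_SPACE     },  /* en quad through hair space   */
  { 0x2010, 0x2010, CC_HYPHEN    },  /* hyphen                       */
  { 0x3000, 0x3000, CC_SPACE     },  /* ideographic space            */
};

/* Decode one codepoint from s (n bytes available).  Returns the number
   of bytes consumed, or 0 if the sequence is malformed or truncated. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
  size_t   need;
  uint32_t c;

  if (n == 0) return 0;
  c = s[0];
  if (c < 0x80)                { *cp = c; return 1; }
  else if ((c & 0xE0) == 0xC0) { c &= 0x1F; need = 1; }
  else if ((c & 0xF0) == 0xE0) { c &= 0x0F; need = 2; }
  else if ((c & 0xF8) == 0xF0) { c &= 0x07; need = 3; }
  else return 0;               /* stray continuation or invalid byte */

  if (n < need + 1) return 0;
  for (size_t i = 1; i <= need; i++) {
    if ((s[i] & 0xC0) != 0x80) return 0;  /* not a continuation byte */
    c = (c << 6) | (s[i] & 0x3F);
  }
  *cp = c;
  return need + 1;
}

/* Look the codepoint up in the table; anything not listed is normal. */
enum cclass classify(uint32_t cp)
{
  size_t lo = 0, hi = sizeof ranges / sizeof ranges[0];
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if      (cp < ranges[mid].first) hi = mid;
    else if (cp > ranges[mid].last)  lo = mid + 1;
    else return ranges[mid].cls;
  }
  return CC_NORMAL;
}

/* Count the characters that contribute to the line length:
   every codepoint except combining characters. */
size_t count_columns(const char *s, size_t n)
{
  const unsigned char *p = (const unsigned char *)s;
  size_t cols = 0;
  while (n > 0) {
    uint32_t cp;
    size_t used = utf8_decode(p, n, &cp);
    if (used == 0) { used = 1; cp = 0xFFFD; }  /* bad byte: one character */
    if (classify(cp) != CC_COMBINING) cols++;
    p += used;
    n -= used;
  }
  return cols;
}
```

With this table, the three bytes of "e" followed by U+0301 (a combining acute accent) count as a single column. A real table would need many more ranges.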

As a sanity check, I reran the original profiling test:

Lines of Lua code executed to serve a request
gopher (original)        457035
gopher (new)              18246
gemini (just because)     22661

Much better. And most of the 457,035 lines of code being executed are now hidden behind C. Now to make sure the code is actually faster, I profiled the new wrapt() function:

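local wraptx = wrapt              -- keep a reference to the real wrapt()
local function wrapt(...)         -- shadow it with a timing wrapper
  local start = rdtsc()           -- reading before the call
  local res   = wraptx(...)
  local stop  = rdtsc()           -- reading after the call
  syslog('notice',"wrapt()=%d",stop-start) -- log the difference
  return res
end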

with the decently sized request I used before (each line is a call to wrapt()):

Runtime of each wrapt() call, as measured with rdtsc() (lower is better)
Lua code C code
43330 11810
43440 12000
45300 12220
48100 12020
48680 13690
49260 12650
54140 12270
54650 12460
58530 12130
59760 14180
61100 15480
65440 14970
67920 15810
68750 15310
69920 17170
69960 17780
70740 16510
75640 16750
78870 19170
83200 18190
87090 17290
89070 23360
91440 19560
101800 21520
102460 21060
103790 22180
106000 22400
106010 21870
112960 21160
115300 21870
115980 23130
118690 24980
122550 23960
122710 24550
127610 23830
129580 24670
130120 24930
140580 26570
141930 25210
157640 27050
168000 32250

Excellent! The new code is roughly three and a half to nearly six times faster. Now to just sit back and see how the new code fares over the next few days.
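
For anyone wanting to check the speedup against the table, here is a small C sketch that computes the smallest and largest ratio of Lua time to C time. It assumes each row of the table pairs the Lua and C timings for the same call to wrapt(); the function name speedup_range() is made up for this example.

```c
#include <stddef.h>

/* Timings copied from the table above, one entry per row. */
static const double lua_ticks[] = {
   43330,  43440,  45300,  48100,  48680,  49260,  54140,  54650,
   58530,  59760,  61100,  65440,  67920,  68750,  69920,  69960,
   70740,  75640,  78870,  83200,  87090,  89070,  91440, 101800,
  102460, 103790, 106000, 106010, 112960, 115300, 115980, 118690,
  122550, 122710, 127610, 129580, 130120, 140580, 141930, 157640,
  168000
};
static const double c_ticks[] = {
  11810, 12000, 12220, 12020, 13690, 12650, 12270, 12460,
  12130, 14180, 15480, 14970, 15810, 15310, 17170, 17780,
  16510, 16750, 19170, 18190, 17290, 23360, 19560, 21520,
  21060, 22180, 22400, 21870, 21160, 21870, 23130, 24980,
  23960, 24550, 23830, 24670, 24930, 26570, 25210, 27050,
  32250
};

/* Store the smallest and largest Lua/C ratio over all rows. */
void speedup_range(double *min, double *max)
{
  size_t n = sizeof lua_ticks / sizeof lua_ticks[0];
  *min = *max = lua_ticks[0] / c_ticks[0];
  for (size_t i = 1; i < n; i++) {
    double r = lua_ticks[i] / c_ticks[i];
    if (r < *min) *min = r;
    if (r > *max) *max = r;
  }
}
```

Under that pairing assumption, the ratios run from about 3.56 (48680 versus 13690) to about 5.83 (157640 versus 27050).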

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2025 by Sean Conner. All Rights Reserved.