Monday, May 27, 2024
How does TLS use less CPU than plain TCP?
I have two services written in Lua: a gopher server and a Gemini server. They both serve roughly the same data (mainly my blog), and yet the gopher server accumulates more CPU time than the Gemini server, even though the Gemini server uses TLS and serves more requests. And not by a little bit either:
gopher | 17:26
Gemini |  0:45
So I started investigating the issue.
It wasn't TCP_NODELAY (via Lobsters), as latency wasn't the issue (but I disabled Nagle's algorithm anyway).
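For reference, disabling Nagle's algorithm just means setting the TCP_NODELAY socket option on each connection. Here's roughly what that looks like in Lua, using LuaSocket purely for illustration:

-- Minimal sketch: disable Nagle's algorithm (TCP_NODELAY) on an
-- accepted connection, using LuaSocket for illustration.
local socket = require "socket"

local server = assert(socket.bind("*",70))   -- gopher port, needs privileges
local client = assert(server:accept())

-- Small writes now go out immediately instead of waiting in the
-- kernel to be coalesced.
assert(client:setoption("tcp-nodelay",true))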
Looking further into the issue, it seemed to be one of buffering. The gopher code was not buffering any data over TCP; furthermore, it was issuing tons of small writes. My thinking here was: of course! The TCP code was making tons of system calls, whereas the TLS code (thanks to the library I'm using) must be doing the buffering for me.
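To make the problem concrete, here's the shape of the change, sketched with a generic socket object (the names are made up for illustration, not lifted from my code):

-- "client" is a connected socket with a send() method (LuaSocket-style);
-- "entries" is a list of already-formatted gopher selector lines.

local function send_menu_unbuffered(client,entries)
  -- one send() per line: cheap-looking code, one system call per line
  for _,line in ipairs(entries) do
    client:send(line .. "\r\n")
  end
  client:send(".\r\n")
end

local function send_menu_buffered(client,entries)
  -- build the whole response in memory, then hand it to the kernel once
  client:send(table.concat(entries,"\r\n") .. "\r\n.\r\n")
end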
So I added buffering to the gopher server, and now, after about 12 hours (I restarted both servers at the same time), I have:
gopher | 2:25
Gemini | 2:13
I … I don't know what to make of this. Obviously, things have improved for gopher, but did I somehow make Gemini worse? (I did change some low-level code that both TCP and TLS use; I use full buffering for TCP and no buffering for TLS.) Or is the load just more evenly spread?
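By “full buffering” I mean the same thing file:setvbuf("full") means for a Lua file handle: writes pile up in memory and only hit the socket once the buffer fills (or is explicitly flushed). A sketch of the idea, not my actual code:

-- Sketch of a fully-buffered writer over a socket: writes accumulate
-- until the buffer reaches a threshold, then go out as one send().
local DEFAULT_SIZE = 8192

local function new_writer(client,size)
  size = size or DEFAULT_SIZE
  local chunks,len = {},0
  local writer = {}

  function writer:flush()
    if #chunks > 0 then
      client:send(table.concat(chunks))
      chunks,len = {},0
    end
  end

  function writer:write(data)
    chunks[#chunks+1] = data
    len = len + #data
    if len >= size then
      self:flush()
    end
  end

  return writer
end

The caller writes as often as it likes and calls flush() once at the end of the response.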
It's clear that gopher is still accumulating more CPU time, just not as bad as it was. Perhaps more buffering is required? I'll leave this for a few days and see what happens.