The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Saturday, September 05, 2015

Of course it's slower, but I didn't expect it to be quite that bad

Time for another useless µbenchmark! This time, the overhead of trapping integer overflow!

So, inspired by this post about trapping integer overflow, I thought it might be interesting to see how bad the overhead is of using the x86 instruction INTO to catch integer overflow. To do this, I'm using DynASM to generate code from an expression that uses INTO after every operation. There are other ways of doing this, but the simplist way is to use INTO. I'm also using 16-bit operations, as the numbers involved (between -32,768 and 32,767) are reasonable (for a human) to deal with (unlike the 32-bit range -2,147,483,648 to 2147483647 or the insane 64-bit range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807).

The one surprising result was that Linux treats the INTO trap as a segfault! Even requesting additional information (passing the SA_SIGINFO flag with sigaction()) doesn't tell you anything. But that in itself tells you it's not a real segfault, as a real segfault will report a memory mapping error. Personally, I would have expected a floating point fault, even though it's not a floating point operation, because on Linux, integer division by 0 results in floating point fault (and oddly enough, a floating point division by 0 results in ∞ but no fault)!

But, aside from that, some results. I basically run the expression one million times and simply record how long it takes. The first is just setting a variable to a fixed value (and the “- 0” bit is there just to ensure an overflow check is included):

x = 1 - 0
overflowtimeexpression result
true0.0090800001
false0.0068200001

Okay, not terribly bad. But how about a longer expression? (and remember, the expresssion isn't optimized)

x = 1 + 1 + 1 + 1 + 1 + 1 * 100 / 13
overflowtimeexpression result
true0.07952800046
false0.03012500046

Yikes! (But this is also including the function call overhead). For the curious, the last example compiled down to:

	xor	eax,eax
	mov	ax,1
	add	ax,1
	into
	add	ax,1
	into
	add	ax,1
	into
	add	ax,1
	into
	add	ax,1
	into
	imul	100
	into
	mov	bx,13
	cwd
	idiv	bx
	into
	mov	[$0804f50E],ax
	ret

The non-overflow version just had the INTO instructions missing—otherwise it was the same code.

I think what's surprising the most here is that the INTO instruction just checks the overflow flag and only if set does it cause a trap. The timings I have (and I'll admit, the figures I have are old and for the 80486) show that INTO only has a three-cycle overhead if not taken. I'm guessing things are worse with the newer multipipelined multiscalar multiprocessor monstrosities we use these days.

Next I'll have to try using the JO instruction and see how well that fares.

Obligatory Picture

[The future's so bright, I gotta wear shades]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.