Monday, September 07, 2015
Some more useless µbenchmarks checking for integer overflow
Using the INTO instruction to check for overflow was dog slow,
so what about using JO?
Will that be slow?
The results speak for themselves (reminder—the expressions are compiled and run 1,000,000 times):
Even though the code using the
JO instruction is longer than either version:
        xor     eax,eax
        mov     ax,0x1
        add     ax,1
        jo      error
        add     ax,1
        jo      error
        add     ax,1
        jo      error
        add     ax,1
        jo      error
        add     ax,1
        jo      error
        imul    100
        jo      error
        mov     bx,13
        cwd
        idiv    bx
        jo      error
        mov     [$0804F58E],ax
        ret
error:  into
        ret
it performed about the same as the non-overflow checking version.
That's probably because the overflow branches are never taken here, so the branch predictor gets them right and they add very little overhead.
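
For comparison, here is a rough C analogue of the check-after-every-operation pattern in the listing above. It is only a sketch: the helper names (add16_checked, mul16_checked, div16_checked, eval_checked) are made up for illustration, and the checks are done by widening to 32 bits rather than by reading the CPU's overflow flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit add that reports failure instead of wrapping. */
bool add16_checked(int16_t a, int16_t b, int16_t *out)
{
    int32_t wide = (int32_t)a + (int32_t)b;   /* cannot overflow in 32 bits */
    if (wide < INT16_MIN || wide > INT16_MAX)
        return false;                         /* the "jo error" case */
    *out = (int16_t)wide;
    return true;
}

/* Hypothetical helper: 16-bit multiply with the same convention. */
bool mul16_checked(int16_t a, int16_t b, int16_t *out)
{
    int32_t wide = (int32_t)a * (int32_t)b;
    if (wide < INT16_MIN || wide > INT16_MAX)
        return false;
    *out = (int16_t)wide;
    return true;
}

/* Hypothetical helper: 16-bit divide; rejects the two inputs that cannot
   produce a valid 16-bit quotient. */
bool div16_checked(int16_t a, int16_t b, int16_t *out)
{
    if (b == 0 || (a == INT16_MIN && b == -1))
        return false;
    *out = (int16_t)(a / b);
    return true;
}

/* Same arithmetic as the listing: start at 1, add 1 five times,
   multiply by 100, divide by 13, checking after every step. */
bool eval_checked(int16_t *result)
{
    int16_t v = 1;
    for (int i = 0; i < 5; i++)
        if (!add16_checked(v, 1, &v))
            return false;
    if (!mul16_checked(v, 100, &v))
        return false;
    if (!div16_checked(v, 13, &v))
        return false;
    *result = v;
    return true;
}
```

Every check is a branch that is not taken on the normal path, which is the same shape as the JO version.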
One thing to notice
is that were a compiler to go down this path and check explicitly for overflow,
not only would the code be larger,
but overall it might be a bit slower than normal as there are commonly used optimizations
(at least on the x86 architecture)
that cannot be used.
For example, a cheap way to multiply a value by 5 is to skip the
IMUL instruction and instead do
LEA EAX,[EAX*4 + EAX],
but LEA does not set the overflow flag.
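
A small C sketch of the difference, with made-up function names: mul5_shift_add does the same arithmetic as the LEA form (the result silently wraps modulo 2^32), while mul5_checked stands in for a multiply followed by an overflow test.

```c
#include <stdbool.h>
#include <stdint.h>

/* The arithmetic LEA EAX,[EAX*4 + EAX] performs: x*4 + x, wrapping
   modulo 2^32, with nothing to tell the caller that it wrapped. */
uint32_t mul5_shift_add(uint32_t x)
{
    return (x << 2) + x;
}

/* Hypothetical checked version: signed multiply by 5 that reports
   overflow instead of wrapping. */
bool mul5_checked(int32_t x, int32_t *out)
{
    int64_t wide = (int64_t)x * 5;            /* cannot overflow in 64 bits */
    if (wide < INT32_MIN || wide > INT32_MAX)
        return false;
    *out = (int32_t)wide;
    return true;
}
```

For 0x40000000 the shift-and-add form quietly returns a wrapped value, while the checked form reports the overflow.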
Likewise, INC EAX instructions in a row can be smaller (and just as fast) as doing the equivalent ADD,
but while the
INC instruction does set the overflow flag,
you have to check the flag after each
INC or you could miss an actual overflow,
which defeats the purpose of using
INC to generate smaller code.
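
The INC problem can be modeled in a few lines of C. This is a toy model, not real flag access: struct reg16 and the function names are invented for the example, and the only assumption is that each INC overwrites the overflow flag, setting it when a 16-bit value steps from 0x7FFF to 0x8000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a 16-bit register and the overflow flag. */
struct reg16 {
    uint16_t value;
    bool     of;      /* overwritten by every INC, like the real flag */
};

/* Model of INC: the flag reflects only this single step. */
void inc16(struct reg16 *r)
{
    r->of = (r->value == 0x7FFF);             /* 0x7FFF -> 0x8000 overflows */
    r->value = (uint16_t)(r->value + 1);
}

/* Check the flag once, after the last INC in the run. */
bool overflow_seen_check_last(uint16_t start, int count)
{
    struct reg16 r = { start, false };
    for (int i = 0; i < count; i++)
        inc16(&r);
    return r.of;
}

/* Check the flag after every INC in the run. */
bool overflow_seen_check_each(uint16_t start, int count)
{
    struct reg16 r = { start, false };
    for (int i = 0; i < count; i++) {
        inc16(&r);
        if (r.of)
            return true;
    }
    return false;
}
```

Starting at 0x7FFF and incrementing twice, the check-last version reports no overflow because the second INC cleared the flag; only the check-each version catches it.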
And one more thing before I go,
and this is about DynASM—it's not stated anywhere,
but if you use local labels,
you have to call
or else the program will crash.
I found this out the hard way.