Friday, May 01, 2020
It seems that C's bit-fields are more of a pessimization than an optimization
A few days ago, maybe a few weeks ago, I don't know, the days are all just merging together into one long undifferentiated timey wimey blob, but I digress. I had the odd thought that maybe, perhaps, I could make my Motorola 6809 emulator faster by using a bit-field for the condition codes instead of the individual booleans I'm using now. The thought was to get rid of the somewhat expensive routines to convert the flags to a byte value and back. I haven't used bit-fields all that much in 30 years of C programming as they tend to be implementation dependent:
- Whether a “plain” int bit-field is treated as a signed int bit-field or as an unsigned int bit-field (6.7.2, 6.7.2.1).
- Allowable bit-field types other than _Bool, signed int, and unsigned int (6.7.2.1).
- Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).
- The order of allocation of bit-fields within a unit (6.7.2.1).
- The alignment of non-bit-field members of structures (6.7.2.1). This should present no problem unless binary data written by one implementation is read by another.
- The integer type compatible with each enumerated type (6.7.2.2).
C99 standard, annex J.3.9
But I could at least see how gcc deals with them and see if there is indeed a performance increase.
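For context, the conversion routines mentioned above might look roughly like the following. This is only a sketch, not the emulator's actual code: the names cc_flags, cc_to_byte, byte_to_cc and cc_roundtrip_check are made up for illustration, and the bit positions follow the 6809's condition code register layout (E F H I N Z V C, from bit 7 down to bit 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the emulator's flags: one bool per condition code. */
struct cc_flags
{
  bool e, f, h, i, n, z, v, c;
};

/* Pack the eight flags into one byte; E is bit 7, C is bit 0. */
uint8_t cc_to_byte(const struct cc_flags *cc)
{
  return (uint8_t)(
             (cc->e << 7) | (cc->f << 6) | (cc->h << 5) | (cc->i << 4)
           | (cc->n << 3) | (cc->z << 2) | (cc->v << 1) | (cc->c << 0)
         );
}

/* Unpack a byte back into the eight flags. */
void byte_to_cc(struct cc_flags *cc, uint8_t b)
{
  cc->e = (b & 0x80) != 0;
  cc->f = (b & 0x40) != 0;
  cc->h = (b & 0x20) != 0;
  cc->i = (b & 0x10) != 0;
  cc->n = (b & 0x08) != 0;
  cc->z = (b & 0x04) != 0;
  cc->v = (b & 0x02) != 0;
  cc->c = (b & 0x01) != 0;
}

/* Sanity check: every byte value survives a round trip. */
void cc_roundtrip_check(void)
{
  for (unsigned b = 0; b < 256; b++)
  {
    struct cc_flags cc;
    byte_to_cc(&cc, (uint8_t)b);
    assert(cc_to_byte(&cc) == b);
  }
}
```

Each conversion touches all eight flags, which is the cost a bit-field overlaid on a byte would, in principle, avoid.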
I converted the definition of the condition codes from:
    struct
    {
      bool e;
      bool f;
      bool h;
      bool i;
      bool n;
      bool z;
      bool v;
      bool c;
    } cc;
to
    union
    {
      /*---------------------------------------------------
      ; I determined this ordering of the bits empirically.
      ;----------------------------------------------------*/

      struct
      {
        bool c : 1;
        bool v : 1;
        bool z : 1;
        bool n : 1;
        bool i : 1;
        bool h : 1;
        bool f : 1;
        bool e : 1;
      } f;
      mc6809byte__t b;
    } cc;
(Yes, by using a union I'm inviting “unspecified behavior”; the C99 standard lists “[t]he value of a union member other than the last one stored into (6.2.6.1)” as unspecified. But at least gcc does the sane thing in this case.)
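Since the bit order is implementation-defined, it seems prudent to check it rather than trust it. The sketch below probes the layout at run time by setting one flag at a time and looking at the byte. The names cc_union, cc_layout_matches_6809 and cc_require_layout are hypothetical, and the uint8_t stand-in for mc6809byte__t is an assumption, not the emulator's actual definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t mc6809byte__t;  /* assumption: the emulator's byte type is 8 bits */

union cc_union
{
  struct
  {
    bool c : 1;
    bool v : 1;
    bool z : 1;
    bool n : 1;
    bool i : 1;
    bool h : 1;
    bool f : 1;
    bool e : 1;
  } f;
  mc6809byte__t b;
};

/* Set each flag on its own through the bit-field and read the byte back.
   Returns true only if every flag lands on its 6809 bit position. */
bool cc_layout_matches_6809(void)
{
  union cc_union cc;

  cc.b = 0; cc.f.c = true; if (cc.b != 0x01) return false;
  cc.b = 0; cc.f.v = true; if (cc.b != 0x02) return false;
  cc.b = 0; cc.f.z = true; if (cc.b != 0x04) return false;
  cc.b = 0; cc.f.n = true; if (cc.b != 0x08) return false;
  cc.b = 0; cc.f.i = true; if (cc.b != 0x10) return false;
  cc.b = 0; cc.f.h = true; if (cc.b != 0x20) return false;
  cc.b = 0; cc.f.f = true; if (cc.b != 0x40) return false;
  cc.b = 0; cc.f.e = true; if (cc.b != 0x80) return false;
  return true;
}

/* Call once at startup so a compiler with a different layout fails loudly. */
void cc_require_layout(void)
{
  assert(cc_layout_matches_6809());
}
```

A check like this does not make the code portable; it only turns a silent mismatch on some other compiler into an immediate failure.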
The code thus modified, I ran some tests to see the speedup and the results were rather disappointing: it was slower using bit-fields than with 8 separate boolean values.
My guess is that the code used to set and check bits, especially in an expression like (cpu->cc.f.n == cpu->cc.f.v) && !cpu->cc.f.z, was larger (and thus slower) than just using plain bool for each field.
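One way to test that guess would be to compile the same condition in both styles and compare the assembly from gcc -O2 -S. The sketch below sets that up. The struct and function names are hypothetical, and a third variant using plain masks on the packed byte is thrown in as one more point of comparison, assuming the 6809 bit positions (N is bit 3, Z is bit 2, V is bit 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two flag representations. */
struct cc_bools { bool e, f, h, i, n, z, v, c; };

struct cc_bits
{
  bool c : 1; bool v : 1; bool z : 1; bool n : 1;
  bool i : 1; bool h : 1; bool f : 1; bool e : 1;
};

/* The condition from the text (the 6809's BGT test), on separate bools. */
bool gt_bools(const struct cc_bools *cc)
{
  return (cc->n == cc->v) && !cc->z;
}

/* The same condition on bit-fields. */
bool gt_bits(const struct cc_bits *cc)
{
  return (cc->n == cc->v) && !cc->z;
}

/* The same condition on a packed byte, using explicit masks. */
bool gt_mask(uint8_t b)
{
  bool n = (b & 0x08) != 0;
  bool v = (b & 0x02) != 0;
  return (n == v) && (b & 0x04) == 0;
}

/* Make sure all three variants agree before comparing their code. */
void gt_variants_agree(void)
{
  for (int k = 0; k < 8; k++)
  {
    bool n = (k & 4) != 0, z = (k & 2) != 0, v = (k & 1) != 0;
    struct cc_bools a = { .n = n, .z = z, .v = v };
    struct cc_bits  b = { .n = n, .z = z, .v = v };
    uint8_t         m = (uint8_t)((n << 3) | (z << 2) | (v << 1));

    assert(gt_bools(&a) == gt_bits(&b));
    assert(gt_bools(&a) == gt_mask(m));
  }
}
```

Reading the three functions side by side in the generated assembly shows what the compiler actually emits for each; no particular outcome is claimed here, and it may well differ between compiler versions and targets.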
So the upshot—by changing the code to use an implementation-defined detail and invoking unspecified behavior, thus making the resulting program less portable, I was able to slow the program down enough to see it wasn't worth the effort.
Perfect.