The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Sunday, October 26, 2014

The Painless Guide to CRC isn't quite painless

So there's this “A Painless Guide to CRC Error Detection Algorithms,” which apparently tells you everything you wanted to know about CRCs, but were afraid to ask (or didn't really want to ask). The concept itself seems to be easy: a CRC is just the remainder of a particular type of division. The numerator is the data (treated as one long number) and the denominator is the “polynomial” of the CRC (even though it's a value, the specification for a given CRC is a polynomial equation; go figure). The steps of the algorithm are very simple:

1. Load the remainder with zero bits.
2. Augment the message by appending zero bits (equal to the size of the remainder) to the end of it.
3. While (more message bits)
   1. Shift the remainder left by one bit, reading the next bit of the augmented message into bit position 0 of the remainder.
   2. If a 1 bit popped out of the remainder during the previous step, XOR the result with the polynomial.
4. We now have the remainder.
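
To make the steps concrete, here is a minimal sketch that runs them on a tiny 4-bit example. The function name `crc_bits`, the message `1101011011` and the polynomial `10011` are chosen purely for illustration; the message is passed as a string of '0' and '1' characters so each bit is visible.

```c
#include <stdint.h>

/* Bit-at-a-time division following the numbered steps, for a register
 * of `width` bits (1..31).  `poly` is the polynomial without its top
 * bit, and `bits` is the already augmented message as '0'/'1' chars.
 * (Hypothetical helper, for illustration only.) */
static uint32_t crc_bits(const char *bits, unsigned width, uint32_t poly)
{
    uint32_t mask = (1u << width) - 1;  /* keeps the register `width` bits wide */
    uint32_t top  = 1u << (width - 1);  /* the bit that "pops out" on a shift  */
    uint32_t rem  = 0;                  /* step 1: remainder starts at zero    */

    for (const char *b = bits; *b != '\0'; b++)     /* step 3 */
    {
        uint32_t popped = rem & top;
        rem = ((rem << 1) | (uint32_t)(*b == '1')) & mask;
        if (popped)
            rem ^= poly;
    }
    return rem;                         /* step 4 */
}
```

Calling `crc_bits("11010110110000", 4, 0x3)` (the message with four zero bits appended per step 2, and `10011` with its top bit dropped) returns `0xE`, that is, a remainder of `1110`.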

The size of the remainder is the size of our CRC—16 for a 16-bit CRC, 32 for a 32-bit CRC, etc. And from that description, the code pretty much follows:

```c
/**********************************************************
* Straightforward CRC implementation
* Section 8 of the Guide
***********************************************************/

#include <stddef.h>
#include <stdint.h>

uint32_t crcsim(uint32_t crc,const uint8_t *p,size_t size)
{
  size_t   i;
  int      c;
  int      xor;

  while(size--)
  {
    c = *p++;
    for (i = 0 ; i < 8 ; i++)
    {
      xor = crc & 0x80000000uL;  /* flag if we need to xor the polynomial */
      crc = crc << 1;            /* shift the crc register */
      crc |= (c & 0x80) ? 1 : 0; /* and shift in the data bit */

      if (xor)
        crc ^= 0x04C11DB7uL;

      c = c << 1; /* shift data bit */
    }
  }

  return crc;
}
```

Although, this is the rare case where it's easier to write it in assembly than it is in C, since we have access to the carry bit when shifting, which makes it easier to check:

```
crc32a          push    ebp             ; boilerplate code for any
                mov     ebp,esp         ; C callable code
                push    ebx
                push    esi
                push    edi

                mov     edx,[ebp + 8]   ; load passed in CRC
                mov     esi,[ebp + 12]  ; ptr to data
                mov     ecx,[ebp + 16]  ; count of data
                mov     edi,0x04C11DB7  ; CRC polynomial

.main           lodsb                   ; read next data byte
                mov     bl,8            ; # of bits to go through

.10             shl     al,1            ; shift next data bit into carry
                rcl     edx,1           ; rotate it into the CRC; high bit pops into carry
                jnc     .15             ; if it wasn't set, skip
                xor     edx,edi         ; xoring the CRC polynomial

.15             dec     bl              ; more bits?
                jnz     .10             ; if so, keep going
                loop    .main           ; do next byte

                mov     eax,edx         ; return CRC
                pop     edi             ; restore saved registers
                pop     esi
                pop     ebx
                pop     ebp
                ret
```

(Note: the core of the algorithm in assembly is four instructions. The C compiler didn't do quite as good a job with the six lines of C comprising the core of the algorithm; its output is about six times the object code.)

What is not shown in the code above (either version) is the augmentation step of adding additional 0-bits to the message; that's left up to the caller of these routines.
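
As a rough sketch of what such a caller might look like: because the routine takes the running CRC as its first argument, the 32 zero bits can be fed in with a second call rather than by copying the message. The wrapper name `crc_augmented` is hypothetical, and the loop from `crcsim()` is repeated only so the sketch compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bit-at-a-time loop as crcsim() above, repeated so this
 * sketch is self-contained. */
static uint32_t crcsim(uint32_t crc, const uint8_t *p, size_t size)
{
    while (size--)
    {
        unsigned c = *p++;
        for (int i = 0; i < 8; i++)
        {
            uint32_t popped = crc & 0x80000000uL;   /* bit about to pop out */
            crc = (crc << 1) | ((c & 0x80) ? 1u : 0u);
            if (popped)
                crc ^= 0x04C11DB7uL;
            c <<= 1;
        }
    }
    return crc;
}

/* Hypothetical caller: run the message through, then continue with
 * four zero bytes (32 zero bits) to perform the augmentation step. */
static uint32_t crc_augmented(const uint8_t *msg, size_t len)
{
    static const uint8_t zeros[4] = { 0, 0, 0, 0 };
    uint32_t crc = crcsim(0, msg, len);
    return crcsim(crc, zeros, sizeof zeros);
}
```

With the nine ASCII bytes of “123456789” as input, `crc_augmented()` returns `0x89A1897F`.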

Both of these routines give the same result. Other implementations I did based upon the Guide also give the same results. And they're consistent with the results of the reference code given in the Guide.

```c
/*******************************************************************
* Table implementation
* Section 9 of the Guide
* Requires trailing zero bits.
********************************************************************/

const uint32_t crctable[256] = { ... };

uint32_t crc32z(uint32_t crc,const uint8_t *p,size_t size)
{
  while(size--)
    crc = ((crc << 8) | *p++) ^ crctable[crc >> 24];

  crc = (crc << 8) ^ crctable[crc >> 24];
  crc = (crc << 8) ^ crctable[crc >> 24];
  crc = (crc << 8) ^ crctable[crc >> 24];
  crc = (crc << 8) ^ crctable[crc >> 24];
  return crc;
}

/*****************************************************************
* Table implementation part deux---improved
* Section 10 of the Guide
* Does not need trailing zero bits.
******************************************************************/

uint32_t crc32c(uint32_t crc,const uint8_t *p,size_t size)
{
  while(size--)
    crc = (crc << 8) ^ crctable[ (crc >> 24) ^ *p++];

  return crc;
}

/*****************************************************************
* Parameterized Model, from code given in the Guide
* Section 15 of the Guide
*
* This is a reference implementation provided by the Guide to be
* used for testing various CRC implementations.
******************************************************************/

uint32_t crcmod(uint32_t crc,const uint8_t *p,size_t size)
{
  cm_t ctx;

  ctx.cm_width = 32;
  ctx.cm_poly  = 0x04C11DB7uL;
  ctx.cm_init  = crc;
  ctx.cm_refin = false;
  ctx.cm_refot = false;
  ctx.cm_xorot = 0;

  cm_ini(&ctx);
  cm_blk(&ctx,(p_ubyte_)p,(ulong)size);
  return (uint32_t)cm_crc(&ctx);
}
```

CRC of “123456789” using different implementations

| Implementation | CRC result |
|---|---|
| `crcsim()` | `89A1897F` |
| `crc32z()` | `89A1897F` |
| `crc32c()` | `89A1897F` |
| `crcmod()` | `89A1897F` |
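
The contents of the lookup table are elided in the listing above. As a hedged sketch (not necessarily how the table used here was produced), a non-reflected table for the polynomial `0x04C11DB7` can be generated by running each possible byte value through the bit-at-a-time loop. The names `table`, `make_table` and `crc_table_driven` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t table[256];

/* Entry i is what eight rounds of the bit-at-a-time loop leave behind
 * when byte value i starts out in the top byte of the register. */
static void make_table(void)
{
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t r = i << 24;
        for (int bit = 0; bit < 8; bit++)
            r = (r & 0x80000000uL) ? (r << 1) ^ 0x04C11DB7uL : (r << 1);
        table[i] = r;
    }
}

/* Same shape as crc32c() above, but using the generated table. */
static uint32_t crc_table_driven(uint32_t crc, const uint8_t *p, size_t size)
{
    while (size--)
        crc = (crc << 8) ^ table[(crc >> 24) ^ *p++];
    return crc;
}
```

After `make_table()`, entry 1 is the polynomial itself (`0x04C11DB7`), and `crc_table_driven(0, …)` over “123456789” gives `0x89A1897F`, matching the results in the table above.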

So far so good. But that isn't the result from the standard CRC-32 implementation, which is used by Ethernet, ZIP, gzip, PNG and a few other standards. No, CRC-32 uses what the Guide calls a “reflected” table mode, which came about because hardware CRC implementations start with the least significant bit of the byte, whereas these algorithms start with the most significant bit of the byte.

Okay, so the bits are fed in backwards. That can be compensated for. Also, the standard CRC-32 algorithm mandates that the initial value of the remainder is all one bits, not zero bits. Easy to fix. And that the final remainder is to be exclusive-or'ed with all ones. Again, easy to do.

It all seems pretty straightforward. And while the Guide only goes over a table implementation of the “reflected” mode, it seems like it would be straightforward (excuse the pun) to do reflected versions of all the implementations done so far.

And since the `zlib` library uses the CRC-32, we can link that in as a baseline to compare results.

So, with that out of the way, the code:

```c
/**********************************************************
* Straightforward CRC implementation, using reflected bytes
* based on Section 9 of the Guide
***********************************************************/

uint32_t crcsimr(uint32_t crc,const uint8_t *p,size_t size)
{
  size_t   i;
  int      c;
  int      xor;

  crc = ~crc;
  while(size--)
  {
    c = *p++;
    for (i = 0 ; i < 8 ; i++)
    {
      xor = crc & 0x80000000uL;
      crc = crc << 1;
      crc |= (c & 0x01) ? 1 : 0;

      if (xor)
        crc ^= 0x04C11DB7uL;

      c = c >> 1;
    }
  }

  return crc;
}

/*******************************************************************
* Table implementation, using a reflected table
* based on Section 9 of the Guide
********************************************************************/

const uint32_t crctabler[256] = { ... };

uint32_t crc32rz(uint32_t crc,const uint8_t *p,size_t size)
{
  crc = ~crc;

  while(size--)
    crc = ((crc >> 8) | *p++) ^ crctabler[ crc & 0xFF ];

  crc = (crc >> 8) ^ crctabler[ crc & 0xFF ];
  crc = (crc >> 8) ^ crctabler[ crc & 0xFF ];
  crc = (crc >> 8) ^ crctabler[ crc & 0xFF ];
  crc = (crc >> 8) ^ crctabler[ crc & 0xFF ];

  return ~crc;
}

/*****************************************************************
* Table implementation part deux---improved using reflected table
* based on Section 10 of the Guide
******************************************************************/

uint32_t crc32r(uint32_t crc,const uint8_t *p,size_t size)
{
  crc = ~crc;

  while(size--)
    crc = (crc >> 8) ^ crctabler[ (crc & 0xFF) ^ *p++];

  return ~crc;
}

/*****************************************************************
* Parameterized Model, from code given in the Guide
* Section 15 of the Guide
*
* This is a reference implementation provided by the Guide to be
* used for testing various CRC implementations.
******************************************************************/

uint32_t crcmodr(uint32_t crc,const uint8_t *p,size_t size)
{
  cm_t ctx;

  ctx.cm_width = 32;
  ctx.cm_poly  = 0x04C11DB7uL;
  ctx.cm_init  = ~crc;
  ctx.cm_refin = true;
  ctx.cm_refot = true;
  ctx.cm_xorot = ~0;

  cm_ini(&ctx);
  cm_blk(&ctx,(p_ubyte_)p,(ulong)size);
  return (uint32_t)cm_crc(&ctx);
}
```

And the results:

CRC of “123456789” using different implementations, reflected

| Implementation | CRC result |
|---|---|
| `crcsimr()` | `AF296EBB` |
| `crc32rz()` | `717C74D2` |
| `crc32r()` | `CBF43926` |
| `crcmodr()` | `CBF43926` |
| `zlib.crc32()` | `CBF43926` |

… um … that was rather unexpected.

I didn't think the code I wrote for reflected CRCs was that unreasonable based upon the information in the Guide, but I guess I was wrong for some of them.
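
For comparison, here is a sketch of a bit-at-a-time reflected routine that does produce `CBF43926`. It is not derived from the Guide's augmented-message description; it is the commonly seen form that XORs each byte into the low end of the register, shifts right, and uses the bit-reversed polynomial `0xEDB88320` (the mirror image of `0x04C11DB7`). The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32, one bit at a time, no trailing zero bits needed.
 * The pre- and post-inversion mirror what crc32r() does above. */
static uint32_t crc32_reflected_bits(uint32_t crc, const uint8_t *p, size_t size)
{
    crc = ~crc;                         /* start from all ones (for crc == 0) */
    while (size--)
    {
        crc ^= *p++;                    /* byte goes into the low 8 bits */
        for (int i = 0; i < 8; i++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320uL : (crc >> 1);
    }
    return ~crc;                        /* final exclusive-or with all ones */
}
```

Called as `crc32_reflected_bits(0, …)` over “123456789”, this returns `0xCBF43926`, the same value `zlib` reports.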

Oh, and getting back to the non-reflected code—I didn't initialize the results properly, nor did I exclusive-or the results. Hopefully, I'll get `CBF43926` when I do that.

CRC of “123456789” using different implementations, non-reflected with proper initialization

| Implementation | CRC result |
|---|---|
| `crcsim()` | `C8C3A78F` |
| `crc32z()` | `C8C3A78F` |
| `crc32c()` | `FC891918` |
| `crcmod()` | `FC891918` |

Okay, now I'm horribly confused. There appears to be some missing information in “A Painless Guide to CRC Error Detection Algorithms.” The GNU Radio implementation of CRC-32 uses the non-reflective table implementation, and when I called that, I got back `FC891918`, so it's consistent with at least two of the CRC-32 non-reflected implementations. But I'm concerned that the routines that require additional zero bits aren't the same in this case. There has to be some subtle difference between the two in this case that I don't see, and isn't mentioned in the Guide at all.

I did find yet another comprehensive implementation of CRCs, Danjel McGougan's `universal_crc`, and every version of the non-reflected CRC-32 it generated (it generates either bit-oriented code, or several table-driven implementations based on tradeoffs between speed and memory usage) returned `FC891918` (even its own bit-oriented version, which isn't the same as the one described in the Guide).

Another thing I noticed by looking deeply into the abyss that is CRC, is that my first implementation of CRC-32 is flawed—I don't exclusive-or the results with all ones at the end. I suspect that the code I based mine on didn't bother with the exclusive-or when returning the CRC, but instead did that elsewhere in the codebase. It's not a bug per se, but according to Numerical Recipes in C:

Second, one can add (XOR) any M-bit constant K to the CRC before it is transmitted … This has the advantage of detecting another kind of error that the CRC would otherwise not find: deletion of an initial 1 bit in the message with spurious insertion of a 1 bit at the end of the block.

The result is that there's a type of corruption that I won't catch. This code was the basis for the CRC implementation in a few programs at work (oops) but again, I don't think it's an outright show-stopping bug.

At some point, I may go through some of this on paper, one bit at a time, to see what's going on math-wise with the reflected and non-reflected table implementations with non-0 initial values.
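
A possible starting point for that exercise, offered as a sketch and as one reading of the arithmetic rather than anything taken from the code above: in the augmented loop the initial register value gets shifted through 32 more bits than it does in the direct loop, so the two should only agree when the direct loop is started from the initial value pushed through 32 zero bits. With an initial value of zero that makes no difference, which would account for the matching results earlier. All three function names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY 0x04C11DB7uL

/* Guide-style loop: message bits enter at bit 0 of the register.  The
 * 32 trailing zero bits are supplied here instead of by the caller. */
static uint32_t crc_augmented(uint32_t crc, const uint8_t *p, size_t size)
{
    for (size_t n = 0; n < size + 4; n++)
    {
        unsigned c = (n < size) ? p[n] : 0;
        for (int i = 0; i < 8; i++, c <<= 1)
        {
            uint32_t popped = crc & 0x80000000uL;
            crc = (crc << 1) | ((c >> 7) & 1u);
            if (popped)
                crc ^= POLY;
        }
    }
    return crc;
}

/* Direct loop: each message byte is XORed into the top of the register. */
static uint32_t crc_direct(uint32_t crc, const uint8_t *p, size_t size)
{
    while (size--)
    {
        crc ^= (uint32_t)*p++ << 24;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x80000000uL) ? (crc << 1) ^ POLY : (crc << 1);
    }
    return crc;
}

/* Push an initial value through 32 zero bits of the direct loop. */
static uint32_t shift32(uint32_t init)
{
    for (int i = 0; i < 32; i++)
        init = (init & 0x80000000uL) ? (init << 1) ^ POLY : (init << 1);
    return init;
}
```

Under that reading, `crc_augmented(init, msg, len)` and `crc_direct(shift32(init), msg, len)` should return the same value for any `init`, while `crc_direct(0xFFFFFFFF, …)` exclusive-or'ed with all ones gives the `FC891918` seen above for “123456789”.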
