Tuesday, Debtember 19, 2023
The Gopher Situation, part III, The Search For Uptime
It's been over two weeks and the gopher server has been up and running for all that time. Yup, it was Unicode. Or rather, my inability to wrap Unicode properly.
A bit of background on compilers exploiting signed overflow
Why do compilers even bother with exploiting the undefinedness of signed overflow? And what are those mysterious cases where it helps?
A lot of people (myself included) are against transforms that aggressively exploit undefined behavior, but I think it's useful to know what compiler writers are accomplishing by this.
TL;DR: C doesn't work very well if int!=register width, but (for backwards compat) int is 32-bit on all major 64-bit targets, and this causes quite hairy problems for code generation and optimization in some fairly common cases. The signed overflow UB exploitation is an attempt to work around this.
Via Comment on “Bug in my code from compiler optimization [video] | Hacker News”, A bit of background on compilers exploiting signed overflow
A cautionary tale about compiler writers exploiting undefined behavior. I don't have much to add here, other than to spread a bit of awareness of why this happens.
Timing code from inside an assembler
Back in March, I wrote about some 6809 optimizations where I counted CPU cycles by hand. I came across that code the other day and thought to myself, my 6809 emulator counts cycles, and I've embedded it into my 6809 assembler—how hard could it be to time code in addition to testing it?
Turns out—not terribly hard.
I added an option to the .TRON directive to count cycles instead of printing code execution and have the .TROFF directive print the cycle count (indirectly, since the code isn't run until the end of the second pass of the assembler).
Then I wrote up a few tests:
            .test   "ROM-RAMx1-byte"
            ldx     #$8000
            .tron   timing
    r2r1    sta     $FFDE
            lda     ,x
            sta     $FFDF
            sta     ,x+
            cmpx    #$FF00
            bne     r2r1
            .troff
            rts
            .endtst

    ;*****************************************************************

            .test   "ROM-RAMx2-byte"
            ldx     #$8000
            .tron   timing
    r2r2    sta     $FFDE
            ldd     ,x
            sta     $FFDF
            std     ,x++
            cmpx    #$FF00
            bne     r2r2
            .troff
            rts
            .endtst

    ;*****************************************************************

            .test   "ROM-RAMx4-byte"
            ldx     #$8000
            .tron   timing
    r2r4    sta     $FFDE
            ldd     ,x
            ldu     2,x
            sta     $FFDF
            std     ,x++
            stu     ,x++
            cmpx    #$FF00
            bne     r2r4
            .troff
            rts
            .endtst

    ;*****************************************************************

            .test   "ROM-RAMx8-byte"
    savesp  equ     $0100
            orcc    #$50
            sts     savesp
            lds     #$FF00 - 8
            .tron   timing
    r2r8    sta     $FFDE
            puls    u,x,y,d
            sta     $FFDF
            pshs    u,x,y,d
            leas    -8,s
            cmps    #$8000 - 8
            bne     r2r8
            .troff
            lds     savesp
            andcc   #$AF
            rts
            .endtst
And upon running it:
    GenericUnixPrompt% a09 -ftest r2r.asm
    ROM-RAMx1-byte:13: cycles=877824
    ROM-RAMx2-byte:28: cycles=487680
    ROM-RAMx4-byte:45: cycles=357632
    ROM-RAMx8-byte:64: cycles=199136
The results match what I calculated by hand, so that's good. It also found a bug in the emulator—I had the wrong cycle count for one of the instructions. It's a bit scary how easy it has become to test 6809 assembly code now that I can do much of it when assembling the code.