The Boston Diaries

Tuesday, July 08, 2008

And I thought I was crazy for sticking with C …

Wow.

An XML parser written in x86 assembly language for Unix and Windows. This sounds like something I would have done about fifteen years ago when I programmed almost exclusively in assembly.

The code makes some interesting design choices (I mean, besides the obvious choice of language): it only parses XML in memory, so the entire document must be loaded into memory first. It also allocates memory in large blocks (the size in some of the examples is 16M) to avoid the overhead of calling malloc().
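To make the block-allocation idea concrete, here's a rough C sketch of how I imagine such a scheme working. This is not the parser's actual code (which is in assembly); the names arena_init(), arena_alloc() and arena_free() are hypothetical, and the 16-byte alignment is my own assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN 16  /* assumed alignment; not taken from the parser */

/* Hypothetical names throughout; this is not the parser's API. */
struct arena {
    unsigned char *base;  /* start of the one big block   */
    size_t         size;  /* total bytes in the block     */
    size_t         used;  /* bytes handed out so far      */
};

/* Grab one large block up front. Returns nonzero on success. */
int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hand out n bytes by bumping an offset; no malloc() call per request. */
void *arena_alloc(struct arena *a, size_t n)
{
    /* Round the request up to a multiple of ARENA_ALIGN. */
    size_t rounded = (n + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);

    if (rounded < n || rounded > a->size - a->used)
        return NULL;  /* arithmetic overflow, or the block is exhausted */

    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Everything is released at once by freeing the single block. */
void arena_free(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

The trade-off is that individual allocations can't be freed; the whole block goes away in one call, which suits a parser that builds a document and later discards it wholesale.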

It also doesn't support XML namespaces (nor does the author want to support XML namespaces, as namespaces “make parsing more complicated and slower”).

But still … wow.

A quick dip back into assembly with some curious results …

Speaking of assembly …

One of the instructions of the x86 architecture I've been curious about is ENTER. Oh, I know it's there to support higher-level languages like C and Pascal that use stack frames for local variables. It even supposedly supports nested function definitions (à la Pascal) using the second operand as a kind of “nesting level.”

But I've never seen an actual instance of ENTER used with a “nesting level” greater than 0. The only form I've ever seen used is

ENTER	n,0

Which is equivalent to

PUSH	EBP	; or BP if 16-bit code
MOV	EBP,ESP
SUB	ESP,n

(In fact, GCC generates that sequence because it's faster than ENTER n,0, and standard C doesn't allow nested functions to begin with.)

But being curious about what ENTER actually does, I decided to play around with it. I wrote some simple code:

		bits	32
		global	sub0
		extern	pmem

		section	.text

sub0		enter	8,0
		mov	eax,0DEADBEEFh
		mov	[ebp-4],eax
		mov	eax,0CAFEBABEh
		mov	[ebp-8],eax
		lea	ebx,[ebp+4]
		push	dword 0c0000001h
		call	sub1
		leave	
		ret

sub1		enter	8,1
		mov	eax,0DEADBEEFh
		mov	[ebp-4],eax
		mov	eax,0CAFEBABEh
		mov	[ebp-8],eax
		push	dword 0c0000002h
		call	sub2
		leave
		ret

sub2		enter	8,2
		mov	eax,0DEADBEEFh
		mov	[ebp-4],eax
		mov	eax,0CAFEBABEh
		mov	[ebp-8],eax
		push	dword 0c0000003h
		call	sub3
		leave
		ret

sub3		enter	8,3
		mov	eax,0DEADBEEFh
		mov	[ebp-4],eax
		mov	eax,0CAFEBABEh
		mov	[ebp-8],eax
		push	dword 0c0000004h
		call	sub4
		leave
		ret

sub4		enter	8,4
		mov	eax,0DEADBEEFh
		mov	[ebp-4],eax
		mov	eax,0CAFEBABEh
		mov	[ebp-8],eax
		push	dword 0
		push	dword 0
		push	ebx
		push	esp
		call	pmem
		add	esp,16
		leave
		ret

And the following C code:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

extern void sub0(void);

void pmem(unsigned long *pl,unsigned long *ph)
{
  assert(pl < ph);

  while(ph >= pl - 2)
  {
    printf("\t%08lX: %08lX\n",(unsigned long)ph,*ph);
    ph--;
  }  
}

int main(void)
{
  sub0();
  return EXIT_SUCCESS;
}

Nothing horribly complicated here. pmem() just dumps the stack, and the various sub*() routines create deeper nestings of stack activation records while creating enough space to store two four-byte values. The results though?

Curious (comments added by me after the run) …

BFFFFD1C: 0804853C		return addr to main()
BFFFFD18: BFFFFD20	stack frame sub0
BFFFFD14: DEADBEEF		local0
BFFFFD10: CAFEBABE		local1
BFFFFD0C: C0000001		marker for calling sub1
BFFFFD08: 08048591		return addr to sub0
BFFFFD04: BFFFFD18	stack frame sub1
BFFFFD00: DEADBEEF		local0
BFFFFCFC: CAFEBABE		local1
BFFFFCF8: 08049708		?
BFFFFCF4: C0000002		marker for calling sub2
BFFFFCF0: 080485B1		return addr to sub1
BFFFFCEC: BFFFFD04	stack frame sub2
BFFFFCE8: DEADBEEF		local0
BFFFFCE4: CAFEBABE		local1
BFFFFCE0: 00000002		?
BFFFFCDC: 400079D4		?
BFFFFCD8: C0000003		marker for calling sub3
BFFFFCD4: 080485D1		return addr to sub2
BFFFFCD0: BFFFFCEC	stack frame sub3
BFFFFCCC: DEADBEEF		local0
BFFFFCC8: CAFEBABE		local1
BFFFFCC4: BFFFFCD0		? sf3
BFFFFCC0: 4000F000		?
BFFFFCBC: 02ADAE54		?
BFFFFCB8: C0000004		marker for calling sub4
BFFFFCB4: 080485F1		return addr to sub3
BFFFFCB0: BFFFFCD0	stack frame sub4
BFFFFCAC: DEADBEEF		local0
BFFFFCA8: CAFEBABE		local1
BFFFFCA4: BFFFFCD0		? sf3
BFFFFCA0: BFFFFCB0		? sf4
BFFFFC9C: 40011FE0		?
BFFFFC98: 00000001		?
BFFFFC94: 00000000		push dword 0
BFFFFC90: 00000000		push dword 0
BFFFFC8C: BFFFFC8C		? supposed to be ebx
BFFFFC88: BFFFFC8C		? supposed to be esp
BFFFFC84: 08048618		return addr to sub4

From my understanding of what ENTER does, each “level” creates a type of nested stack activation record with pointers to each previous “level's” stack record. And while each level has the required number of additional entries, the actual contents don't make sense.
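To pin down that understanding, here's a small C model of ENTER as I read the pseudocode in Intel's instruction reference (32-bit case). It's a sketch, not a definitive emulation: the toy struct cpu, the 64-word stack and the model_*() names are all my own inventions, and the addresses are simply byte offsets into the toy stack.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* A toy 32-bit machine: a stack plus ESP and EBP (byte offsets into it). */
struct cpu {
    uint32_t stack[STACK_WORDS];
    uint32_t esp;  /* the stack grows downward from STACK_WORDS * 4 */
    uint32_t ebp;
};

void model_init(struct cpu *c)
{
    for (int i = 0; i < STACK_WORDS; i++)
        c->stack[i] = 0;
    c->esp = STACK_WORDS * 4;
    c->ebp = 0;
}

void model_push(struct cpu *c, uint32_t value)
{
    assert(c->esp >= 4);
    c->esp -= 4;
    c->stack[c->esp / 4] = value;
}

uint32_t model_load(const struct cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return c->stack[addr / 4];
}

/* ENTER alloc,level as I understand the documented pseudocode. */
void model_enter(struct cpu *c, uint16_t alloc, uint8_t level)
{
    level %= 32;

    model_push(c, c->ebp);            /* save the caller's frame pointer */
    uint32_t frame_temp = c->esp;

    if (level > 0) {
        /* Copy level-1 frame pointers out of the caller's frame ... */
        for (unsigned i = 1; i < level; i++) {
            c->ebp -= 4;
            model_push(c, model_load(c, c->ebp));
        }
        /* ... then add a pointer to the new frame itself. */
        model_push(c, frame_temp);
    }

    c->ebp = frame_temp;
    assert(c->esp >= alloc);
    c->esp -= alloc;                  /* finally, reserve the locals */
}
```

Calling model_init() and then model_enter() with levels 0, 1 and 2 in turn shows the layout. In this model, a level-L frame keeps its frame pointers at [ebp-4] through [ebp-4*L], and the n bytes of locals start below them. If I've read the pseudocode correctly, a store to [ebp-4] in a level-1 routine would land on a frame pointer entry rather than on a local.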

Running this on a different Linux system produced similarly confusing results. I'm not sure whether ENTER is horribly broken these days (I wonder how often the instruction is actually used) or whether it is indeed a Linux problem. Not that I'm going to be using assembly any time soon … I'm just curious.
