Chris Thomasson wrote:
> Before I convert this code into AT&T syntax for GAS to assemble, and to
> MASM I was wondering if there are any possible optimizations I can
> perform on the following code, I will show the C header first:
The code looks OK, at least on a first read-through, but you really
don't need to worry about asm performance details!
The LOCK CMPXCHG etc. instructions will take so much time that tiny
things like using XOR EAX,EAX instead of MOV EAX,0 to save code space
don't matter.
The one thing I would change would be the function call/return format:
You are using the cdecl default of stack-based parameters and caller
stack cleanup, but you only use one or two parameters for most
functions, right?
If you instead use register-based calling each call can be simpler &
faster, and the implementation becomes shorter as well:
I.e. your release function looks like this:
> __declspec(naked) void
> spinlock_i686_release(
> atomicword_i686* const _this
> ) {
> _asm {
> MOV ECX, [ESP + 4]
> MOV EAX, 0
> MOV [ECX], EAX
> RET
> }
> }
That is 9+ code bytes, plus a PUSH and a POP (or ADD ESP,4) at each call
site.
If you can pass the single parameter in ECX, the code becomes:
_asm {
xor eax,eax
mov [ecx],eax
ret
}
which can also be simplified, at the cost of a code byte or two, to
_asm {
mov dword ptr [ecx], 0
ret
}
but at this point it should be obvious that this particular function can
be implemented as a single-line inline macro, since there is nothing
here except for an atomic store. :-)
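
As a rough sketch of that inline version, here it is in portable C11
rather than MSVC inline asm. The names spinlock_word and
spinlock_release are hypothetical stand-ins for the originals, and the
release ordering is my assumption about what the lock needs; on x86 it
should compile down to a plain MOV store:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for atomicword_i686: one atomic lock word. */
typedef atomic_int spinlock_word;

/* Release the lock by storing 0 with release ordering.
 * There is no LOCK-prefixed instruction here, only a store. */
static inline void spinlock_release(spinlock_word *lock)
{
    assert(lock != NULL);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

Being static inline, it leaves no call/return or parameter passing at
all once the compiler has inlined it.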
The lock-based functions will also save some space with register
parameters (here with node in ECX and _this in EDX):
nblifo_i686_push(
nblifo_i686* const _this,
nbnode_i686* const node
) {
_asm {
MOV EAX, [EDX]
nblifo_i686_push_retry:
MOV [ECX], EAX
LOCK CMPXCHG [EDX], ECX
JNE nblifo_i686_push_retry
RET
}
}
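
For comparison, the same retry loop can be sketched in portable C11.
This is only an illustration: the type and function names (node, lifo,
lifo_push) are hypothetical, and it assumes, as the asm above implies,
that the node's next pointer and the stack's head pointer both sit at
offset 0:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical layouts: next pointer and head pointer at offset 0. */
typedef struct node { struct node *next; } node;
typedef struct { _Atomic(node *) head; } lifo;

/* Push n onto the stack with a compare-and-swap retry loop. */
static void lifo_push(lifo *stack, node *n)
{
    assert(stack != NULL && n != NULL);
    /* Corresponds to MOV EAX, [EDX]: read the current head. */
    node *old = atomic_load_explicit(&stack->head, memory_order_relaxed);
    do {
        /* Corresponds to MOV [ECX], EAX: link the node to the old head. */
        n->next = old;
        /* Corresponds to LOCK CMPXCHG [EDX], ECX: on failure, old is
         * reloaded with the current head and the loop retries. */
    } while (!atomic_compare_exchange_weak_explicit(
                 &stack->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}
```

Push alone does not run into the ABA problem; the matching pop is where
that needs care.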
Terje
--
- <Terje.Mathisen@[EMAIL PROTECTED]>
"almost all programming can be viewed as an exercise in caching"

