Re: optimization possibilities...

by Terje Mathisen <spamtrap@[EMAIL PROTECTED] > Apr 19, 2008 at 07:30 AM

Chris Thomasson wrote:
> Before I convert this code into AT&T syntax for GAS to assemble, and to 
> MASM I was wondering if there are any possible optimizations I can 
> perform on the following code, I will show the C header first:

The code looks OK, at least on a first read-through, but you really 
don't need to worry about asm performance details!

The LOCK CMPXCHG etc. instructions will take so much time that tiny 
things like using XOR EAX,EAX instead of MOV EAX,0 to save code space 
don't matter.

The one thing I would change would be the function call/return format:

You are using the cdecl default of stack-based parameters and caller 
stack cleanup, but you only use one or two parameters for most 
functions, right?

If you instead use register-based calling each call can be simpler & 
faster, and the implementation becomes shorter as well:

I.e. your release function looks like this:

> __declspec(naked) void
> spinlock_i686_release(
> atomicword_i686* const _this
> ) {
>  _asm {
>    MOV ECX, [ESP + 4]
>    MOV EAX, 0
>    MOV [ECX], EAX
>    RET
>  }
> }

That is 9+ code bytes, plus a PUSH and a POP (or ADD ESP,4) at each call 
site.
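As a rough sketch of how you might ask the compiler for register 
parameters: MSVC's __fastcall and GCC's fastcall attribute both pass the 
first two DWORD-sized arguments in ECX and EDX on 32-bit x86. The macro 
and function names below are made up for illustration, and on other 
targets the macro expands to nothing:

```c
/* Hypothetical helper macro: select a register-based calling convention
   on 32-bit x86 and fall back to the platform default elsewhere. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define SKETCH_REGCALL __fastcall
#elif defined(__GNUC__) && defined(__i386__)
#  define SKETCH_REGCALL __attribute__((fastcall))
#else
#  define SKETCH_REGCALL
#endif

/* With fastcall on 32-bit x86, lock_word arrives in ECX, so the callee
   needs no stack load and the caller needs no PUSH or stack cleanup. */
void SKETCH_REGCALL spinlock_sketch_release(volatile int *lock_word)
{
    *lock_word = 0;
}
```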

If you can pass the single parameter in ECX, the code becomes:

  _asm {
    xor eax,eax
    mov [ecx],eax
    ret
  }

which can also be simplified, at the cost of a code byte or two, to

  _asm {
    mov dword ptr [ecx], 0
    ret
  }

but at this point it should be obvious that this particular function can 
be implemented as a single-line inline macro, since there is nothing 
here except for an atomic store. :-)
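A minimal sketch of such a macro in portable C11, assuming the lock word 
is a 32-bit integer (the real atomicword_i686 typedef is not shown here, 
so the names below are stand-ins). On x86 a release store like this 
should compile down to a plain MOV:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: a 32-bit lock word where 0 means "unlocked". */
typedef _Atomic int32_t atomicword_sketch;

/* Release the spinlock with a single store. Release ordering keeps the
   writes made inside the critical section ahead of the unlock. */
#define SPINLOCK_SKETCH_RELEASE(p) \
    atomic_store_explicit((p), 0, memory_order_release)
```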

The lock-based functions will also save some space with register
parameters:

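nblifo_i686_push(
nblifo_i686* const _this,
nbnode_i686* const node
) {
  _asm {
    ; on entry (as used below): EDX = _this, ECX = node
    MOV EAX, [EDX]            ; EAX = current head of the list

nblifo_i686_push_retry:
    MOV [ECX], EAX            ; first word of node (its link) = head we saw
    LOCK CMPXCHG [EDX], ECX   ; if head == EAX: head = node, else EAX = head
    JNE nblifo_i686_push_retry
    RET
  }
}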
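
For comparison, here is roughly the same retry loop in portable C11. This 
is only a sketch: the type and function names are hypothetical, and it 
assumes (as the asm does) that the link pointer is the first word of the 
node and the head pointer is the first word of the list:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for nbnode_i686 / nblifo_i686. */
typedef struct nbnode_sketch {
    struct nbnode_sketch *next;
} nbnode_sketch;

typedef struct nblifo_sketch {
    _Atomic(nbnode_sketch *) head;
} nblifo_sketch;

static inline void nblifo_sketch_push(nblifo_sketch *lifo, nbnode_sketch *node)
{
    /* Corresponds to MOV EAX, [EDX]. */
    nbnode_sketch *old = atomic_load_explicit(&lifo->head,
                                              memory_order_relaxed);
    do {
        /* Corresponds to MOV [ECX], EAX. */
        node->next = old;
        /* Corresponds to LOCK CMPXCHG: on failure, old is refreshed with
           the current head, just as CMPXCHG reloads EAX. */
    } while (!atomic_compare_exchange_weak_explicit(
                 &lifo->head, &old, node,
                 memory_order_release, memory_order_relaxed));
}
```

On x86 the compare-exchange should turn into the same LOCK CMPXCHG, so 
the compiler-generated loop ought to end up close to the hand-written one.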

Terje
-- 
- <Terje.Mathisen@[EMAIL PROTECTED]>
"almost all programming can be viewed as an exercise in caching"
 



