On Apr 29, 11:43 am, Francis Glassborow <francis.glassbo...@[EMAIL PROTECTED]> wrote:
> Le Chaud Lapin wrote:
> > If the reasons for adding __asm to C++ were listed in order of
> > importance, I'd say speed up of critical operations would be in the
> > top 10. :)
>
> Agreed (a pity that C++'s asm keyword is so poorly supported) but that
> has nothing to do with functions versus macros. The C++ way is to use
> inline functions unless your implementation ignores the inline.

I think what you are saying is: I should use an inline function, then
put __asm inside the function. The function will then be inlined
wherever necessary.

There might be problems with this approach.

Here is a function that divides 2^64-1 by 2^32-1. The exact quotient is
2^32+1 = 4,294,967,297.

1. EDX:EAX is first packed with a 64-bit all-ones (1111...1) dividend.
2. EBX is packed with a 32-bit all-ones (111...1) divisor, 4,294,967,295.
3. DIV is defined to leave its result in EDX:EAX: quotient in EAX,
   remainder in EDX.

unsigned int divide_2_to_64_minus_1_by_2_to_32_minus_1 ()
{
    // LOOK HERE: Will compiler always allow programmer to specify no
    // extra code here?
    __asm mov eax, 0;
    __asm mov edx, 0;
    __asm mov ebx, 0;
    __asm not eax;
    __asm not edx;
    __asm not ebx;
    __asm div ebx;
    // AND HERE: Compiler might stubbornly add a few instructions here
    // without programmer help.
    // And what about the return value? Where is it? EAX, on Pentium
    // machines, is standard, but in this case the result is spread
    // over EDX:EAX.
}
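
The arithmetic for an all-ones 64-bit dividend and an all-ones 32-bit
divisor can be checked in portable C++ without any assembly. This is only
a sketch for illustration, not the function above: the helper names
all_ones_quotient and overflows_32_bits are made up here, and
std::uint64_t stands in for the EDX:EAX register pair.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the original post): divides a 64-bit
// all-ones dividend by a 32-bit all-ones divisor, the same operands the
// DIV instruction receives in EDX:EAX and EBX.
inline std::uint64_t all_ones_quotient()
{
    const std::uint64_t dividend = std::numeric_limits<std::uint64_t>::max(); // 2^64 - 1
    const std::uint64_t divisor  = std::numeric_limits<std::uint32_t>::max(); // 2^32 - 1
    return dividend / divisor; // 2^32 + 1
}

// Hypothetical helper: true when a quotient is too wide for a 32-bit
// register such as EAX.
inline bool overflows_32_bits(std::uint64_t quotient)
{
    return quotient > std::numeric_limits<std::uint32_t>::max();
}
```

The quotient 4,294,967,297 needs 33 bits, so it cannot be delivered in a
single 32-bit register.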

On IA-32 architecture in "32-bit mode", the mode most of us will
encounter in normal programming, this will cause a hardware divide-error
exception, because the quotient of the all-ones dividend by the all-ones
divisor, 2^32+1, cannot fit into 32-bit EAX. Even when DIV does not
overflow, its result occupies two registers (quotient in EAX, remainder
in EDX), not just the one that normally carries the return value. Now
imagine you have written some C++ code that calls this function, and you
intend to store the return value. You could hope that changing the
return type of the function from "unsigned int" to "long long" works, or
you could simply use a macro.

To get around the "is long long present" question, one could say, "use
references", but then I would again be at the mercy of hope: hope that
the compiler will properly optimize away the indirection through the
references (which I agree would be silly if it did not).
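
For comparison, the reference-based alternative might look like the
sketch below. It rests on assumptions: Word, DoubleWord, and multiply are
hypothetical names, Word is assumed to be 32 bits wide, and the body uses
portable 64-bit arithmetic in place of inline assembly. Whether the
references vanish after inlining is still up to the compiler.

```cpp
#include <cstdint>

typedef std::uint32_t Word;        // assumed word size, not from the post
typedef std::uint64_t DoubleWord;  // wide enough for a Word * Word product

// Hypothetical inline function: hands back the two halves of a * b
// through reference parameters instead of a wide return type.
inline void multiply(Word a, Word b, Word& upper_word, Word& lower_word)
{
    const DoubleWord product = static_cast<DoubleWord>(a) * b;
    upper_word = static_cast<Word>(product >> 32);          // high 32 bits
    lower_word = static_cast<Word>(product & 0xFFFFFFFFu);  // low 32 bits
}
```

A call then reads multiply(a, b, upper_word, lower_word), which is close
to the macro form but leaves the elimination of the references to the
optimizer.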

There is also the matter of superfluous instructions being added even
to the inline function, just before the asm block and just after.
Microsoft's compiler has extensions that allow this code to be
removed, but do all compilers? If there is one that does not, then the
prolog/epilog code could dwarf the execution time of the inline
assembly for ADD/SUBTRACT, would have significant impact for
MULTIPLY, and, depending on how much of it is present, would also
adversely affect DIVIDE.

With macros, I don't have to worry about the return value, or reference
indirection elimination, or prolog/epilog code that I might have to
struggle to get rid of. I simply use one line of code for, say,
MULTIPLY, and am done with it:

Integer::Word upper_word;
Integer::Word lower_word;
Integer::Word b = B.buffer[j];
MULTIPLY (a, b, upper_word, lower_word); // WYSIWYG for assembly code in this macro
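
The body of MULTIPLY is not shown in this post. Purely as an
illustration, a portable stand-in could look like the following. It
assumes a 32-bit word; WORD_T is a made-up name standing in for
Integer::Word, and the real macro would hold inline assembly rather than
64-bit arithmetic.

```cpp
#include <cstdint>

typedef std::uint32_t WORD_T; // hypothetical stand-in for Integer::Word

// Hypothetical portable stand-in for the MULTIPLY macro. The do/while(0)
// wrapper makes the expansion behave as a single statement, and the
// results are written straight into the caller's variables.
#define MULTIPLY(a, b, upper, lower)                                        \
    do {                                                                    \
        const std::uint64_t product_ =                                      \
            static_cast<std::uint64_t>(a) * static_cast<std::uint64_t>(b);  \
        (upper) = static_cast<WORD_T>(product_ >> 32);                      \
        (lower) = static_cast<WORD_T>(product_ & 0xFFFFFFFFu);              \
    } while (0)
```

Because the macro expands in place, there is no call, no return value,
and no reference parameter for the compiler to eliminate.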
-Le Chaud Lapin-
--
[ See http://www.gotw.ca/resources/clcm.htm
for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

