On Apr 30, 6:16 pm, herumi <spamt...@[EMAIL PROTECTED]> wrote:
> Hi Phil,
>
> >I notice that for code alignment you use a sequence of
> >individual nops. Might the following be useful?
>
> Thank you for your advice.
> I tried adding optimized nops when I was implementing align().
>
> But the best way to optimize nops differs by CPU type,
> so detecting the type is necessary.
>
> Though I can write the code, I don't think the function should be in
> xbyak.h.
> I intend to make xbyak_util.h(for example) and add the function in the
> header.
>
> cf.
> "Software Optimization Guide for AMD64 Processors"
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_do...
> 4.12 Code Padding with Operand-Size Override and NOP
>
> 90
> 66 90
> 66 66 90
> 66 66 66 90
> 66 66 90 66 90
These should work, but some of them (e.g. 66 66 90 66 90) are multiple
instructions.
> "Intel 64 and IA-32 Architectures Optimization Reference Manual"
> 3.5.1.8 Using NOPs
> http://download.intel.com/design/PentiumII/manuals/24512701.pdf
>
> 90 ; xchg eax, eax
This is a true NOP.
> 89 C0 ; mov eax, eax
^^^ this one isn't a true NOP in 64-bit mode: a write to a 32-bit
register is zero-extended to 64 bits, so the upper half of rax is cleared.
> 8D 40 00 ; lea eax, [eax + 0x00]
Nor is this one for the same reason.
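
A small C++ model of why these two encodings are not true NOPs in
64-bit mode. It only simulates the register value and does not execute
the instruction; the function name is hypothetical.

```cpp
#include <cstdint>

// Models the architectural effect of "mov eax, eax" (89 C0) in 64-bit mode:
// the low 32 bits are kept, and bits 63:32 of rax are cleared by the
// implicit zero-extension of a 32-bit register write.
constexpr std::uint64_t movEaxEax64(std::uint64_t rax) {
    return static_cast<std::uint32_t>(rax);
}
```

Any rax value with a nonzero upper half comes back changed, which is
exactly what a NOP must never do.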
As far as I can tell from both the Intel and AMD processor
manuals, 0F 1F + ModRM is a common true multi-byte NOP. Its
availability depends on CPUID, though.
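
A minimal sketch of how padding could be built from the 0F 1F forms.
The helper name makeNopPadding and the hasLongNop flag are assumptions
for illustration and are not part of xbyak; the caller is assumed to
have done its own CPUID check. The byte sequences are the multi-byte
NOP encodings recommended in Intel's manuals.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not xbyak API): returns `size` bytes of NOP padding.
// `hasLongNop` says whether the 0F 1F /0 NOP may be used on the target CPU.
std::vector<std::uint8_t> makeNopPadding(std::size_t size, bool hasLongNop) {
    // Row i holds the single-instruction NOP of length i + 1.
    static const std::uint8_t table[9][9] = {
        {0x90},
        {0x66, 0x90},
        {0x0F, 0x1F, 0x00},
        {0x0F, 0x1F, 0x40, 0x00},
        {0x0F, 0x1F, 0x44, 0x00, 0x00},
        {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},
        {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},
        {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
        {0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    };
    // Without 0F 1F support, fall back to 90 and 66 90 only.
    const std::size_t maxLen = hasLongNop ? 9 : 2;
    std::vector<std::uint8_t> out;
    out.reserve(size);
    while (size > 0) {
        // Emit the longest allowed NOP that still fits.
        const std::size_t len = size < maxLen ? size : maxLen;
        out.insert(out.end(), table[len - 1], table[len - 1] + len);
        size -= len;
    }
    return out;
}
```

Longest-first keeps the number of padding instructions small; whether
that is the best choice still depends on the CPU, as herumi says.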
Alex

