When I replaced punpcklwq with punpckldq in the generated ASM code and
recompiled, everything worked as intended. I also tried the p6 option,
but that did not get rid of the strange punpcklwq.
By the way, why did the older Matlab bundle v2.4.1 refuse to compile
_punpckhwq(), despite the existence of that intrinsic in mmx.h?
lcc says:
Internal error 1024 line 48
No such instruction punpckhwq (%eax,%ecx,8),%mm0
Snippets of the ASM code from version 3.8:
[Logiciels/Informatique lcc-win32 version 3.8. Compilation date: Dec
18 2007 19:12:54]
// ------------------------------------------
.line 20
emms
; 21 _punpckldq(&A, &B, 1);
.line 21
movl $1,%edi
movl %edi,%ecx
leal -16(%ebp),%edx
leal -8(%ebp),%eax
orl %ecx,%ecx
je _$LM2
_$LM1:
decl %ecx
movq (%edx,%ecx,8),%mm0
punpcklwq (%eax,%ecx,8),%mm0
movq %mm0,(%eax,%ecx,8)
jne _$LM1
// ------------------------------------------
; 21 _punpckhdq(&A, &B, 1);
.line 21
movl $1,%edi
movl %edi,%ecx
leal -16(%ebp),%edx
leal -8(%ebp),%eax
orl %ecx,%ecx
je _$LM2
_$LM1:
decl %ecx
movq (%edx,%ecx,8),%mm0
punpckhwq (%eax,%ecx,8),%mm0
movq %mm0,(%eax,%ecx,8)
jne _$LM1
// ------------------------------------------
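For anyone who wants to check the output after patching the mnemonic by hand, here is a plain C model of what punpckldq and punpckhdq do to a 64-bit MMX register, written in the same operand order as the listing above (source operand first, %mm0 as destination). This is only a sketch: the helper names model_punpckldq and model_punpckhdq are made up for illustration and are not part of lcc's mmx.h.

```c
#include <stdint.h>

/* Hypothetical helper, not an lcc intrinsic.
 * Models "punpckldq src, %mm0":
 *   result low  dword = low dword of mm0
 *   result high dword = low dword of src */
static uint64_t model_punpckldq(uint64_t mm0, uint64_t src)
{
    return (mm0 & 0xFFFFFFFFull) | (src << 32);
}

/* Hypothetical helper, not an lcc intrinsic.
 * Models "punpckhdq src, %mm0":
 *   result low  dword = high dword of mm0
 *   result high dword = high dword of src */
static uint64_t model_punpckhdq(uint64_t mm0, uint64_t src)
{
    return (mm0 >> 32) | (src & 0xFFFFFFFF00000000ull);
}
```

With mm0 = 0x1111111122222222 and src = 0x3333333344444444, the low variant gives 0x4444444422222222 and the high variant gives 0x3333333311111111. In the listing, %mm0 is loaded from (%edx,%ecx,8) and the source operand is (%eax,%ecx,8), which is also where the result is stored. Which of &A and &B ends up in which register is my reading of the leal lines, not something I have confirmed.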

