"James Van Buskirk" wrote in message
> There was a thread about population count in comp.lang.fortran but
> nobody seemed to like my assembly code over there:
For what it's worth, below is my 32-bit assembler for popcnt.
The format is for lzasm but can easily be changed for nasm.
The two arguments are the pointer and the number of 32-bit words.
The code is from Hacker's Delight, converted to 32-bit assembler.
On a Core 2 Duo I measured 13 cycles/limb, where a limb is 4 bytes = 32 bits.
I also looked at the popcnt in GMP 4.2.2, see:
http://gmplib.org/
They have an IA-64 Itanium popcnt that runs at 1 cycle/limb (64-bit), using the
IA-64 popcnt instruction.
Their table-lookup popcnt runs at 8 cycles/limb (32-bit).
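To make the table-lookup idea concrete, here is a minimal C sketch of a byte-wise lookup popcnt. It is not GMP's code; the names bits_in_byte, init_table and table_popcnt are made up for illustration, and the 256-entry byte table is just one plausible table size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 256-entry table: bits_in_byte[b] = number of set bits in b. */
static uint8_t bits_in_byte[256];

/* Fill the table once: count(i) = count(i / 2) + lowest bit of i. */
static void init_table(void)
{
    bits_in_byte[0] = 0;
    for (int i = 1; i < 256; i++)
        bits_in_byte[i] = (uint8_t)(bits_in_byte[i >> 1] + (i & 1));
}

/* Sum the set bits of n 32-bit words, four table lookups per word.
   init_table() must have been called first. */
static unsigned long table_popcnt(const uint32_t *p, size_t n)
{
    unsigned long total = 0;

    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t w = p[i];
        total += bits_in_byte[w & 0xFF]
               + bits_in_byte[(w >> 8) & 0xFF]
               + bits_in_byte[(w >> 16) & 0xFF]
               + bits_in_byte[w >> 24];
    }
    return total;
}
```

Each word costs four dependent loads plus the adds, which is one reason a lookup version can lose to a register-only method on some machines.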
Clearly your xmm popcnt1 is also very fast, about 1 cycle/limb (32-bit)!
But because it seems more complicated, you should also test it for all
possible 128-bit bit patterns.
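A rough idea of what such a test could look like in C follows. Enumerating all 2^128 patterns is out of reach, so this sketch checks structured patterns (every pair of bit positions and their complements, all zeros, all ones) plus pseudo-random values against a simple reference. All names here (popcnt_fn, ref_popcnt, check_popcnt) are hypothetical, and it assumes the routine under test is callable from C with a pointer and a word count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed C-callable shape of the routine under test. */
typedef unsigned long (*popcnt_fn)(const uint32_t *p, size_t n);

/* Reference: clear the lowest set bit until none remain. */
static unsigned long ref_popcnt(const uint32_t *p, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        for (uint32_t w = p[i]; w != 0; w &= w - 1)
            total++;
    return total;
}

/* Small xorshift generator, so the run is reproducible. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* 1 if fn disagrees with the reference on this 128-bit value, else 0. */
static unsigned long mismatch(popcnt_fn fn, const uint32_t v[4])
{
    return fn(v, 4) != ref_popcnt(v, 4);
}

/* Returns the number of mismatches found; 0 means every check passed. */
static unsigned long check_popcnt(popcnt_fn fn, unsigned long random_rounds)
{
    unsigned long failures = 0;
    uint32_t v[4];
    uint32_t state = 2463534242u;   /* any nonzero seed works */

    assert(fn != NULL);

    /* Every pair of bit positions (i == j gives a single bit),
       and the complement of each such pattern. */
    for (int i = 0; i < 128; i++) {
        for (int j = i; j < 128; j++) {
            for (int k = 0; k < 4; k++) v[k] = 0;
            v[i / 32] |= (uint32_t)1 << (i % 32);
            v[j / 32] |= (uint32_t)1 << (j % 32);
            failures += mismatch(fn, v);
            for (int k = 0; k < 4; k++) v[k] = ~v[k];
            failures += mismatch(fn, v);
        }
    }

    /* All zeros, then all ones. */
    for (int k = 0; k < 4; k++) v[k] = 0;
    failures += mismatch(fn, v);
    for (int k = 0; k < 4; k++) v[k] = 0xFFFFFFFFu;
    failures += mismatch(fn, v);

    /* Pseudo-random 128-bit values. */
    for (unsigned long r = 0; r < random_rounds; r++) {
        for (int k = 0; k < 4; k++) v[k] = next_random(&state);
        failures += mismatch(fn, v);
    }
    return failures;
}
```

A call such as check_popcnt(my_xmm_popcnt, 1000000), where my_xmm_popcnt is a hypothetical wrapper around the routine being tested, should return 0 if the routine agrees with the reference on every pattern tried.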
Regards, Maarten.
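PROC __int_popcnt
ARG pdata:dword, ndata:dword
    push ebp
    mov ebp, esp
    push edi
    push esi
    mov edi, [ndata]        ; edi = number of words, assumed >= 1
    mov esi, [pdata]
    lea esi, [esi-4]        ; bias pointer so [esi+4*edi] is word edi-1
    xor eax, eax            ; eax = running total
@@loop:                     ; label name was lost in archiving; @@loop is a stand-in
    mov ecx, [esi+4*edi]    ; load next word, last to first
    mov edx, ecx
    shr edx, 1
    and ecx, 55555555h
    and edx, 55555555h
    add ecx, edx            ; each 2-bit field holds the count of its 2 bits
    mov edx, ecx
    shr edx, 2
    and ecx, 33333333h
    and edx, 33333333h
    add ecx, edx            ; each 4-bit field holds the count of its 4 bits
    mov edx, ecx
    shr edx, 4
    add ecx, edx
    and ecx, 0F0F0F0Fh      ; each byte holds the count of its 8 bits
    mov edx, ecx
    shr edx, 8
    add ecx, edx
    mov edx, ecx
    shr edx, 16
    add ecx, edx            ; low byte now holds the sum of all four bytes
    and ecx, 3Fh            ; keep the 6-bit result (0..32)
    add eax, ecx
    dec edi
    jnz short @@loop
    pop esi
    pop edi
    pop ebp
    ret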
ENDP __int_popcnt
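
For readers who prefer C, the routine above corresponds roughly to the following sketch. It follows the assembler step for step; the names popcnt32 and int_popcnt are chosen here for illustration. Unlike the assembler, which assumes at least one word, the C loop also handles a count of zero, and it accumulates in an unsigned long rather than a 32-bit register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Population count of one 32-bit word, same steps as the assembler. */
static uint32_t popcnt32(uint32_t x)
{
    x = (x & 0x55555555u) + ((x >> 1) & 0x55555555u); /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* byte sums  */
    x += x >> 8;                                      /* fold bytes */
    x += x >> 16;
    return x & 0x3Fu;                                 /* 0..32      */
}

/* Sum over ndata words, walking from the last word to the first
   as the assembler loop does. */
static unsigned long int_popcnt(const uint32_t *pdata, size_t ndata)
{
    unsigned long total = 0;

    assert(pdata != NULL || ndata == 0);
    while (ndata != 0)
        total += popcnt32(pdata[--ndata]);
    return total;
}
```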

