Re: Why is GForth-ITC fast?

by anton@[EMAIL PROTECTED] (Anton Ertl) May 1, 2008 at 10:54 AM

Robert Spykerman <robert.spykerman@[EMAIL PROTECTED]> writes:
>On May 1, 12:30 pm, brian....@[EMAIL PROTECTED] wrote:
>> I have been dis-assembling pieces of GForth and I don't understand why
>> it is generally faster than other ITC Forths that are written in
>> Assembler like CI-Forth or my old 16 bit HS/Forth.

It is?  On what hardware?  On what benchmarks?  How much?

>1. As you know ciforth is really really small. I am wondering about
>cache issues. As you know a lot of modern pentia are von Neumann
>outside but actually are Harvard internally. I believe from memory the
>separate L1 data/instr caches of a CORE are 32 k bytes or something
>like that. I can't remember sizes of cache lines but as you know data
>and executable IA-32 code are definitely within this 20-30k area...

Yes, cache consistency issues are usually a good candidate for
explaining unexpected slowdowns.  However, I would not expect that to
play a significant role with ITC on modern CPUs (it is pretty bad with
traditional ITC on Pentium, Pentium MMX, and K6 series CPUs).

Modern CPUs only suffer from writes in the same consistency region
that code resides in.  The size of the consistency region is 64 Bytes
for the K7/K8/K10, AFAIK 32 Bytes on the Pentium Pro ... Pentium 3,
Pentium M, Core, and Core 2 family, and 1KB on the Pentium 4.  So one
would have to place frequently-written variables or buffers pretty
close to primitives to get hit by that.
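To make the "same consistency region" condition concrete, here is a minimal sketch in C. It assumes that regions are aligned to their own size (my assumption, not something stated above), and the name same_region is made up for illustration. The region sizes you would pass in are the ones listed above (32, 64, or 1024 bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a code address and a data address fall
   into the same aligned region of region_size bytes.
   Assumption: regions are aligned to their size, and region_size is a
   power of two (e.g. 32, 64 or 1024). */
static bool same_region(uintptr_t code_addr, uintptr_t data_addr,
                        uintptr_t region_size)
{
    /* Reject sizes that are zero or not a power of two. */
    assert(region_size != 0 && (region_size & (region_size - 1)) == 0);

    /* Clearing the low bits yields the start address of the region. */
    uintptr_t mask = ~(region_size - 1);
    return (code_addr & mask) == (data_addr & mask);
}
```

For example, with a 64-byte region a variable at 0x103F shares a region with code at 0x1000, while a variable at 0x1040 does not; with a 1KB region both do.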

One can measure that by using performance counters and looking at the
I-cache and D-cache misses.

>2. Most of ciforth is actually written in forth. I wonder if it's the
>same in gforth.

Apart from about 300 primitives, yes, it's the same.  Hmm, the
additional primitives may be helpful for the prediction accuracy of
the indirect JMP.  This can also be checked with performance counters.

>3. I've heard some say that lodsw (ie for NEXT) is slow on modern
>pentia and I've been wondering about whether changing that to an idiom
>like doing a manual load of EAX from [ESI], bumping it up and then jmp
>ing to [EAX] may be better.

Hmm, I thought that this has become better than in the 486 and Pentium
days, but looking in the Athlon Optimization Guide (ok, already 8
years old itself, but the Athlon 64 (K8) is not that different in
these areas from the Athlon (K7)), I find that LODSD has a latency of
4 cycles, more than the equivalent MOV/ADD sequence.  It is also a
VectorPath instruction, so it needs its own decode cycle.  Still, I
would be surprised if that's it, but that's easy to check by replacing
all occurrences of LODSD with

MOV EAX, [ESI]
ADD ESI, 4

One could then also use a different register than EAX and schedule the
MOV further up, which should be helpful when the JMP mispredicts
(which happens for about half of the executed instructions).
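For readers who want to see the three steps of an ITC NEXT (load the next cell, bump the instruction pointer, jump indirectly through the code field) outside of assembly, here is a rough sketch in C. All names (vm_t, word_t, run, demo, the prim_* functions) are invented for this illustration; it uses a central dispatch loop with function calls, so it is not how Gforth or ciforth implement NEXT, only a model of the data structures involved.

```c
#include <assert.h>

#define STACK_CELLS 16

typedef struct vm vm_t;
typedef void (*code_fn)(vm_t *);

/* A word header reduced to its code field: a pointer to machine code. */
typedef struct { code_fn code; } word_t;

struct vm {
    const word_t *const *ip;   /* plays the role of ESI */
    long stack[STACK_CELLS];   /* data stack */
    int sp;                    /* number of cells in use */
    int running;
};

static void push(vm_t *vm, long x)
{
    assert(vm->sp < STACK_CELLS);
    vm->stack[vm->sp++] = x;
}

/* A few primitives, enough to run a tiny thread. */
static void prim_two(vm_t *vm)   { push(vm, 2); }
static void prim_three(vm_t *vm) { push(vm, 3); }
static void prim_add(vm_t *vm)
{
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] += b;
}
static void prim_halt(vm_t *vm)  { vm->running = 0; }

static const word_t W_TWO   = { prim_two };
static const word_t W_THREE = { prim_three };
static const word_t W_ADD   = { prim_add };
static const word_t W_HALT  = { prim_halt };

/* Inner interpreter: each iteration is one NEXT. */
static long run(const word_t *const *thread)
{
    vm_t vm = { .ip = thread, .sp = 0, .running = 1 };
    while (vm.running) {
        const word_t *w = *vm.ip;  /* MOV EAX, [ESI] */
        vm.ip++;                   /* ADD ESI, 4     */
        w->code(&vm);              /* JMP [EAX], modelled as a call */
    }
    assert(vm.sp >= 1);
    return vm.stack[vm.sp - 1];    /* top of stack */
}

/* Thread for "2 3 +" followed by a halt. */
static long demo(void)
{
    static const word_t *const thread[] = {
        &W_TWO, &W_THREE, &W_ADD, &W_HALT
    };
    return run(thread);
}
```

Calling demo() leaves 5 on top of the stack. In this model, "scheduling the MOV further up" would correspond to loading the next w before the current primitive finishes its work, so that the target of the indirect jump is known earlier.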

Concerning the PUSH and POP instructions, on the Athlon all POPs are
VectorPath (slower decode) with 4 cycles latency, and the simple PUSHes
are DirectPath (fast decode) instructions with 3 cycles latency.  The
K10 (Phenom) has special hardware that speeds up PUSH and POP, but I
think the K8 (Athlon 64 (X2)) is still pretty similar to the Athlon in
this area, so one probably should avoid them there.  Certainly something
like @[EMAIL PROTECTED] should be done without PUSH and POP, and + should
be done with at most one POP.

I don't have the Intel Optimization manual at hand, but these
instructions should be pretty similar to the sequences of simple
instructions on the Pentium 4 with its trace cache, and IIRC the Core
microarchitecture (Core 2 CPUs, not Core CPUs; thank you, Intel
marketing) has special hardware for PUSH and POP, like the K10.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2008: http://www.complang.tuwien.ac.at/anton/euroforth/ef08.html
 




