On Sep 15, 3:16 pm, "Dmitriy V'jukov" <dvyu...@[EMAIL PROTECTED]> wrote:
> On Sep 14, 4:47 am, climber....@[EMAIL PROTECTED] wrote:
>
> > > Here are benchmark results on Q6600.
> > > Element count = 256, readonly workload, 8 threads.
> > > TBB concurrent_hash_map:
> > > Single core: 338 cycles/op
> > > Cores 0&1: 436 cycles/op (scaling 1.55)
> > > Cores 0&2: 1080 cycles/op (scaling 0.63)
> > > All cores: 1412 cycles/op (scaling 0.96)
>
> > Hi, I am also trying to test some multithreaded programs, but I don't
> > know what tool(s) to use to measure CPU cycles. I would really
> > like to know how you obtained the benchmark results.
>
> I run a simple benchmark for around 10 seconds. Every thread counts
> the number of executed operations in a thread-local counter. After 10
> seconds I stop all threads and sum all the thread-local counters. Then
> I divide the execution time, measured in cycles, by the total number of
> operations. I measure the execution time with the rdtsc instruction.
> The results are usually very stable, i.e. the deviation is no more than
> a few cycles.
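
As a rough illustration of the method described in the quoted text, here is a minimal C++ sketch. It is not the original benchmark code. The names read_cycles and cycles_per_op are hypothetical, and the sketch assumes a single TSC interval taken on the controlling thread (the post does not say whether time is measured per thread or once overall). It also assumes an x86 compiler that provides the __rdtsc intrinsic and falls back to a nanosecond clock elsewhere.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  #if defined(_MSC_VER)
    #include <intrin.h>
  #else
    #include <x86intrin.h>
  #endif
  // Reads the time stamp counter. Assumes the TSC ticks at a constant rate
  // and is synchronized across cores, which is not true of every CPU.
  inline std::uint64_t read_cycles() { return __rdtsc(); }
#else
  // Fallback for non-x86 targets: this returns nanoseconds, not cycles.
  inline std::uint64_t read_cycles() {
      using namespace std::chrono;
      return static_cast<std::uint64_t>(
          duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
  }
#endif

// Runs `op` on `thread_count` threads for roughly `duration`, then returns
// elapsed cycles divided by the total number of completed operations.
// `op` must be safe to call from several threads at once.
template <typename Op>
double cycles_per_op(unsigned thread_count, std::chrono::milliseconds duration, Op op) {
    std::vector<std::uint64_t> counters(thread_count, 0);
    std::atomic<bool> stop{false};
    std::vector<std::thread> threads;

    const std::uint64_t start = read_cycles();
    for (unsigned i = 0; i < thread_count; ++i) {
        threads.emplace_back([&, i] {
            std::uint64_t local = 0;  // thread-local operation counter
            while (!stop.load(std::memory_order_relaxed)) {
                op();
                ++local;
            }
            counters[i] = local;      // published once, read after join()
        });
    }
    std::this_thread::sleep_for(duration);
    stop.store(true, std::memory_order_relaxed);
    for (auto& t : threads) t.join();
    const std::uint64_t elapsed = read_cycles() - start;

    std::uint64_t total = 0;
    for (std::uint64_t c : counters) total += c;
    return total ? static_cast<double>(elapsed) / static_cast<double>(total) : 0.0;
}
```

A call such as cycles_per_op(8, std::chrono::seconds(10), lookup) would correspond to the 8-thread, 10-second run mentioned above, where lookup stands for whatever operation is being measured. Pinning threads to particular cores (as in the "Cores 0&1" results) is platform-specific and left out here.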
How do you get rid of task switches and the time spent in
other threads? Some people just assume that if a task switch
occurred, the time spent inside a calculation cycle deviates
a lot from the mean, and they just throw away that timing.
Do you use the same technique?
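
The filtering technique mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name filtered_mean is hypothetical, the median is used as the reference point instead of the mean (the mean is itself skewed by the outliers), and the threshold factor of 3 is arbitrary.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the mean of the timing samples that are at most `factor` times
// the median. Larger samples are treated as disturbed (for example by a
// task switch) and dropped. `factor` is an arbitrary illustrative threshold.
double filtered_mean(std::vector<std::uint64_t> samples, double factor = 3.0) {
    if (samples.empty()) return 0.0;

    std::sort(samples.begin(), samples.end());
    const double median = static_cast<double>(samples[samples.size() / 2]);
    const double limit = median * factor;

    double sum = 0.0;
    std::size_t kept = 0;
    for (std::uint64_t s : samples) {
        if (static_cast<double>(s) <= limit) {
            sum += static_cast<double>(s);
            ++kept;
        }
    }
    // kept can only be zero if factor < 1.
    return kept ? sum / static_cast<double>(kept) : 0.0;
}
```

This only works when each sample times a short, separately measured interval. With a single start/stop measurement over the whole run there is nothing to throw away.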
Alexander Chemeris.

