Hello again... thanks for the answers so far:
Dave Butenhof writes:
>Hallvard B Furuseth wrote:
>> What do POSIX and other thread APIs have to say about memory layout
>> of data accessed by different threads? E.g.:
>>
>>   struct Foo {
>>       pthread_mutex_t ma, mb;
>>       int a, b; /* protected by ma and mb respectively */
>>       int c;    /* not mutex-protected, only accessed by 1 thread */
>>   };
> (...)
> Generally int will be safe. Although on some machines, an UNALIGNED int
> is legal but not necessarily atomic.
> Both C and C++ are working now on detailed memory models that will nail
> down standard and portable answers to this sort of
> question. Unfortunately it's all well below the level that could be
> resolved in POSIX.
>
> As David Schwartz suggested, however, you're generally going to be
> better off keeping independently accessed data as "distinct objects"
> rather than compacting them into a common container. If you keep them
> distinctly separate, you have a much better chance that your code, the
> compiler, the memory allocator in your runtime, and the hardware will
> all get along.
Makes sense. What would you do if you want data structures with
differently-protected data in the same struct though? Ensure at least
<void* and int>-sized padding/alignment, or maybe insert the data in a
union with a mutex which hopefully gets aligned sensibly, or more?
As you say, false sharing can often make it best to keep them apart in
memory anyway. But not always - e.g. with write-seldom/read-often, the
extra indirection to move between two parts of an object might cost more
- in either runtime or coding/maintenance.
If one is porting a single-threaded application to multi-threaded, the
first concern is to get it to work, without too much extra work in
reorganizing a lot of data structures.
E.g.: <element> is protected by element.mutex, except it is in a linked
list so the element.next pointer is protected by listhead.mutex. Or it
is a hash element and hashtable.mutex protects element.hashvalue, which
will be recomputed when the table is resized.
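
To make the linked-list case concrete, here is a minimal sketch of what I
mean. The names payload, list_push and element_set are made up for
illustration; only element.mutex, element.next and listhead.mutex come
from the description above. Note that payload and next sit next to each
other but are guarded by different mutexes, which is exactly the layout
the original question is about.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct element {
    pthread_mutex_t mutex;   /* protects payload only */
    int payload;             /* hypothetical field, protected by element.mutex */
    struct element *next;    /* protected by listhead.mutex, NOT element.mutex */
};

struct listhead {
    pthread_mutex_t mutex;   /* protects first and every element.next in the list */
    struct element *first;
};

/* Link e at the front of the list. Only the list mutex is taken,
 * because only the link fields are touched. */
void list_push(struct listhead *head, struct element *e)
{
    pthread_mutex_lock(&head->mutex);
    e->next = head->first;
    head->first = e;
    pthread_mutex_unlock(&head->mutex);
}

/* Update the element's own data. Only the element mutex is taken. */
void element_set(struct element *e, int value)
{
    pthread_mutex_lock(&e->mutex);
    e->payload = value;
    pthread_mutex_unlock(&e->mutex);
}
```

One thread may be inside list_push() writing e->next while another is in
element_set() writing e->payload, with no common lock between them.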
> (snipping false sharing performance issues)
>
> Of course, if your MUTEX is in the same cache line as data that's not
> associated with the mutex (or even worse that's associated with a
> different mutex), you've got the same problem.
Yup, it was a deliberately obtuse data structure:-)
> For example, in your
> case, your mb mutex is directly adjacent to data a. A pattern like
>
>     pthread_mutex_t ma;
>     int a;
>     pthread_mutex_t mb;
>     int b;
>
> would be better, but provides no cache line separation between ma/a and
> mb/b. In many cases an unused array like "char blank[CACHE_LINE_SIZE];"
> between the two sets would help.
Or maybe like this?
    struct {
        union {
            struct {
                pthread_mutex_t ma;
                int a;
            } u;
            char blank[CACHE_LINE_SIZE];
        } au;
        pthread_mutex_t mb;
        int b;
        ...
    }
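
A quick way to check what the union layout above actually buys is to
measure the offsets. This is only a sketch: CACHE_LINE_SIZE = 64 is an
assumption (the real value is machine-specific), and the struct tag Foo
and the helper names are made up here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64   /* assumption: adjust for the target machine */

struct Foo {
    union {
        struct {
            pthread_mutex_t ma;
            int a;
        } u;
        char blank[CACHE_LINE_SIZE];   /* rounds the ma/a group up to a full line */
    } au;
    pthread_mutex_t mb;
    int b;
};

/* Bytes between the start of the ma/a group and the start of mb. */
size_t foo_gap(void)
{
    return offsetof(struct Foo, mb) - offsetof(struct Foo, au);
}

/* Nonzero if this particular Foo starts on a cache line boundary. */
int foo_is_line_aligned(const struct Foo *p)
{
    return ((uintptr_t)p % CACHE_LINE_SIZE) == 0;
}

/* Sanity check: the union must push mb at least one line away. */
void foo_check_layout(void)
{
    assert(foo_gap() >= CACHE_LINE_SIZE);
}
```

As far as I can tell, the union only guarantees that mb starts at least
CACHE_LINE_SIZE bytes after ma. If the struct itself does not start on a
cache line boundary, the tail of ma/a can still land in the same line as
mb, so this variant also needs an aligned allocation, whereas a full
blank[] array placed between the two groups does not.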
> In fact many threaded applications have
> a "cache aligned allocator" wrapping malloc() that expands the requested
> size and aligns the returned address to a cache line boundary (machine
> sensitive) in order to ensure that the caller gets a unique cache
> line. Then you might have a 'typedef struct { pthread_mutex_t m;
> int data; } Shared_data_t', of which you'd separately allocate an 'a'
> instance and a 'b' instance to ensure that the two sets are independent.
That's a good tip. Thanks.
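
For the record, here is roughly how I picture such a wrapper. This is a
sketch, not anyone's actual allocator: it uses posix_memalign() rather
than hand-adjusting a malloc() result, CACHE_LINE_SIZE = 64 is an
assumption, and cache_aligned_alloc, shared_data_new and
shared_data_free are names I made up.

```c
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64   /* assumption: machine-specific */

typedef struct {
    pthread_mutex_t m;
    int data;                /* protected by m */
} Shared_data_t;

/* Return a block that starts on a cache line boundary and whose size is
 * rounded up to whole lines, so nothing else shares its lines.
 * Returns NULL on failure. Release with free(). */
void *cache_aligned_alloc(size_t size)
{
    void *p = NULL;
    size_t lines = (size + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE;

    if (lines == 0)
        lines = 1;           /* a zero-size request still gets one line */
    if (posix_memalign(&p, CACHE_LINE_SIZE, lines * CACHE_LINE_SIZE) != 0)
        return NULL;
    assert((uintptr_t)p % CACHE_LINE_SIZE == 0);
    return p;
}

/* Allocate one independent mutex+data pair on its own cache line(s). */
Shared_data_t *shared_data_new(void)
{
    Shared_data_t *s = cache_aligned_alloc(sizeof *s);

    if (s == NULL)
        return NULL;
    if (pthread_mutex_init(&s->m, NULL) != 0) {
        free(s);
        return NULL;
    }
    s->data = 0;
    return s;
}

void shared_data_free(Shared_data_t *s)
{
    if (s != NULL) {
        pthread_mutex_destroy(&s->m);
        free(s);
    }
}
```

The 'a' and 'b' sets would then be two separate shared_data_new() calls,
each with its own mutex and its own cache line.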
> And of course, this sort of planning makes it completely irrelevant
> whether adjacent 'int' data is atomically safe. It's almost always far
> better to be sure that they can never be adjacent, or even close,
> anyway... regardless of atomicity.
>
> There's LOTS more someone could (and many might) say about all this; but
> at least this is a rough introduction. ;-)
--
Hallvard

