A) < Forward Inline >
-------- Original Message --------
Message-ID: <41276DBB.2023556D@[EMAIL PROTECTED]>
Newsgroups: comp.std.c++
Subject: Re: Multithreaded programming: is the C++ standardization
committee listening?
References: ... <abefd130.0408201501.7b3093e8@[EMAIL PROTECTED]>
Peter Dimov wrote:
[...]
> One problem is that the most efficient combination of memory barriers
> may differ depending on the platform. The unsupported standardized
> barriers will need to turn into full barriers (the most inefficient
> kind.)
The standard shall simply define the portable barriers model (including,
of course, "relaxed"/"tuned" ones). Something like
msync::none // nothing (e.g. for refcount<T, basic>::increment)
msync::fence // classic fence (acq+rel -- see below)
msync::acq // classic acquire (hlb+hsb -- see below)
msync::ddacq // acquire with data dependency
msync::ccacq // acquire with control dependency
msync::hlb // hoist-load barrier -- acquire not affecting stores
msync::ddhlb // ...
msync::cchlb // ...
msync::hsb // hoist-store barrier -- acquire not affecting loads
msync::ddhsb // ...
msync::cchsb // ...
msync::rel // classic release (slb+ssb -- see below)
msync::slb // sink-load barrier -- release not affecting stores
msync::ssb // sink-store barrier -- release not affecting loads
msync::slfence // store-load fence (ssb+hlb -- see above)
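For readers who know the memory orders that C++11 later standardized, here is a rough, lossy correspondence. It is only a sketch under stated assumptions: `msync_kind` and `to_std_order` are hypothetical names that appear nowhere in this proposal, only a subset of the list is covered, and the finer-grained barriers (hlb, hsb, slb, ssb and the dd/cc variants) have no exact `std::memory_order` counterpart, so each is rounded up to the nearest stronger ordering.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical enum mirroring some of the msync names listed above.
enum class msync_kind {
    none, fence, acq, ddacq, ccacq, hlb, hsb, rel, slb, ssb, slfence
};

// Conservative (never weaker) mapping onto C++11 memory orders.
// Assumption: where no exact equivalent exists, pick the next stronger one.
inline std::memory_order to_std_order(msync_kind k) {
    switch (k) {
        case msync_kind::none:    return std::memory_order_relaxed;
        case msync_kind::acq:     // classic acquire
        case msync_kind::ddacq:   // dependency-limited acquire, rounded up
        case msync_kind::ccacq:
        case msync_kind::hlb:     // load-only acquire, rounded up
        case msync_kind::hsb:     // store-only acquire, rounded up
            return std::memory_order_acquire;
        case msync_kind::rel:     // classic release
        case msync_kind::slb:     // load-only release, rounded up
        case msync_kind::ssb:     // store-only release, rounded up
            return std::memory_order_release;
        case msync_kind::fence:   // acq+rel, rounded up to be safe
        case msync_kind::slfence: // store-load ordering needs seq_cst
            return std::memory_order_seq_cst;
    }
    return std::memory_order_seq_cst; // unreachable fallback
}
```

Rounding up costs performance on platforms that could use the cheaper barrier, which is exactly the inefficiency the finer-grained list is meant to avoid.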
Here are a few illustrations.
DCSI-MBR:
class stuff : private lazy_mutex { // "create/open named mutex"
                                   // trick on windows
  atomic<lazy const *> m_ptr;
public:
  /* ... */
  lazy const & lazy_instance() {
    lazy const * ptr;
    if (!(ptr = m_ptr.load(msync::ddhlb))) {
      lazy_mutex::guard guard(this);
      if (!(ptr = m_ptr.load(msync::none)))
        m_ptr.store(ptr = new lazy(), msync::ssb);
    }
    return *ptr;
  }
};
http://groups.yahoo.com/group/boost/message/15442
(that's "lazy mutex")
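The same double-checked pattern can be approximated with the C++11 library. This is a sketch, not the proposed atomic<> interface: `std::mutex` stands in for lazy_mutex, the `lazy` payload is a hypothetical placeholder, and acquire/release are used where the original asks only for the weaker ddhlb/ssb barriers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct lazy { int value = 42; };  // hypothetical payload type

class stuff {
    std::atomic<lazy const*> m_ptr{nullptr};
    std::mutex m_mutex;  // assumption: stands in for lazy_mutex
public:
    ~stuff() { delete m_ptr.load(std::memory_order_relaxed); }

    lazy const& lazy_instance() {
        // First check outside the lock; acquire replaces msync::ddhlb.
        lazy const* ptr = m_ptr.load(std::memory_order_acquire);
        if (!ptr) {
            std::lock_guard<std::mutex> guard(m_mutex);
            // Second check under the lock; the mutex already orders it,
            // so relaxed is enough (msync::none in the original).
            ptr = m_ptr.load(std::memory_order_relaxed);
            if (!ptr) {
                ptr = new lazy();
                // Publish the fully constructed object; release
                // replaces msync::ssb.
                m_ptr.store(ptr, std::memory_order_release);
            }
        }
        return *ptr;
    }
};
```

Repeated calls to `lazy_instance()` on one `stuff` object return a reference to the same `lazy`.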
DCCI: (double-checked concurrent init)
class stuff {
  atomic<lazy const *> m_ptr;
public:
  /* ... */
  lazy const & lazy_instance() {
    lazy const * ptr;
    if (!(ptr = m_ptr.load(msync::ddhlb)) &&
        !m_ptr.attempt_update(0, ptr = new lazy(), msync::ssb)) {
      delete ptr;
      ptr = m_ptr.load(msync::ddhlb);
    }
    return *ptr;
  }
};
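A lock-free variant along the same lines, sketched with C++11 atomics. The assumptions: `compare_exchange_strong` plays the role of attempt_update, the `lazy` payload is a hypothetical placeholder, and the orderings are stronger than the ddhlb/ssb barriers the original requests.

```cpp
#include <atomic>
#include <cassert>

struct lazy { int value = 42; };  // hypothetical payload type

class stuff {
    std::atomic<lazy const*> m_ptr{nullptr};
public:
    ~stuff() { delete m_ptr.load(std::memory_order_relaxed); }

    lazy const& lazy_instance() {
        lazy const* ptr = m_ptr.load(std::memory_order_acquire);
        if (!ptr) {
            lazy const* fresh = new lazy();
            lazy const* expected = nullptr;
            // Try to install our object. Several threads may construct
            // concurrently, but only one pointer is ever published.
            if (m_ptr.compare_exchange_strong(expected, fresh,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
                ptr = fresh;
            } else {
                // Lost the race: discard ours and use the winner's
                // pointer, which the failed exchange wrote to 'expected'.
                delete fresh;
                ptr = expected;
            }
        }
        return *ptr;
    }
};
```

The trade-off against the mutex-based version is that `lazy` may be constructed more than once, so its constructor and destructor must tolerate a throwaway instance.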
Sorta better "critical-section" for windows.
// doesn't provide "POSIX-safety" with respect to destruction
class swap_based_mutex { // noncopyable
  atomic<int> m_lock_status; // 0: free, 1/-1: locked/contention
  auto_reset_event m_retry_event; // bin.sema/gate
public:
  // ctor/dtor [w/o lazy event init]
  void lock() throw() {
    if (m_lock_status.swap(1, msync::ccacq))
      while (m_lock_status.swap(-1, msync::ccacq))
        m_retry_event.wait();
  }
  bool trylock() throw() {
    return !m_lock_status.swap(1, msync::ccacq) ?
      true : !m_lock_status.swap(-1, msync::ccacq);
  }
  bool timedlock(absolute_timeout const & timeout) throw() {
    if (m_lock_status.swap(1, msync::ccacq)) {
      while (m_lock_status.swap(-1, msync::ccacq))
        if (!m_retry_event.timedwait(timeout))
          return false;
    }
    return true;
  }
  void unlock() throw() {
    if (m_lock_status.swap(0, msync::rel) < 0)
      m_retry_event.set();
  }
};
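A portable approximation of lock()/unlock() using only the C++11 library. It is a sketch under assumptions: `auto_reset_event` is emulated with a mutex and condition variable (the original presumably wraps a Windows event), `exchange` stands in for swap, acquire is used in place of ccacq, and `contended_count` is a hypothetical helper added only to exercise the lock.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Assumed stand-in for auto_reset_event: set() releases one waiter and
// the signal is consumed (reset) by the wait() that observes it.
class auto_reset_event {
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
public:
    void set() {
        { std::lock_guard<std::mutex> g(m_); signaled_ = true; }
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> g(m_);
        cv_.wait(g, [this] { return signaled_; });
        signaled_ = false;
    }
};

class swap_based_mutex {
    std::atomic<int> m_lock_status{0};  // 0: free, 1/-1: locked/contention
    auto_reset_event m_retry_event;
public:
    swap_based_mutex() = default;
    swap_based_mutex(swap_based_mutex const&) = delete;
    swap_based_mutex& operator=(swap_based_mutex const&) = delete;

    void lock() {
        if (m_lock_status.exchange(1, std::memory_order_acquire))
            while (m_lock_status.exchange(-1, std::memory_order_acquire))
                m_retry_event.wait();
    }
    void unlock() {
        // A negative previous value means someone may be waiting.
        if (m_lock_status.exchange(0, std::memory_order_release) < 0)
            m_retry_event.set();
    }
};

// Hypothetical helper: several threads bump a plain counter under the lock.
long contended_count(int threads, int iterations) {
    swap_based_mutex mtx;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                mtx.lock();
                ++counter;
                mtx.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

If the lock provides mutual exclusion, `contended_count(4, 10000)` returns 40000.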
regards,
alexander.
B) < Forward Inline >
-------- Original Message --------
Message-ID: <414E9206.53D31982@[EMAIL PROTECTED]>
Newsgroups: comp.lang.c++.moderated
Subject: Re: Possible solution to the DCL problem (Scott Meyers, Andrei
Alexandrescu)
References: ... <MPG.1bb51ea3e3f45c59989797@[EMAIL PROTECTED]>
Scott Meyers wrote:
>
> On 16 Sep 2004 22:41:25 -0400, Alexander Terekhov wrote:
>
> > Keyboard* temp = pInstance;
> > Perform acquire;
> > ...
> >
> > is not really the same (with
> > respect to reordering) as
> >
> > Keyboard* temp = pInstance;
> > Lock L1(args); // acquire
> > ...
> >
> > because the latter can be transformed to
> >
> > Lock L1(args); // acquire
> > Keyboard* temp = pInstance;
> > ...
>
> Upon further reflection, I don't see your point here. As I understand it,
> the acquire operation, whether explicit or as part of the Lock constructor,
> prevents memory accesses after (in program order) the initialization of
> temp from migrating up above temp's initialization.
No.
Keyboard* temp = pInstance;
Lock L1(args); // acquire
temp2 = ...
Doesn't prevent
Lock L1(args); // acquire
temp2 = ...
Keyboard* temp = pInstance;
transformation.
With atomic<Keyboard*> for pInstance
Keyboard* temp = pInstance.load(msync::acq);
temp2 = ...
it does. IOW, "acquire" is a marker. It's one thing to mark
"Keyboard* temp = pInstance", and it's a completely different thing
to mark some other access like "Lock L1(args)" in your case.
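In C++11 terms, marking the load itself might look like the sketch below. The `Keyboard` contents and the helpers `publish` and `read_layout_or` are hypothetical, and `std::atomic` stands in for the atomic<> discussed here.

```cpp
#include <atomic>
#include <cassert>

struct Keyboard { int layout; };  // hypothetical contents

std::atomic<Keyboard*> pInstance{nullptr};

// Writer side: make the object visible with a release store.
void publish(Keyboard* k) {
    pInstance.store(k, std::memory_order_release);
}

// Reader side: the acquire is attached to this very load, so accesses
// that follow it in program order cannot be hoisted above it.
int read_layout_or(int fallback) {
    Keyboard* temp = pInstance.load(std::memory_order_acquire);
    return temp ? temp->layout : fallback;
}
```

An acquire attached to a later lock operation gives no such guarantee about the earlier plain load of pInstance, because that load may still sink below the lock acquisition.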
> You seem to be
> suggesting that that is true in the explicit case but not in the case of
> the acquire being part of the Lock constructor.
See above.
> If that is what you are
> arguing, can you please explain why that is the case? If that is not what
> you are arguing, can you please clarify your argument?
Yes, that's what I'm arguing. See above.
[...]
> I've reread the Plan9 posting, and I still don't see your point. The
> transition from page 34 to page 35 of my notes involves replacement of an
> explicit acquire with a Lock constructor (i.e., an implicit acquire). You
> seem to think that there is a change in semantics,
Yes.
> one that offers fewer
> memory visibility guarantees. The only comment about locks in the Plan9
> discussion is this one:
>
> One bit of reassurance: any data structure protected by a spin
> lock is safe. Here's why:
>
> P1 P2
> [already holding lock] wait for lock->busy == 0
> store data->x grab lock
> store data->y use data->x and ->y
> lock->busy = 0
>
> Because of processor ordering, when P2 observes lock->busy == 0,
> it also has observed all prior stores by P1. Hence P2 never gets
> an inconsistent view of P1's updates.
>
> Furthermore, regarding locks and acquire/release, David Butenhof makes this
> remark (in the comp.programming.threads posting to which you responded
> with the link to the Plan9 story):
Locks aside for a moment, the Plan9 story illustrates the use of a
store-load fence on IA32.
: What we need is that if the following sequence is executed
:
: P1: P2:
: x = 0 y = 0
: x = 1 y = 1
: read y read x
:
: has the values read will be one of
:
: 1 0
: 0 1
: 1 1
:
: 0,0 blows us away.
P1: P2:
x = 0 y = 0
x = 1 y = 1
slfence slfence
read y read x
Is safe.
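With C++11 atomics, the closest portable counterpart of slfence is a sequentially consistent fence. The following is a sketch under that assumption; `run_once` and `never_both_zero` are hypothetical helper names. Each thread stores, fences, then loads, and the fences rule out the 0,0 outcome.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct outcome { int r1; int r2; };

// One round of the store/fence/load experiment on fresh variables.
outcome run_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread p1([&] {
        x.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);  // "slfence"
        r1 = y.load(std::memory_order_relaxed);                // read y
    });
    std::thread p2([&] {
        y.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);  // "slfence"
        r2 = x.load(std::memory_order_relaxed);                // read x
    });
    p1.join();
    p2.join();
    return outcome{r1, r2};
}

// Repeats the experiment; returns false if 0,0 is ever observed.
bool never_both_zero(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        outcome o = run_once();
        if (o.r1 == 0 && o.r2 == 0) return false;
    }
    return true;
}
```

Removing both fences makes 0,0 a permitted result, and IA32 can actually produce it because a store may be delayed past a later load.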
>
> Because there's no way for software or hardware to reliably associate any
> particular data with a particular mutex, in practice any thread that locks
> any mutex will have a view of (all) memory consistent with the view of the
> last thread to unlock any mutex (at the time of that unlock). This is
> essentially as if you replaced the mutex lock and unlock operations by
> general (full) memory barriers.
Butenhof is wrong.
>
> This suggests that replacement of an explicit acquire operation by mutex
> acquisition replaces an acquire by a full fence, hence *increasing* the
> memory visibility guarantees.
He doesn't seem to "get it", unfortunately.
http://groups.google.com/groups?threadm=40ae0044%40usenet01.boi.hp.com
>
> So, as I said, I don't see your point.
You're in good company. ;-)
[ ... atomic<> ... ]
> Also, again, what is the meaning of atomic<>? If I had to guess, I'd guess
> that all operations on the type atomic<T> are guaranteed to be seen as
> atomic, but I'd prefer not to guess.
Yes, atomic operations with a whole bunch of reordering constraints
to ensure proper memory visibility. Like in
template<typename numeric>
class refcount<numeric, basic> {
public:
  enum may_not_store_min_t { may_not_store_min };
private:
  atomic<numeric> m_value;
  template<typename min_msync, typename update_msync>
  bool decrement(min_msync mms, update_msync ums) throw() {
    numeric val;
    do {
      val = m_value.load(msync::none);
      assert(min() < val);
      if (min() + 1 == val) {
        m_value.store(min(), mms);
        return false;
      }
    } while (!m_value.attempt_update(val, val - 1, ums));
    return true;
  }
  /* ... */
};

class sp_counted_base {
  /* ... */
  refcount<std::size_t, basic> use_count_;
  refcount<std::size_t, basic> self_count_;
  /* ... */
public:
  /* ... */
  sp_counted_base() : use_count_(1), self_count_(1) { }
  std::size_t use_count() const throw() {
    return use_count_.get();
  }
  void add_ref() throw() {
    use_count_.increment();
  }
  bool lock() throw() {
    return use_count_.increment_if_not_min();
  }
  void weak_add_ref() throw() {
    self_count_.increment();
  }
  void weak_release() throw() {
    if (!self_count_.decrement(msync::acq))
      destruct();
  }
  void release() throw() {
    if (!use_count_.decrement()) {
      dispose();
      if (!self_count_.decrement(msync::rel))
        destruct();
    }
  } /* ... */
};
http://www.terekhov.de/pthread_refcount_t/experimental/refcount.cpp
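For comparison, a much simplified counter in C++11. Note the swapped-in technique: it uses the common fetch_sub idiom (release on every decrement, acquire fence before cleanup) instead of the load/attempt_update loop above, and this `refcount` class is a hypothetical reduction, not the code at the URL.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

class refcount {
    std::atomic<std::size_t> m_value;
public:
    explicit refcount(std::size_t initial) : m_value(initial) {}

    std::size_t get() const noexcept {
        return m_value.load(std::memory_order_relaxed);
    }

    // Taking another reference needs no ordering (msync::none above).
    void increment() noexcept {
        m_value.fetch_add(1, std::memory_order_relaxed);
    }

    // Returns false when this call dropped the last reference, in which
    // case the caller is expected to destroy the managed object.
    bool decrement() noexcept {
        // Release: this owner's writes to the object happen before the
        // count visibly drops.
        std::size_t previous =
            m_value.fetch_sub(1, std::memory_order_release);
        assert(previous > 0);
        if (previous == 1) {
            // Acquire: the last owner sees every earlier owner's writes
            // before it starts cleaning up.
            std::atomic_thread_fence(std::memory_order_acquire);
            return false;
        }
        return true;
    }
};
```

The point carried over from the original is that only the final decrement needs acquire semantics, while the non-final ones need release.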
regards,
alexander.

