This moves the implementation details of atomic wait/notify functions
into the library, so that only a small API surface is exposed to users.
This also fixes some race conditions present in the design for proxied
waits:
- The stores to _M_ver in __notify_impl must be protected by the mutex,
and the loads from _M_ver in __wait_impl and __wait_until_impl to
check for changes must also be protected by the mutex. This ensures
that checking _M_ver for updates and waiting on the condition_variable
happen atomically. Otherwise it's possible to have: the _M_ver == old
check happens-before {++_M_ver; cv.notify();}, which happens-before
cv.wait().
That scenario results in a missed notification, and so the waiting
function never wakes. This wasn't a problem for Linux, because the
futex wait call re-checks the _M_ver value before sleeping, so the
increment cannot interleave between the check and the wait.
- The initial load from _M_ver that reads the 'old' value used for the
_M_ver == old checks must be done before loading and checking the
value of the atomic variable. Otherwise it's possible to have:
var.load() == val happens-before {++_M_ver; _M_cv.notify_all();}
happens-before {old = _M_ver; lock mutex; if (_M_ver == old) cv.wait}.
This results in the waiting thread seeing the already-incremented
value of _M_ver and then waiting for it to change again, which doesn't
happen. This race was present even for Linux, because using a futex
instead of mutex+condvar doesn't prevent the increment from happening
before the waiting thread checks for the increment. (Both fixes are
illustrated by the sketch after this list.)
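
A minimal sketch of the corrected proxied wait/notify protocol, using
illustrative names rather than the real ones in atomic.cc, with a plain
counter standing in for _M_ver:

  #include <condition_variable>
  #include <cstdint>
  #include <mutex>

  struct proxy_state
  {
    std::mutex mtx;
    std::condition_variable cv;
    std::uint32_t ver = 0;  // stands in for _M_ver

    // Waiter: 'old' must have been read from ver before the caller
    // re-checked the atomic variable (second race), and the ver == old
    // test must happen under the same lock as cv.wait (first race).
    void wait(std::uint32_t old)
    {
      std::unique_lock<std::mutex> lock(mtx);
      cv.wait(lock, [&] { return ver != old; });
    }

    // Notifier: increment under the lock so it cannot interleave
    // between the waiter's ver == old check and its cv.wait.
    void notify()
    {
      { std::lock_guard<std::mutex> lock(mtx); ++ver; }
      cv.notify_all();
    }
  };
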
The first race can be solved locally in the waiting and notifying
functions, by acquiring the mutex lock earlier in those functions. The
second race cannot be fixed locally, because the load of the atomic
variable and the check for updates to _M_ver happen in different
functions (one in a function template in the headers and one in the
library). We do have an _M_old data member in the __wait_args_base
struct which was previously only used for non-proxy waits using a futex.
We can add a new entry point into the library to look up the waitable
state for the address and then load its _M_ver into the _M_old member.
This allows the inline function template to ensure that loading _M_ver
happens-before testing whether the atomic variable has been changed, so
that we can reliably tell if _M_ver changes after we've already tested
the atomic variable. This isn't 100% reliable, because _M_ver could be
incremented 2^32 times and wrap back to the same value, but that seems
unlikely in practice. If/when we support waiting on user-defined
predicates (which could execute long enough for _M_ver to wrap) we might
want to always wait with a timeout, so that we get a chance to re-check
the predicate even in the rare case that _M_ver wraps.
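
A sketch of the ordering that the new entry point makes possible in the
inline wait loop. The two free functions here are hypothetical
stand-ins for the library entry points (the real code routes everything
through __wait_args_base, _M_load_proxy_wait_val and __wait_impl):

  #include <atomic>
  #include <cstdint>

  // Hypothetical stand-ins for the library entry points.
  std::uint32_t load_proxy_version(const void* addr);  // reads _M_ver
  // Blocks while _M_ver == old for the state belonging to addr.
  void proxy_wait(const void* addr, std::uint32_t old);

  template<typename T>
  void wait_sketch(const std::atomic<T>& var, T unwanted)
  {
    for (;;)
    {
      // Sample the proxy counter first ...
      std::uint32_t old_ver = load_proxy_version(&var);
      // ... then test the atomic variable ...
      if (var.load(std::memory_order_acquire) != unwanted)
        return;
      // ... so a notification arriving after the test is guaranteed
      // to make _M_ver != old_ver, and the wait below cannot block
      // on a stale counter value.
      proxy_wait(&var, old_ver);
    }
  }
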
Another change is to make the __wait_until_impl function take a
__wait_clock_t::duration instead of a __wait_clock_t::time_point, so
that it doesn't depend on the symbol name of chrono::steady_clock.
Inside the library the duration can be converted back to a time_point
for the clock. This would potentially allow using a different
clock, if we made a different __abi_version in the __wait_args imply
waiting with a different clock.
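
For reference, the conversion is just a round trip through the clock's
epoch (assuming __wait_clock_t is chrono::steady_clock):

  #include <chrono>

  using wait_clock = std::chrono::steady_clock;

  // Header side: pass only the duration, so the library function's
  // mangled name doesn't mention steady_clock.
  wait_clock::duration to_duration(wait_clock::time_point tp)
  { return tp.time_since_epoch(); }

  // Library side: rebuild the deadline as a time_point of the clock.
  wait_clock::time_point to_deadline(wait_clock::duration d)
  { return wait_clock::time_point(d); }
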
This also adds a void* member to the __wait_args_base structure, so
that __wait_impl can cache the __waitable_state* there the first time
it's looked up for a given wait, instead of retrieving it again on
every iteration of the wait loop. This requires passing the
__wait_args_base structure by non-const reference.
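
A sketch of the caching, with illustrative types standing in for
__wait_args_base::_M_wait_state, __waitable_state and the library's
per-address lookup:

  struct wait_args_sketch
  {
    void* wait_state = nullptr;  // cached state pointer, initially null
  };

  struct waitable_state_sketch { /* mutex, condvar, version, ... */ };

  // Hypothetical per-address lookup done inside the library.
  waitable_state_sketch& state_for(const void* addr);

  waitable_state_sketch&
  get_state(const void* addr, wait_args_sketch& args)
  {
    if (!args.wait_state)                  // first iteration only
      args.wait_state = &state_for(addr);  // look the state up once
    return *static_cast<waitable_state_sketch*>(args.wait_state);
  }
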
The __waitable_state::_S_track function can be removed now that it's all
internal to the library, and namespace-scope RAII types are added for
locking and for tracking contention.
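
For example, a namespace-scope guard of roughly this shape can take the
place of _S_track; this is a sketch, not the actual type in atomic.cc,
and the real code chooses its memory orderings deliberately:

  #include <atomic>

  // Bumps a waiter count for the lifetime of a wait, so that notifiers
  // can skip the wake-up entirely when nobody is waiting.
  struct contention_tracker
  {
    explicit contention_tracker(std::atomic<unsigned>& counter)
    : _M_counter(counter)
    { _M_counter.fetch_add(1); }

    ~contention_tracker()
    { _M_counter.fetch_sub(1); }

    std::atomic<unsigned>& _M_counter;
  };
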
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver: Add new symbol version and exports.
* include/bits/atomic_timed_wait.h (__platform_wait_until): Move
to atomic.cc.
(__cond_wait_until, __spin_until_impl): Likewise.
(__wait_until_impl): Likewise. Change __wait_args_base parameter
to non-const reference and change third parameter to
__wait_clock_t::duration.
(__wait_until): Change __wait_args_base parameter to non-const
reference. Call time_since_epoch() to get the duration from the
time_point.
(__wait_for): Change __wait_args_base parameter to non-const
reference.
(__atomic_wait_address_until): Call _M_prep_for_wait_on on args.
(__atomic_wait_address_for): Likewise.
(__atomic_wait_address_until_v): Qualify call to avoid ADL. Do
not forward __vfn.
* include/bits/atomic_wait.h (__platform_wait_uses_type): Use
alignof(T) not alignof(T*).
(__futex_wait_flags, __platform_wait, __platform_notify)
(__waitable_state, __spin_impl, __notify_impl): Move to
atomic.cc.
(__wait_impl): Likewise. Change __wait_args_base parameter to
non-const reference.
(__wait_args_base::_M_wait_state): New data member.
(__wait_args_base::_M_prep_for_wait_on): New member function.
(__wait_args_base::_M_load_proxy_wait_val): New member
function.
(__wait_args_base::_S_memory_order_for): Remove member function.
(__atomic_wait_address): Call _M_prep_for_wait_on on args.
(__atomic_wait_address_v): Qualify call to avoid ADL.
* src/c++20/Makefile.am: Add new file.
* src/c++20/Makefile.in: Regenerate.
* src/c++20/atomic.cc: New file.
* testsuite/17_intro/headers/c++1998/49745.cc: Remove XFAIL for
C++20 and later.
* testsuite/29_atomics/atomic/wait_notify/100334.cc: Remove use
of internal implementation details.
* testsuite/util/testsuite_abi.cc: Add GLIBCXX_3.4.35 version.