atomic: fix load and store for armv7 and higher

System Internals / PulseAudio - Thomas Hutschenreuther - 11 June 2019 16:04 EDT

The original atomic implementation in PulseAudio, which was based on libatomic, stated that its intent was to use full memory barriers.

According to [1], the load and store implementation based on gcc builtins matches sequentially consistent (i.e. full memory barrier) load and store ordering only on x86.

I observed random crashes in client applications using the memfd srbchannel transport on an armv8 aarch64 platform (Cortex-A57). In all of those crashes, the first read on the pstream descriptor (the size field) was wrong and looked like it contained stale data. I boiled the relevant parts of the srbchannel implementation down to a simple test case and could observe random test failures. I therefore concluded that the atomic implementation was broken on armv8 with respect to cross-CPU memory access ordering.
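To illustrate the kind of ordering such a test exercises, here is a minimal message-passing sketch. It is not the test added in src/tests/atomic-test.c; the names demo_channel, demo_reader and demo_run_message_passing are hypothetical, and the use of pthreads and __ATOMIC_SEQ_CST is an assumption. The writer fills in a plain payload and then publishes a sequence number; the reader must never see the new sequence number together with an old payload.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define DEMO_ROUNDS 10000

/* Hypothetical stand-in for a shared ring, not the real srbchannel layout. */
struct demo_channel {
    int payload;      /* plain data, written before it is published */
    int seq;          /* publication counter, accessed atomically */
    int ack;          /* consumption counter, accessed atomically */
    int stale_reads;  /* rounds in which the reader saw an old payload */
};

static void *demo_reader(void *arg) {
    struct demo_channel *c = arg;

    for (int i = 1; i <= DEMO_ROUNDS; i++) {
        /* Wait until round i has been published. */
        while (__atomic_load_n(&c->seq, __ATOMIC_SEQ_CST) != i)
            sched_yield();

        /* With correct ordering the payload of round i must be visible. */
        if (c->payload != i)
            c->stale_reads++;

        /* Tell the writer that the slot may be reused. */
        __atomic_store_n(&c->ack, i, __ATOMIC_SEQ_CST);
    }
    return NULL;
}

/* Runs the ping-pong and returns the number of stale payload reads. */
int demo_run_message_passing(void) {
    struct demo_channel c = { 0, 0, 0, 0 };
    pthread_t reader;
    int rc = pthread_create(&reader, NULL, demo_reader, &c);

    assert(rc == 0);
    (void) rc;

    for (int i = 1; i <= DEMO_ROUNDS; i++) {
        c.payload = i;                                   /* write data first */
        __atomic_store_n(&c.seq, i, __ATOMIC_SEQ_CST);   /* then publish it */

        while (__atomic_load_n(&c.ack, __ATOMIC_SEQ_CST) != i)
            sched_yield();
    }

    pthread_join(reader, NULL);
    return c.stale_reads;
}
```

With sequentially consistent loads and stores the function should return 0. If the load and store are replaced by accesses that carry no ordering guarantee, a weakly ordered CPU is allowed to produce a non-zero count.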

In order to come up with a minimal fix, I used the newer __atomic_load_n/__atomic_store_n builtins from gcc.
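A minimal sketch of what load and store wrappers built on these builtins can look like is shown here. The type demo_atomic_t and the function names are hypothetical and are not the definitions from src/pulsecore/atomic.h. The choice of __ATOMIC_SEQ_CST is an assumption that follows from the stated intent of full memory barrier semantics.

```c
#include <assert.h>

/* Hypothetical wrapper type, for illustration only. */
typedef struct demo_atomic {
    volatile int value;
} demo_atomic_t;

/* Sequentially consistent load. */
static inline int demo_atomic_load(demo_atomic_t *a) {
    return __atomic_load_n(&a->value, __ATOMIC_SEQ_CST);
}

/* Sequentially consistent store. */
static inline void demo_atomic_store(demo_atomic_t *a, int i) {
    __atomic_store_n(&a->value, i, __ATOMIC_SEQ_CST);
}

/* Single-threaded round trip: checks the value semantics only,
 * it cannot demonstrate cross-CPU ordering. */
static inline int demo_atomic_roundtrip(int i) {
    demo_atomic_t a = { 0 };

    demo_atomic_store(&a, i);
    assert(demo_atomic_load(&a) == i);
    return demo_atomic_load(&a);
}
```

Which instructions the compiler emits for these calls depends on the target and the compiler version, so it is worth checking the generated assembly on the platform in question.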

With aarch64-linux-gnu-gcc (Linaro GCC 7.3-2018.05) 7.3.1 20180425 they compile to ldar and stlxr on arm64, which is correct according to [1] and [2].

The other atomic operations, which are based on the __sync builtins, don't need to be touched since they already imply a full memory barrier.
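For reference, a small sketch of two of the legacy __sync builtins follows. The function name demo_sync_ops is hypothetical, and the example only checks the returned values in a single thread; the full barrier property itself comes from the gcc documentation for these builtins and is not something this code can demonstrate.

```c
#include <assert.h>

/* Exercises two read-modify-write builtins and returns the final value. */
int demo_sync_ops(void) {
    int counter = 5;

    /* Atomically adds 3 and returns the value held before the addition. */
    int old = __sync_fetch_and_add(&counter, 3);
    assert(old == 5 && counter == 8);
    (void) old;

    /* Atomically replaces 8 with 1; returns non-zero because the
     * comparison with the expected old value succeeded. */
    int swapped = __sync_bool_compare_and_swap(&counter, 8, 1);
    assert(swapped && counter == 1);
    (void) swapped;

    return counter;
}
```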

[1] [2]

d4ff4adce atomic: fix load and store for armv7 and higher | 12 +++++ | 11 ++++
src/ | 8 ++-
src/pulsecore/atomic.h | 33 ++++++++++++
src/tests/atomic-test.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 198 insertions(+), 1 deletion(-)


