The functions in util/atomic.* contain a fair amount of x86 assembly, which is less than ideal on non-x86 hardware (like the 96-core arm64 machine I am playing with). It would be nice to use some kind of cross-platform atomic operations instead, either the ones provided in std::atomic or a Linux kernel vDSO.
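
As a rough illustration of the std::atomic route, here is a minimal sketch of what portable replacements could look like. The function names `atomic_fetch_add` and `atomic_compare_and_swap` are hypothetical stand-ins, not the actual names or signatures in util/atomic.*, and the 32-bit integer type is an assumption.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for helpers in util/atomic.*; the real names,
// types, and signatures in that file may differ.

// Atomically adds `delta` to `v` and returns the value held before the add.
inline std::int32_t atomic_fetch_add(std::atomic<std::int32_t>& v,
                                     std::int32_t delta) {
    return v.fetch_add(delta, std::memory_order_seq_cst);
}

// Atomically sets `v` to `desired` if it currently equals `expected`.
// Returns true if the swap happened. `expected` is taken by value, so the
// observed value that compare_exchange_strong writes back on failure is
// discarded here.
inline bool atomic_compare_and_swap(std::atomic<std::int32_t>& v,
                                    std::int32_t expected,
                                    std::int32_t desired) {
    return v.compare_exchange_strong(expected, desired,
                                     std::memory_order_seq_cst);
}
```

With wrappers like these, the compiler picks the instructions for each target, so the same source should build on both x86 and arm64 without hand-written assembly. Sequentially consistent ordering is used as a conservative default; whether a weaker ordering would be safe depends on how the existing callers use these functions.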