
Commit e98affa

wathsalav authored and tmonjalo committed
ring: establish safe partial ordering in default mode
The function __rte_ring_headtail_move_head() assumes that the barrier
(fence) between the load of the head and the load-acquire of the
opposing tail guarantees the following: if a first thread reads tail
and then writes head, and a second thread reads the new value of head
and then reads tail, the second thread should observe the same (or a
later) value of tail.

This assumption is incorrect under the C11 memory model. If the
barrier (fence) is intended to establish a total ordering of ring
operations, it fails to do so. Instead, the current implementation
only enforces a partial ordering, which can lead to unsafe
interleavings. In particular, some partial orders can cause underflows
in free slot or available element computations, potentially resulting
in data corruption.

The issue manifests when a CPU first acts as a producer and later as a
consumer. In this scenario, the barrier assumption may fail when
another core takes the consumer role. A Herd7 litmus test in C11 can
demonstrate this violation.

The problem has not been widely observed so far because:
 (a) on strong memory models (e.g., x86-64) the assumption holds, and
 (b) on relaxed models with RCsc semantics the ordering is still
     strong enough to prevent hazards.
The problem becomes visible only on weaker models, when load-acquire
is implemented with RCpc semantics (e.g. some AArch64 CPUs which
support the LDAPR and LDAPUR instructions).

Three possible solutions exist:

 1. Strengthen ordering by upgrading release/acquire semantics to
    sequential consistency. This requires using seq-cst for stores,
    loads, and CAS operations. However, this approach introduces a
    significant performance penalty on relaxed-memory architectures.

 2. Establish a safe partial order by enforcing a pair-wise
    happens-before relationship between threads of the same role,
    by converting the CAS and the preceding load of the head to
    release and acquire respectively. This approach makes the original
    barrier assumption unnecessary and allows its removal.

 3. Retain partial ordering but ensure that only safe partial orders
    are committed. This can be done by detecting underflow conditions
    (producer < consumer) and quashing the update in such cases. This
    approach also makes the original barrier assumption unnecessary
    and allows its removal.

This patch implements solution (2) to preserve the “enqueue always
succeeds” contract expected by dependent libraries (e.g., mempool).
While solution (3) offers higher performance, adopting it now would
break that assumption.

Fixes: 49594a6 ("ring/c11: relax ordering for load and store of the head")
Cc: [email protected]

Signed-off-by: Wathsala Vithanage <[email protected]>
Signed-off-by: Ola Liljedahl <[email protected]>
Reviewed-by: Honnappa Nagarahalli <[email protected]>
Reviewed-by: Dhruv Tripathi <[email protected]>
Acked-by: Konstantin Ananyev <[email protected]>
Tested-by: Konstantin Ananyev <[email protected]>
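For illustration only, the following is a minimal C11 sketch of the scenario
described above. It is not the Herd7 litmus test referenced in the commit
message; the stand-alone head/tail variables and the names tail_writer,
first_mover and second_mover are illustrative assumptions. It shows the
outcome the old code implicitly treated as impossible, and why the
release/acquire pairing of solution (2) rules it out.

#include <stdatomic.h>

/* Simplified stand-ins for one head/tail pair of the ring. */
static atomic_uint head;
static atomic_uint tail;

/* Opposing-role thread: publishes a new tail, as __rte_ring_update_tail()
 * does with its store-release (edge R0 in the patch). */
static void tail_writer(void)
{
	atomic_store_explicit(&tail, 1, memory_order_release);
}

/* First thread of a given role: observes the new tail, then moves head.
 * Pre-patch, the head update used relaxed ordering. */
static void first_mover(void)
{
	if (atomic_load_explicit(&tail, memory_order_acquire) == 1)
		atomic_store_explicit(&head, 1, memory_order_relaxed);
}

/* Later thread of the same role: reads head, then the opposing tail,
 * mimicking the old relaxed head load + acquire fence + load-acquire. */
static void second_mover(unsigned *h, unsigned *t)
{
	*h = atomic_load_explicit(&head, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	*t = atomic_load_explicit(&tail, memory_order_acquire);
}

/*
 * Outcome the ring code assumed impossible: *h == 1 && *t == 0.
 * The C11 model allows it with the orderings above, because reading
 * head == 1 does not synchronize with first_mover, so tail_writer's
 * store need not be visible to second_mover yet.  With the patch, the
 * head update is a release (R1) and the head load an acquire (A0), so
 * *h == 1 forces *t == 1.
 */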
1 parent 8357af1 commit e98affa


lib/ring/rte_ring_c11_pvt.h

Lines changed: 29 additions & 8 deletions
@@ -36,6 +36,11 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
 		rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
 			rte_memory_order_relaxed);
 
+	/*
+	 * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
+	 * Ensures that memory effects by this thread on the ring elements array
+	 * are observed by a different thread of the other type.
+	 */
 	rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
 }
 
@@ -77,17 +82,24 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
 	int success;
 	unsigned int max = n;
 
+	/*
+	 * A0: Establishes a synchronizing edge with R1.
+	 * Ensures that this thread observes the same value
+	 * of stail observed by the thread that updated
+	 * d->head.
+	 * If not, an unsafe partial order may ensue.
+	 */
 	*old_head = rte_atomic_load_explicit(&d->head,
-			rte_memory_order_relaxed);
+			rte_memory_order_acquire);
 	do {
 		/* Reset n to the initial burst count */
 		n = max;
 
-		/* Ensure the head is read before tail */
-		rte_atomic_thread_fence(rte_memory_order_acquire);
-
-		/* load-acquire synchronize with store-release of ht->tail
-		 * in update_tail.
+		/*
+		 * A1: Establishes a synchronizing edge with R0.
+		 * Ensures that the other thread's memory effects on
+		 * the ring elements array are observed by the time
+		 * this thread observes its tail update.
 		 */
 		stail = rte_atomic_load_explicit(&s->tail,
 				rte_memory_order_acquire);
@@ -113,10 +125,19 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
 			success = 1;
 		} else
 			/* on failure, *old_head is updated */
+			/*
+			 * R1/A2.
+			 * R1: Establishes a synchronizing edge with A0 of a
+			 * different thread.
+			 * A2: Establishes a synchronizing edge with R1 of a
+			 * different thread to observe the same value of stail
+			 * observed by that thread on CAS failure (to retry
+			 * with an updated *old_head).
+			 */
 			success = rte_atomic_compare_exchange_strong_explicit(
 					&d->head, old_head, *new_head,
-					rte_memory_order_relaxed,
-					rte_memory_order_relaxed);
+					rte_memory_order_release,
+					rte_memory_order_acquire);
 	} while (unlikely(success == 0));
 	return n;
 }
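As context for “default mode” in the subject line, here is a hedged usage
sketch using the standard rte_ring API. It is not part of this commit; the
ring name, size, and error handling are illustrative, and it assumes
rte_eal_init() has already run. A ring created with flags == 0 uses the
default multi-producer/multi-consumer mode, which is the path exercising the
__rte_ring_headtail_move_head() and __rte_ring_update_tail() code changed
above.

#include <rte_ring.h>

static struct rte_ring *r;

/* Create a ring in the default mode (flags == 0: multi-producer enqueue,
 * multi-consumer dequeue). */
static int setup(void)
{
	r = rte_ring_create("mpmc_ring", 1024, SOCKET_ID_ANY, 0);
	return r != NULL ? 0 : -1;
}

/* Enqueue moves prod.head (the patched CAS path) and then prod.tail (R0). */
static int produce(void *obj)
{
	return rte_ring_enqueue(r, obj);
}

/* Dequeue moves cons.head and cons.tail symmetrically. */
static int consume(void **obj)
{
	return rte_ring_dequeue(r, obj);
}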
