Commit 126d4e2

vipinamd authored and ferruhy committed
app/testpmd: interleave SSE SIMD
Interleaving SSE SIMD load, shuffle, and store helps to improve the
overall mac-swap Mpps for both RX and TX.

Test Result:
 * Platform: AMD EPYC 9554 @3.1GHz, no boost
 * Test scenarios: TEST-PMD 64B IO vs MAC-SWAP
 * NIC: broadcom P2100: loopback 2*100Gbps

 <mode : Mpps Ingress: Mpps Egress>
 ------------------------------------------------
 - MAC-SWAP original: 45.75 : 43.8
 - MAC-SWAP register mod: 45.73 : 44.83
 - MAC-SWAP register+ofl mod: 46.36 : 44.79
 - MAC-SWAP register+ofl+interleave mod: 46.0 : 45.1

Signed-off-by: Vipin Varghese <[email protected]>
Acked-by: Ferruh Yigit <[email protected]>
1 parent 87e2509 commit 126d4e2

app/test-pmd/macswap_sse.h

Lines changed: 7 additions & 5 deletions
@@ -52,23 +52,25 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
 		addr1 = _mm_loadu_si128((__m128i *)eth_hdr[1]);
 		mbuf_field_set(mb[1], ol_flags);
 
+		addr0 = _mm_shuffle_epi8(addr0, shfl_msk);
+
 		mb[2] = pkts[i++];
 		eth_hdr[2] = rte_pktmbuf_mtod(mb[2], struct rte_ether_hdr *);
 		addr2 = _mm_loadu_si128((__m128i *)eth_hdr[2]);
 		mbuf_field_set(mb[2], ol_flags);
 
+		addr1 = _mm_shuffle_epi8(addr1, shfl_msk);
+		_mm_storeu_si128((__m128i *)eth_hdr[0], addr0);
+
 		mb[3] = pkts[i++];
 		eth_hdr[3] = rte_pktmbuf_mtod(mb[3], struct rte_ether_hdr *);
 		addr3 = _mm_loadu_si128((__m128i *)eth_hdr[3]);
 		mbuf_field_set(mb[3], ol_flags);
 
-		addr0 = _mm_shuffle_epi8(addr0, shfl_msk);
-		addr1 = _mm_shuffle_epi8(addr1, shfl_msk);
 		addr2 = _mm_shuffle_epi8(addr2, shfl_msk);
-		addr3 = _mm_shuffle_epi8(addr3, shfl_msk);
-
-		_mm_storeu_si128((__m128i *)eth_hdr[0], addr0);
 		_mm_storeu_si128((__m128i *)eth_hdr[1], addr1);
+
+		addr3 = _mm_shuffle_epi8(addr3, shfl_msk);
 		_mm_storeu_si128((__m128i *)eth_hdr[2], addr2);
 		_mm_storeu_si128((__m128i *)eth_hdr[3], addr3);

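For readers who want to try the reordering outside testpmd, the following is a minimal standalone sketch of the same load/shuffle/store interleaving on four raw 16-byte header buffers. It is not the testpmd code: the function name macswap4_interleaved is hypothetical, the mbuf handling and mbuf_field_set() calls are left out, and the shuffle mask is an assumption (shfl_msk is defined outside the hunk shown above), built here so that bytes 0-5 and 6-11 trade places while bytes 12-15 stay put. The target("ssse3") attribute assumes GCC or Clang on x86-64; building with -mssse3 would be an alternative.

```c
#include <stdint.h>
#include <tmmintrin.h>	/* SSSE3: _mm_shuffle_epi8 */

/*
 * Hypothetical helper, not part of testpmd: swap the destination and
 * source MAC addresses of four Ethernet headers, issuing the loads,
 * shuffles and stores in the interleaved order used by the commit.
 * Each hdr[n] must point to at least 16 readable and writable bytes,
 * because the unaligned load and store always touch 16 bytes.
 */
__attribute__((target("ssse3")))
static void
macswap4_interleaved(uint8_t *hdr[4])
{
	/*
	 * Assumed mask (the real shfl_msk is not visible in the diff).
	 * _mm_set_epi8() lists bytes from 15 down to 0, so output bytes
	 * 0-5 take input bytes 6-11, output bytes 6-11 take input bytes
	 * 0-5, and bytes 12-15 are copied unchanged.
	 */
	const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
					      5, 4, 3, 2, 1, 0,
					      11, 10, 9, 8, 7, 6);
	__m128i addr0, addr1, addr2, addr3;

	addr0 = _mm_loadu_si128((const __m128i *)hdr[0]);
	addr1 = _mm_loadu_si128((const __m128i *)hdr[1]);

	/* Shuffle packet 0 before packet 2 is loaded. */
	addr0 = _mm_shuffle_epi8(addr0, shfl_msk);

	addr2 = _mm_loadu_si128((const __m128i *)hdr[2]);

	/* Shuffle packet 1 and store packet 0 before packet 3 is loaded. */
	addr1 = _mm_shuffle_epi8(addr1, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr[0], addr0);

	addr3 = _mm_loadu_si128((const __m128i *)hdr[3]);

	addr2 = _mm_shuffle_epi8(addr2, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr[1], addr1);

	addr3 = _mm_shuffle_epi8(addr3, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr[2], addr2);
	_mm_storeu_si128((__m128i *)hdr[3], addr3);
}
```

The result is byte-for-byte identical to doing all four loads, then all four shuffles, then all four stores; only the instruction order differs. Whether the interleaved order is faster depends on the CPU and compiler, so any gain should be measured on the target platform, as the commit message does.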