Commit c8339a2

Add inline to some small functions used in sse impl
This speeds up the criterion benchmarks by almost 2x.

I believe this is needed because e.g. Bytes::find is inlined and calls `find` generically, which in turn calls PackedCompareControl methods. The code calling those methods is therefore inlined into the calling crate, but the implementations of PackedCompareControl are not accessible to code in the calling crate, so they end up as actual function calls. These functions are _super_ simple, and inlining them helps a LOT, so adding `#[inline]` to them, which makes their implementations available to calling crates, has a huge effect.

This only showed up when moving to criterion because the nightly benchmarks were previously implemented in the library crate itself, so these functions were already eligible for inlining. The criterion results were actually more accurate to what callers of the crate would see!
1 parent 7800819 commit c8339a2
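
The reasoning in the commit message can be illustrated with a minimal, portable sketch. Everything named here (`Control`, `Needle`, `find`) is a hypothetical stand-in and not this crate's API: `u128` replaces `__m128i` so the example builds on any target, and a scalar loop replaces the SSE search. It only shows the shape of the problem: a generic function that gets instantiated in the caller's crate, calling small non-generic trait methods that need `#[inline]` to be reliably inlinable across the crate boundary (absent LTO; newer compilers may also inline some trivial functions on their own).

```rust
// Hypothetical stand-in for PackedCompareControl; u128 replaces __m128i.
pub trait Control {
    fn needle(&self) -> u128;
    fn needle_len(&self) -> i32;
}

// Hypothetical stand-in for a type like Bytes: up to 16 bytes to search for.
pub struct Needle {
    needle: u128,
    needle_len: i32,
}

impl Needle {
    pub fn new(bytes: &[u8]) -> Self {
        // Pack at most 16 bytes into a 128-bit value, zero padded.
        let mut buf = [0u8; 16];
        let len = bytes.len().min(16);
        buf[..len].copy_from_slice(&bytes[..len]);
        Needle {
            needle: u128::from_le_bytes(buf),
            needle_len: len as i32,
        }
    }
}

impl<'b> Control for &'b Needle {
    // These methods are not generic over types, so without #[inline] their
    // bodies are generally compiled only in the defining crate and reached
    // through a real call from downstream crates.
    #[inline]
    fn needle(&self) -> u128 {
        self.needle
    }

    #[inline]
    fn needle_len(&self) -> i32 {
        self.needle_len
    }
}

// Generic over C, so this function is instantiated in the calling crate.
// Scalar placeholder for the SIMD search: returns the index of the first
// haystack byte that matches any of the needle bytes.
#[inline]
pub fn find<C: Control>(control: C, haystack: &[u8]) -> Option<usize> {
    let needle = control.needle().to_le_bytes();
    let len = control.needle_len() as usize;
    haystack.iter().position(|b| needle[..len].contains(b))
}
```

Under these assumptions, a downstream crate would call something like `find(&Needle::new(b"<>&"), input)`. Whether the accessor calls actually disappear depends on the optimizer and build settings, so a benchmark run from outside the library crate, as the criterion benchmarks are, remains the way to confirm the effect.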

1 file changed: +6 -0 lines changed
src/simd.rs

Lines changed: 6 additions & 0 deletions
@@ -252,9 +252,12 @@ impl Bytes {
 }
 
 impl<'b> PackedCompareControl for &'b Bytes {
+    #[inline]
     fn needle(&self) -> __m128i {
         self.needle
     }
+
+    #[inline]
     fn needle_len(&self) -> i32 {
         self.needle_len
     }
@@ -312,9 +315,12 @@ impl<'a> ByteSubstring<'a> {
 }
 
 impl<'a, 'b> PackedCompareControl for &'b ByteSubstring<'a> {
+    #[inline]
     fn needle(&self) -> __m128i {
         self.needle
     }
+
+    #[inline]
     fn needle_len(&self) -> i32 {
         self.needle_len
     }
