Commit e3ca2bb

jqnatividad and claude committed
perf(frequency): hint UTF-8 failure as cold in ignore-case hot loop
In `ftables_weighted_internal` and `ftables_unweighted`, the per-cell `process_field` closures take an `if let Ok(s) = simdutf8::basic::from_utf8(field)` branch where the `Ok` arm dominates on real data and the `Err` arm is rare. Mark the four `else` arms with `core::hint::cold_path()` so LLVM keeps the hot UTF-8-success path contiguous in the instruction cache.

Benchmark on a 1M-row NYC 311 CSV (514 MB, 41 cols), hyperfine, 10 runs:

  qsv frequency --ignore-case
    baseline  4.399 ± 0.045 s
    coldpath  4.139 ± 0.093 s  → 1.06× faster

  qsv frequency --ignore-case --no-trim
    baseline  4.089 ± 0.090 s
    coldpath  4.053 ± 0.036 s  → noise

  qsv frequency (default, cache short-circuit)
    baseline  1.880 ± 0.028 s
    coldpath  1.864 ± 0.015 s  → noise (paths not exercised)

Outputs identical between builds.

The 6% gain is concentrated on the trim + ignore-case path because that hot body (lowercase + extend_from_slice + add_borrowed) is the largest of the closure variants, so isolating its icache layout has the most leverage.

MSRV 1.95 ≥ cold_path stabilization (1.92).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
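The speedup figures can be re-derived from the reported means. The sketch below recomputes them and applies a rough noise check. The `Timing` type, the helper names, and the rule "a difference smaller than the combined standard deviation counts as noise" are assumptions made for illustration; they are not part of qsv or hyperfine.

```rust
/// One benchmark result: mean and standard deviation, in seconds.
struct Timing {
    mean: f64,
    stddev: f64,
}

/// Ratio of baseline time to candidate time; values above 1.0 mean the candidate is faster.
fn speedup(baseline: &Timing, candidate: &Timing) -> f64 {
    baseline.mean / candidate.mean
}

/// Rough heuristic (an assumption, not hyperfine's own statistics): treat the
/// difference as noise when it is smaller than the combined standard deviation.
fn is_noise(baseline: &Timing, candidate: &Timing) -> bool {
    let combined = (baseline.stddev.powi(2) + candidate.stddev.powi(2)).sqrt();
    (baseline.mean - candidate.mean).abs() < combined
}

fn main() {
    // Figures copied from the commit message.
    let runs = [
        ("--ignore-case", Timing { mean: 4.399, stddev: 0.045 }, Timing { mean: 4.139, stddev: 0.093 }),
        ("--ignore-case --no-trim", Timing { mean: 4.089, stddev: 0.090 }, Timing { mean: 4.053, stddev: 0.036 }),
        ("default", Timing { mean: 1.880, stddev: 0.028 }, Timing { mean: 1.864, stddev: 0.015 }),
    ];
    for (name, baseline, candidate) in &runs {
        println!(
            "{name}: speedup {:.2}x, noise: {}",
            speedup(baseline, candidate),
            is_noise(baseline, candidate)
        );
    }
}
```

Under this heuristic only the `--ignore-case` run clears the noise threshold, which agrees with the labels in the commit message.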
1 parent da1202c commit e3ca2bb

1 file changed

Lines changed: 5 additions & 0 deletions

src/cmd/frequency.rs

@@ -240,6 +240,7 @@ Common options:
 CSV into memory using CONSERVATIVE heuristics.
 "#;
 
+use core::hint::cold_path;
 use std::{fs, io, str::FromStr, sync::OnceLock};
 
 use crossbeam_channel;
@@ -2711,6 +2712,7 @@ impl Args {
     field_buffer.extend_from_slice(string_buf.as_bytes());
     weighted_add(map, field_buffer, weight);
 } else {
+    cold_path();
     weighted_add(map, field, weight);
 }
 }
@@ -2723,6 +2725,7 @@ impl Args {
     field_buffer.extend_from_slice(string_buf.as_bytes());
     weighted_add(map, field_buffer, weight);
 } else {
+    cold_path();
     weighted_add(map, trim_bs_whitespace(field), weight);
 }
 }
@@ -2858,6 +2861,7 @@ impl Args {
     field_buffer.extend_from_slice(string_buf.as_bytes());
     ftab.add_borrowed(field_buffer);
 } else {
+    cold_path();
     ftab.add_borrowed(field);
 }
 }
@@ -2870,6 +2874,7 @@ impl Args {
     field_buffer.extend_from_slice(string_buf.as_bytes());
     ftab.add_borrowed(field_buffer);
 } else {
+    cold_path();
     ftab.add_borrowed(trim_bs_whitespace(field));
 }
 }
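
The four changed branches share one pattern, which can be sketched in isolation. The example below is a simplified, hypothetical stand-in and not the qsv code: `count_field`, `count_all`, and the `HashMap` counter are invented for illustration, `std::str::from_utf8` replaces `simdutf8::basic::from_utf8` so that no external crate is needed, and an empty `#[cold]` function replaces `core::hint::cold_path()` so the sketch also builds on toolchains that predate that hint.

```rust
use std::collections::HashMap;

/// Stand-in for `core::hint::cold_path()`: calling a `#[cold]` function asks
/// the optimizer to treat the branch containing the call as unlikely.
#[cold]
fn cold_path_standin() {}

/// Hypothetical, simplified per-field step: valid UTF-8 is lowercased before
/// counting, anything else is counted as raw bytes.
fn count_field(counts: &mut HashMap<Vec<u8>, u64>, field_buffer: &mut Vec<u8>, field: &[u8]) {
    field_buffer.clear();
    if let Ok(s) = std::str::from_utf8(field) {
        // Hot path: on real data almost every field is valid UTF-8.
        field_buffer.extend_from_slice(s.to_lowercase().as_bytes());
    } else {
        // Cold path: emit the hint first, then fall back to the raw bytes.
        cold_path_standin();
        field_buffer.extend_from_slice(field);
    }
    *counts.entry(field_buffer.clone()).or_insert(0) += 1;
}

/// Counts every field in `fields`, reusing one scratch buffer across calls.
fn count_all(fields: &[&[u8]]) -> HashMap<Vec<u8>, u64> {
    let mut counts = HashMap::new();
    let mut field_buffer = Vec::new();
    for field in fields {
        count_field(&mut counts, &mut field_buffer, field);
    }
    counts
}

fn main() {
    let fields = [b"NYC".as_slice(), b"nyc".as_slice(), b"\xff\xfe".as_slice()];
    let counts = count_all(&fields);
    println!("{} distinct values", counts.len());
}
```

The hint does not change results; it only influences code layout, so outputs with and without it should be identical, as the commit message reports for the real change.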
