Performance regression in 1.84 for inline-able function in closure #137273

Open
@findepi

Description

Code

I tried this code:

#[macro_use]
extern crate criterion;

use criterion::Criterion;
use rand::Rng;

// #[inline(never)]
pub fn sum_all(v: &[i32]) -> i32 {
    v.iter().fold(0i32, |a, b| curried_add(a)(*b))
}
fn curried_add(a: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |b| a + b)
}

fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("bg");
    let vals = generate_array(100_000);
    group.bench_function("sum", |b| {
        b.iter(|| {
            let sum = sum_all(criterion::black_box(&vals));
            criterion::black_box(sum);
        })
    });
}

fn generate_array(len: usize) -> Vec<i32> {
    let mut rng = rand::rng();
    (0..len).map(|_| rng.random_range(0..100)).collect()
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

I expected to see this happen: ~26 µs reported by cargo bench

Instead, this happened: ~1.4 ms reported by cargo bench

Version it worked on

It most recently worked on: 1.83.0

$ cargo clean; caffeinate -sd cargo +1.84.1 bench -q -- --save-baseline 1.84.1 && caffeinate -sd cargo +1.83.0 bench -q -- --save-baseline 1.83.0
     Removed 1043 files, 228.6MiB total
Benchmarking bg/sum: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.7s, enable flat sampling, or reduce sample count to 60.
bg/sum                  time:   [1.3761 ms 1.4096 ms 1.4473 ms]
Found 17 outliers among 100 measurements (17.00%)
  1 (1.00%) low severe
  16 (16.00%) high severe

bg/sum                  time:   [26.609 µs 26.640 µs 26.679 µs]
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe
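
For a cross-check that does not depend on criterion or rand, here is a minimal std-only sketch of the same measurement. The deterministic `generate_array` and the `ITERS` constant are stand-ins I made up for illustration; they are not part of the benchmark above. I have not verified that the regression reproduces in this form, so treat it as a starting point rather than a confirmed reproducer.

```rust
use std::hint::black_box;
use std::time::Instant;

// Same functions as in the report above.
pub fn sum_all(v: &[i32]) -> i32 {
    v.iter().fold(0i32, |a, b| curried_add(a)(*b))
}

fn curried_add(a: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |b| a + b)
}

// Hypothetical replacement for the rand-based generator: a fixed pattern of
// values in 0..100, so the example needs no external crates.
fn generate_array(len: usize) -> Vec<i32> {
    (0..len).map(|i| ((i * 37 + 11) % 100) as i32).collect()
}

fn main() {
    // Assumed iteration count; adjust as needed.
    const ITERS: u32 = 200;
    let vals = generate_array(100_000);

    // Short warm-up before timing.
    for _ in 0..20 {
        black_box(sum_all(black_box(&vals)));
    }

    let start = Instant::now();
    for _ in 0..ITERS {
        black_box(sum_all(black_box(&vals)));
    }
    // Average wall-clock time per call to sum_all.
    let per_call = start.elapsed() / ITERS;
    println!("sum_all: {:?} per call", per_call);
}
```

Building this with optimizations under each toolchain (for example `rustc +1.83.0 -O` and `rustc +1.84.1 -O` when using rustup) should give a rough comparison, though it will be noisier than the criterion numbers.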

Version with regression

rustc --version --verbose:

rustc 1.84.1 (e71f9a9a9 2025-01-27)
binary: rustc
commit-hash: e71f9a9a98b0faf423844bf0ba7438f29dc27d58
commit-date: 2025-01-27
host: aarch64-apple-darwin
release: 1.84.1
LLVM version: 19.1.5

Backtrace

no compiler crash

#inline

If I uncomment the #[inline(never)] line above sum_all, performance returns to the 1.83.0 level under 1.84.
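
For reference, a minimal sketch of that workaround with the attribute applied, reduced to the two functions from the benchmark. The small `main` is only there to make the snippet runnable on its own and is not part of the original code.

```rust
// Keeping sum_all out of line is the workaround described above.
#[inline(never)]
pub fn sum_all(v: &[i32]) -> i32 {
    v.iter().fold(0i32, |a, b| curried_add(a)(*b))
}

// Unchanged from the report: returns a boxed closure that adds `a` to its argument.
fn curried_add(a: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |b| a + b)
}

fn main() {
    let vals = [1, 2, 3, 4];
    println!("{}", sum_all(&vals)); // prints 10
}
```

The attribute only changes where sum_all is compiled, not what it computes, so the result is identical with or without it.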

bisecting output

searched nightlies: from nightly-2024-10-13 to nightly-2024-11-22
regressed nightly: nightly-2024-11-04
searched commit range: b3f75cc...b8c8287
regressed commit: e3a918e

bisected with cargo-bisect-rustc v0.6.9

Host triple: aarch64-apple-darwin
Reproduce with:

cargo bisect-rustc --start 1.83.0 --end 1.84.1 --prompt -- bench -q

e3a918e comes from #132542

Metadata

Assignees

No one assigned

    Labels

    A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
    C-bug (Category: This is a bug.)
    C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
    P-medium (Medium priority)
    T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
    regression-untriaged (Untriaged performance or correctness regression.)