i32.clamp() suggested by Clippy produces worse code than i32.min().max() #141915

Open
@Shnatsel

In this code from image-webp, following the Clippy lint to replace .max(0).min(255) with .clamp(0, 255) on an i32 value causes a performance regression:

https://github.com/image-rs/image-webp/blob/93baf7de7df50977a1fcb3a0bb53036d4780bff3/src/vp8.rs#L994-L999

It's unfortunate that .min().max() and .clamp() do not compile to equivalent code, and doubly so when Clippy nags us to rewrite the code in a way that makes it slower.
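For readers who don't want to open the links, here is a minimal sketch of the two forms being compared. It is not the code from image-webp or from the godbolt samples; the function names and the loop shape are made up for illustration. Both helpers return the same value for every i32 input, so any difference is only in the generated assembly.

```rust
// Hypothetical helpers for illustration; not taken from image-webp or godbolt.

/// Saturates an i32 into the range 0..=255 using the max/min chain.
fn saturate_min_max(v: i32) -> u8 {
    v.max(0).min(255) as u8
}

/// The same operation written with clamp, the form Clippy suggests.
fn saturate_clamp(v: i32) -> u8 {
    v.clamp(0, 255) as u8
}

/// Applies a saturating conversion to every element of a slice.
/// A loop of roughly this shape is where a vectorization difference
/// would show up in the assembly.
fn convert(input: &[i32], output: &mut [u8], f: impl Fn(i32) -> u8) {
    for (out, &v) in output.iter_mut().zip(input) {
        *out = f(v);
    }
}

fn main() {
    let input = [-40, 0, 17, 255, 300];
    let mut a = [0u8; 5];
    let mut b = [0u8; 5];

    convert(&input, &mut a, saturate_min_max);
    convert(&input, &mut b, saturate_clamp);

    // Both variants agree on the result; only the codegen differs.
    assert_eq!(a, b);
    println!("{:?}", a);
}
```

To compare codegen, one would compile each variant separately with optimizations enabled and look at the loop body, as in the godbolt links.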

I've posted a self-contained sample that reproduces the issue on godbolt:

Generated assembly for .min().max(): https://rust.godbolt.org/z/zr7PK8vz3
Generated assembly for .clamp(): https://rust.godbolt.org/z/b898M45vo

You can see that the .clamp() version results in far more assembly; the vectorized loop has roughly twice as many instructions.

I've confirmed that the issue exists in rustc 1.75, 1.82, and 1.87, which is the latest as of this writing.

    Labels

    A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
    C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
    I-slow (Issue: Problems and improvements with respect to performance of generated code.)
    T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)