RKSimon (Collaborator) commented Jan 29, 2026

If we're just concatenating subvectors together to perform a saturated truncate, see if we can perform PACK on the subvectors directly instead. A 256-bit PACK will require a post-shuffle, because it packs each 128-bit lane independently, but this will typically fold away in later shuffle combining, and it's probably better than changing vector widths with concats.

Reference patch based off poor codegen identified in #169995
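
To illustrate why the 256-bit case needs a post-shuffle, here is a minimal scalar sketch in C++. It is not the code in this patch: it only models the per-128-bit-lane behaviour of a signed 32-bit to 16-bit PACK, and the names `sat16`, `pack256`, `fixLanes`, and `truncConcat` are hypothetical helpers made up for the example.

```cpp
#include <array>
#include <cstdint>

using V8i32 = std::array<int32_t, 8>;    // models a 256-bit vector of i32
using V16i16 = std::array<int16_t, 16>;  // models a 256-bit vector of i16

// Signed saturation of a 32-bit value to the 16-bit range.
static int16_t sat16(int32_t v) {
  if (v > INT16_MAX) return INT16_MAX;
  if (v < INT16_MIN) return INT16_MIN;
  return static_cast<int16_t>(v);
}

// Scalar model of a 256-bit signed PACK (hypothetical helper):
// each 128-bit lane packs 4 elements of a, then 4 elements of b.
static V16i16 pack256(const V8i32 &a, const V8i32 &b) {
  V16i16 r{};
  for (int lane = 0; lane < 2; ++lane) {
    for (int i = 0; i < 4; ++i) {
      r[lane * 8 + i] = sat16(a[lane * 4 + i]);
      r[lane * 8 + 4 + i] = sat16(b[lane * 4 + i]);
    }
  }
  return r;
}

// Post-shuffle (hypothetical helper): reorder the four 64-bit chunks
// of the packed result as {0, 2, 1, 3}.
static V16i16 fixLanes(const V16i16 &v) {
  static const int order[4] = {0, 2, 1, 3};
  V16i16 r{};
  for (int q = 0; q < 4; ++q)
    for (int i = 0; i < 4; ++i)
      r[q * 4 + i] = v[order[q] * 4 + i];
  return r;
}

// Reference result: saturating truncate of concat(a, b).
static V16i16 truncConcat(const V8i32 &a, const V8i32 &b) {
  V16i16 r{};
  for (int i = 0; i < 8; ++i) {
    r[i] = sat16(a[i]);
    r[8 + i] = sat16(b[i]);
  }
  return r;
}
```

In this model, `pack256(a, b)` alone interleaves the halves of `a` and `b`, while `fixLanes(pack256(a, b))` matches `truncConcat(a, b)`. The 128-bit case has a single lane, so no such reordering is needed there.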

… -> PACKSS/US(X,Y) folds.

RKSimon (Collaborator, Author) commented Jan 29, 2026

CC @folkertdev
