Remove `Simd: Index` #498

scottmcm · 2025-12-22T05:08:31Z

(Inspired by this conversation on Discord https://discord.com/channels/273534239310479360/592856094527848449/1452407229843247197)

This PR removes the `Simd<_, _>: Index` and `Simd<_, _>: IndexMut` trait implementations, and instead adds `get` and `set` methods.

IMHO this is a good idea: `v[1] = 2;` looks innocuous but is surprisingly bad for codegen, because it forces the vector to be in memory in order to return the `&T`, and there's no way to return a `T` from `Index`. You can still write `v.as_mut_array()[1] = 2;` if you want (that keeps working), but that extra speed bump is, I posit, a good thing to help push people to better methods.
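The constraint comes from the signature of `Index::index`, which must return `&Self::Output`. A minimal sketch of the difference, using a hypothetical `Lanes4` wrapper around a plain array (an assumption for illustration; it is not the real `Simd` type and not this PR's implementation):

```rust
use std::ops::Index;

/// Hypothetical stand-in for a 4-lane vector; NOT the real `Simd` type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes4(pub [i32; 4]);

impl Index<usize> for Lanes4 {
    type Output = i32;

    // The trait fixes the return type to a reference, so the element needs
    // an address, which means the whole value has to live in memory.
    fn index(&self, idx: usize) -> &i32 {
        &self.0[idx]
    }
}

impl Lanes4 {
    /// By-value accessor: takes `self` by value and returns a copy of the
    /// element, so no reference into the vector is ever handed out.
    pub fn get(self, idx: usize) -> i32 {
        self.0[idx]
    }
}
```

With this toy type, `v[1]` and `v.get(1)` read the same element; only the second avoids producing a `&i32`.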

What are those better methods? Well, this also adds `Simd::set` and `Simd::get` using the `simd_{insert,extract}_dyn` intrinsics internally. Thus instead of `let x = v[i];`, you'd write `let x = v.get(i);`. Doing that solves the optimization problem that the person was asking about on Discord, and seeing that it did fix it (and that there's no use of `simd_extract_dyn` in the current library) is why I went and made this PR 🙂

Discussion topics:

- Should `Index(Mut)` be left there for a bit, to help smooth the transition? Can we even mark impls as deprecated?
- What should the names be? I like `set`/`get` for shortness, but maybe something domain-specific would be better. I considered using `insert`/`extract`, but https://doc.rust-lang.org/std/simd/struct.Simd.html#method.extract already exists. (It could be `extract_vector` or something, though.)
- Should `set` be `&mut self` or `self`? I noticed that https://doc.rust-lang.org/std/simd/struct.Mask.html#method.set is `&mut self`, but using `&mut self` also forces it through memory in codegen, albeit in a way much easier for the optimizer to remove.
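To make the last question concrete, here is a rough sketch of the two candidate shapes on a hypothetical `Quad` wrapper around a plain array. Both the type and the method name `with` are made up for illustration; neither is part of this PR or of `core_simd`.

```rust
/// Hypothetical stand-in for a 4-lane vector; NOT the real `Simd` type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Quad(pub [f32; 4]);

impl Quad {
    /// In-place shape (`&mut self`): mutates the vector through a reference.
    pub fn set(&mut self, idx: usize, value: f32) {
        self.0[idx] = value;
    }

    /// By-value shape (`self`): consumes the vector and returns the
    /// updated copy, so no reference to it is needed at all.
    pub fn with(mut self, idx: usize, value: f32) -> Self {
        self.0[idx] = value;
        self
    }
}
```

The two produce the same result; the difference is only whether the caller writes `v.set(i, x);` or `v = v.with(i, x);`, and whether a reference to the vector appears in the signature.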

workingjubilee · 2025-12-22T05:18:45Z

Oh, I thought we already added get and set? I guess I forgot.

scottmcm · 2025-12-22T05:23:29Z

@workingjubilee I was surprised too, TBH :P

I was expecting to say "oh, don't use as_mut_array, use set" and then it didn't exist...

calebzulawski · 2025-12-22T05:28:24Z

Looks good to me. I think the `_dyn` variations of the intrinsics are "new", which is why we didn't originally implement it (we did implement it for masks, since they can't implement `Index`). Implementing `Index` was always a bit controversial--I expected the optimizer to handle it but if not, removing it is probably best.

crates/core_simd/src/vector.rs

scottmcm · 2025-12-22T05:38:05Z

> I think the `_dyn` variations of the intrinsics are "new"

They are, but helpfully they have fallback MIR, so they get a working implementation for free. That way, so long as they've been in nightly long enough, we don't need to worry about things like going and implementing them in cg_clif.

> I expected the optimizer to handle it but if not

I'm surprised it doesn't, for what seems like a simple case, but overall I think discouraging address versions of things is good anyway. We'll still have `AsRef` and `AsMut` and `as_array` and `as_mut_array`, so we're not blocking anything, just nudging the "wait, this is really not what simd is particularly good at" more strongly.

scottmcm · 2025-12-22T06:04:14Z

crates/core_simd/src/vector.rs

+    /// `idx` must be in-bounds (`idx < N`)
+    #[inline]
+    unsafe fn get_inner(self, idx: usize) -> T {
+        // FIXME: This is a workaround for a CI failure in #498


No idea what the problem was, but LLVM was very unhappy for some reason

rustc-LLVM ERROR: Cannot select: 0x7ff0c871e2a0: v16i8 = setcc 0x7ff0c871ee00, 0x7ff0ca14b000, setne:ch
  0x7ff0c871ee00: v8i16,ch = CopyFromReg 0x7ff0c871e8c0:1, Register:v8i16 %21
  0x7ff0ca14b000: v8i16 = xor 0x7ff0c8720a80, 0x7ff0c871db60
    0x7ff0c8720a80: v8i16,ch = CopyFromReg 0x7ff0c871e070:1, Register:v8i16 %11
    0x7ff0c871db60: v8i16 = splat_vector Constant:i32<-1>
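For readers following the diff excerpt: an `unsafe fn get_inner` with an `idx < N` precondition suggests the usual split between a checked public method and an unchecked helper. The sketch below shows that pattern on a hypothetical `Lanes<N>` array wrapper. It is an assumption about the general shape only; the body of the real `get_inner` (and its wasm workaround) is not shown in this thread and is not reproduced here.

```rust
/// Hypothetical stand-in for an N-lane vector; NOT the real `Simd` type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes<const N: usize>(pub [u8; N]);

impl<const N: usize> Lanes<N> {
    /// Safe entry point: panics if `idx` is out of bounds.
    pub fn get(self, idx: usize) -> u8 {
        assert!(idx < N, "lane index {} out of bounds for {} lanes", idx, N);
        // SAFETY: the assertion above guarantees `idx < N`.
        unsafe { self.get_inner(idx) }
    }

    /// # Safety
    /// `idx` must be in-bounds (`idx < N`).
    unsafe fn get_inner(self, idx: usize) -> u8 {
        // SAFETY: the caller promises `idx < N`.
        unsafe { *self.0.get_unchecked(idx) }
    }
}
```

The bounds check lives in exactly one place, and the unchecked helper can be swapped for a different lowering without touching the public contract.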

DouglasDwyer · 2025-12-22T14:50:17Z

Thanks for following up on our Discord conversation! Is there any chance that Simd::get and Simd::set could be marked as const functions? (Perhaps this would use const_eval_select if the intrinsics are not callable at compile time, or maybe the intrinsics should be const too?)
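As an illustration of what the request would enable, here is a sketch of by-value accessors declared as `const fn` on a hypothetical `ConstLanes` array wrapper. This is a toy type with a made-up `with` method; it says nothing about whether the real intrinsics are callable at compile time or whether `const_eval_select` would be needed.

```rust
/// Hypothetical stand-in for a 4-lane vector; NOT the real `Simd` type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ConstLanes(pub [u32; 4]);

impl ConstLanes {
    /// Reads one lane by value; usable in const contexts.
    pub const fn get(self, idx: usize) -> u32 {
        self.0[idx]
    }

    /// Returns a copy with one lane replaced; usable in const contexts.
    pub const fn with(mut self, idx: usize, value: u32) -> Self {
        self.0[idx] = value;
        self
    }
}

// Both calls below are evaluated at compile time.
pub const PATCHED: ConstLanes = ConstLanes([1, 2, 3, 4]).with(2, 30);
pub const THIRD: u32 = PATCHED.get(2);
```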

programmerjake · 2025-12-22T18:32:00Z

> Should set be &mut self or self? I noticed that https://doc.rust-lang.org/std/simd/struct.Mask.html#method.set is &mut self, but using &mut self also forces it through memory in codegen, albeit in a way much easier for the optimizer to remove.

imo &mut self is fine, we currently pass Simd through memory for all function arguments/returns anyway and llvm is capable of optimizing those out as long as the loads/stores are vector-typed.

sammysheep · 2025-12-22T18:48:21Z

I couldn't reach the Discord discussion, but we also use indexing a bit. However, it's usually in a scalar context where we are preparing or inspecting a data structure before or after any hot loop work.

Messing with individual elements at a time always seemed to be for convenience not performance to me, but switching to get/set makes sense if this is a footgun.

Happy to benchmark this branch on Apple M4 if you need more data with a relevant function.

programmerjake · 2025-12-22T18:51:24Z

> imo &mut self is fine, we currently pass Simd through memory for all function arguments/returns anyway and llvm is capable of optimizing those out as long as the loads/stores are vector-typed.

both passing by value and by &mut Simd produce almost identical LLVM IR: https://rust.godbolt.org/z/T9WdcWh33

workingjubilee · 2025-12-22T20:07:20Z

> > Should set be &mut self or self? I noticed that https://doc.rust-lang.org/std/simd/struct.Mask.html#method.set is &mut self, but using &mut self also forces it through memory in codegen, albeit in a way much easier for the optimizer to remove.
>
> imo &mut self is fine, we currently pass Simd through memory for all function arguments/returns anyway and llvm is capable of optimizing those out as long as the loads/stores are vector-typed.

I would rather we not rely on this optimization pass, as I have been looking into compiler improvements which remove this requirement.

programmerjake · 2025-12-22T21:31:05Z

> I would rather we not rely on this optimization pass, as I have been looking into compiler improvements which remove this requirement.

by that do you mean changing the function call ABI to pass values in registers, or changing rustc to not store temporary values in allocas, or something else? imo unless both of those changes are made, LLVM still has to optimize vector loads/stores into just SSA values so &mut self seems unlikely to be a problem

workingjubilee · 2026-01-13T18:59:48Z

Yes, both? Fucking hell.

workingjubilee · 2026-01-13T19:04:18Z

I don't see how anything that is "In My Opinion" matters anyways unless you have an opinion in the exact shape of certain optimization passes OR are willing to write an optimization pass in the shape of your opinions. We should be trying to get results not speculation.

programmerjake · 2026-01-13T19:23:31Z

> Yes, both?

Nice! I wasn't expecting rustc to have a major overhaul in how it represents variables and stuff, since afaik it currently just generates an alloca for everything and relies on llvm to convert to ssa form. In that case, we should probably pass by value wherever we can.

workingjubilee · 2026-01-14T21:35:07Z

Ah. We actually do represent scalars "directly" in LLVM IR.

I was primarily looking to make the function-crossing of vectors be represented directly in LLVM IR (where ABI compatible). That was my main goal, as it enables optimizations on its own. Adjusting things to represent vectors directly in the intra-function IR is also possible, easier than it might sound, and kind of almost required once you have vectors pass directly.

Basically just a lot of things have not been thoroughly examined and are only a few small touches away. The main thing is just that previous attempts to improve things have not been thoughtful in how they adjusted, more like blind stabs in the dark. They wind up requiring reverts too often. I'm trying to at least get a candle, maybe a few, shining on the issue, so that I can make something stickier.

scottmcm force-pushed the no-index-trait branch 2 times, most recently from 78e6c6b to 5be2886 Compare December 22, 2025 05:13

scottmcm force-pushed the no-index-trait branch from 5be2886 to 9cdcec4 Compare December 22, 2025 05:22

scottmcm force-pushed the no-index-trait branch 2 times, most recently from a35167d to 72cc10d Compare December 22, 2025 05:32

scottmcm added 2 commits December 21, 2025 22:00

Remove Simd: Index

f88957f

Workaround something going wrong with get/set in wasm

3bbbc4f

scottmcm force-pushed the no-index-trait branch from 72cc10d to 3bbbc4f Compare December 22, 2025 06:00


6 participants
