@tarcieri
As discussed in #824, adds a crate with the intent of it providing heuristics for selecting the optimal word size to use for a particular target CPU, which may differ from its address size. It's implemented as `macro_rules` to avoid a build script. Currently there's no ability to override it, though we could consider adding something like `cfg(cpubits = "64")`.
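To illustrate the general idea of picking a word size with `macro_rules` and `cfg` rather than a build script, here is a minimal sketch. The macro name `select_word!`, the `bits32`/`bits64` labels, and the `Word` alias are hypothetical and are not this crate's actual API; the selector simply treats 64-bit pointer widths and the WASM target family as 64-bit.

```rust
/// Hypothetical macro (not this crate's API): emits one of two groups of
/// items depending on a hard-coded `cfg` predicate.
macro_rules! select_word {
    (bits32 => { $($b32:item)* } bits64 => { $($b64:item)* }) => {
        $(
            // Everything that is not selected as 64-bit falls back to 32-bit.
            #[cfg(not(any(target_pointer_width = "64", target_family = "wasm")))]
            $b32
        )*
        $(
            // 64-bit address size, plus targets where 64-bit words are assumed faster.
            #[cfg(any(target_pointer_width = "64", target_family = "wasm"))]
            $b64
        )*
    };
}

select_word! {
    bits32 => {
        pub type Word = u32;
    }
    bits64 => {
        pub type Word = u64;
    }
}

/// Width in bits of the selected word type.
pub const WORD_BITS: u32 = Word::BITS;
```

Because the selection happens entirely at macro expansion time, downstream code can use `Word` without knowing which branch was taken.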
Alright, finally got around to this after so many years. I decided against going the build script route, at least for now, and implemented it entirely in terms of It has support for deciding between 16-bit, 32-bit, and 64-bit word sizes, with the ability to group 16-bit and 32-bit together, which is how e.g. It has support for specifying a
    #[enable_64bit(
        // `cfg` selector for 64-bit targets (implicitly `any`)
        target_family = "wasm",
    )]
Here's the "heuristic" part. This just includes the WASM target family for now, but I know we've had others requested in the past (e.g. ARMv7). See also: RustCrypto/crypto-bigint#973
Ideally, I think we'd use a benchmark-driven approach to decide which targets go here. The somewhat annoying part is that adding anything here is effectively a breaking change.
@jrose-signal random ping but long ago I think you had suggested trying 64-bit codegen on ARMv7. Am I remembering correctly?
Yes, we've been using 64-bit codegen for 32-bit Android for years, it performed better for curve25519-dalek on…well, at least one phone, I'm sure it's configuration-dependent in practice.
Opened a PR here, and hopefully gave a decent enough rationale. Thanks for confirming!
ARMv7 is one of the main architectures for which we've received requests for 64-bit overrides in the past (see discussion on #826). Though natively 32-bit, ARMv7 supports certain "doubleword" instructions which model 64-bit values as a pair of 32-bit registers, e.g. `ADDS`/`ADC` and `SUBS`/`SBC` for 2x32-bit addition/subtraction, as well as `UMULL`/`SMULL` for widening multiplication with 64-bit (2x32-bit) outputs. Many ARMv7 CPUs internally fetch 64 bits of instruction at once and can move 64 bits of data via `LDRD`/`STRD` in one cycle on optimized paths. Some high-performance ARMv7 CPUs internally combine the barrel shifter + ALU to speed multi-word shifts. If we use 64-bit implementations when targeting ARMv7, codegen is able to leverage these optimizations.
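As a rough illustration of the operations described above, the following sketch shows the kind of Rust source that a compiler targeting ARMv7 would typically be expected to lower to `UMULL` and `ADDS`/`ADC`. The function names are hypothetical, and the actual instructions emitted depend on the compiler version, optimization level, and exact target.

```rust
/// 32x32 -> 64-bit widening multiplication. On ARMv7 this is the shape of
/// operation that `UMULL` provides (two 32-bit output registers).
pub fn widening_mul(a: u32, b: u32) -> u64 {
    (a as u64) * (b as u64)
}

/// 64-bit addition with a carry-out flag. On a 32-bit target the `u64`
/// operands live in register pairs, which is what `ADDS`/`ADC` operate on.
pub fn add64(a: u64, b: u64) -> (u64, bool) {
    a.overflowing_add(b)
}

/// Splits a 64-bit value into its (high, low) 32-bit halves, mirroring the
/// "pair of 32-bit registers" model.
pub fn split(x: u64) -> (u32, u32) {
    ((x >> 32) as u32, x as u32)
}
```

Inspecting the generated assembly (for example with `--emit asm`) and benchmarking on real hardware is the only reliable way to confirm what a given toolchain produces.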
As discussed in #824, adds a crate with the intent of it providing heuristics for selecting whether 32-bit or 64-bit backends have optimal codegen for a given target, with optional overrides.
The intended use of this crate is in `build-dependencies`, where it can emit a `cfg` attribute (e.g. `--cfg cpubits="64"`) if one hasn't been explicitly specified already, and all gating on 32-bit vs 64-bit backends can simply use the `cfg` attribute.
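A minimal sketch of what the build-script side of that approach could look like is below. The helper names `select_cpubits` and `cargo_directive` are hypothetical, the heuristic (64-bit pointers or the WASM family select 64-bit) is only an assumption for illustration, and how an existing override would be detected is left open here as an `Option` parameter.

```rust
/// Hypothetical heuristic: returns "32" or "64" for a target, unless the
/// caller passes an explicit override, which always wins.
pub fn select_cpubits(
    pointer_width: &str,
    target_family: &str,
    user_override: Option<&str>,
) -> String {
    if let Some(bits) = user_override {
        return bits.to_string();
    }
    // Cargo joins multi-valued cfgs such as `target_family` with commas.
    let is_wasm = target_family.split(',').any(|f| f == "wasm");
    if pointer_width == "64" || is_wasm {
        "64".to_string()
    } else {
        "32".to_string()
    }
}

/// Formats the line a build script prints to set `cfg(cpubits = "...")`.
pub fn cargo_directive(bits: &str) -> String {
    format!("cargo:rustc-cfg=cpubits=\"{bits}\"")
}
```

In a `build.rs`, the inputs would come from the `CARGO_CFG_TARGET_POINTER_WIDTH` and `CARGO_CFG_TARGET_FAMILY` environment variables that Cargo sets for build scripts, and the result of `cargo_directive` would be printed with `println!`. Crate code could then gate backends on `#[cfg(cpubits = "64")]`.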