LUT-free Mini-Float Upcasting by ashvardanian · Pull Request #310 · ashvardanian/NumKong

ashvardanian · 2026-03-17T19:18:40Z

This patch explores LUT-free arithmetic upcasts from mini-float representations into F32. The current results are inconclusive and should be revisited together with a GEMM redesign that hoists the upcasts and fuses them with loads.

Closes #328

On x86 Sapphire Rapids cores, denormal values choke FMA:

- F32 slows from ~3cy to ~86cy (28x slower)
- F64 slows from ~3cy to ~104cy (34x slower)

That's why previous versions of this library suggested changing FTZ/DAZ settings to ensure high throughput. We now always use mixed-precision schemes that are safe even against such inputs. All F32 denormals become normal in F64 accumulators, and it's a similar story for other inputs.
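The claim that F32 denormals become normal once widened to F64 can be checked with a few lines of portable C. This is only an illustrative sketch: the helper names are hypothetical and are not part of NumKong, and it only looks at positive values.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the smallest positive F32 subnormal, 2^-149 (bit pattern 0x00000001). */
static float smallest_f32_subnormal(void) {
    uint32_t bits = 1u;
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* A positive F32 is subnormal when it lies strictly between 0 and FLT_MIN (2^-126). */
static int is_positive_subnormal_f32(float x) { return x > 0.0f && x < FLT_MIN; }

/* A positive F64 is normal when it is at least DBL_MIN (2^-1022) and finite. */
static int is_positive_normal_f64(double x) { return x >= DBL_MIN && x <= DBL_MAX; }
```

Because the F64 exponent range reaches down to 2^-1022, even the smallest F32 subnormal (2^-149) lands well inside the normal F64 range after widening, so an F64 accumulator never sees a denormal operand that originated as an F32 value.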

With denormal f32 intermediates now safe, replace the two-path integer-add + subnormal-LUT upcast with Giesen's magic-multiply trick across all backends (Haswell, Skylake, NEON, RVV, WASM, serial). The trick reinterprets the magnitude bits as a tiny f32 and multiplies by 2^(127-bias). That single float multiply correctly handles zero, subnormals, and normals in one instruction, eliminating per-format LUT constants and subnormal mask+blend logic (-169 lines net).
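A minimal scalar sketch of the magic-multiply upcast described above, assuming an 8-bit format with the sign in bit 7 (for example E4M3 with 3 mantissa bits and bias 7, or E5M2 with 2 mantissa bits and bias 15). The function names and signature are hypothetical and are not taken from the NumKong sources; the actual patch implements this per SIMD backend.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for bit-level reinterpretation without aliasing issues. */
static float f32_from_bits(uint32_t bits) {
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

static uint32_t f32_to_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Upcast an 8-bit mini-float to F32 with one multiply.
 * Assumes FTZ/DAZ are disabled; NaN and Inf encodings are not handled here. */
static float minifloat_to_f32(uint8_t raw, int mantissa_bits, int bias) {
    uint32_t sign = (uint32_t)(raw & 0x80u) << 24;
    /* Align the mini-float mantissa with the top of the F32 mantissa, so the
     * mini-float exponent field lands in the low bits of the F32 exponent. */
    uint32_t magnitude = (uint32_t)(raw & 0x7Fu) << (23 - mantissa_bits);
    /* Reinterpreted, this is a tiny F32: 2^(e-127) * (1 + m/2^M) for e > 0,
     * or the F32 subnormal 2^-126 * (m/2^M) for e == 0. */
    float tiny = f32_from_bits(magnitude);
    /* The scale 2^(127 - bias) has biased F32 exponent 254 - bias. */
    float scale = f32_from_bits((uint32_t)(254 - bias) << 23);
    float magnitude_f32 = tiny * scale;
    /* Reattach the sign bit. */
    return f32_from_bits(f32_to_bits(magnitude_f32) | sign);
}
```

For example, `minifloat_to_f32(0x38, 3, 7)` decodes the E4M3 pattern for 1.0, and `minifloat_to_f32(0x01, 3, 7)` decodes the smallest E4M3 subnormal, 2^-9. The intermediate `tiny` is itself an F32 subnormal whenever the mini-float exponent field is zero, which is why the trick depends on denormal inputs not being flushed. In this sketch, the all-ones exponent patterns (E4M3 NaN, E5M2 Inf/NaN) would decode as finite values and need separate handling.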

Improve: LUT-free mini-float upcasts

479ec5c


ashvardanian self-assigned this Mar 17, 2026

ashvardanian mentioned this pull request Mar 23, 2026

Bug: import numkong sets FTZ/DAZ globally, breaking IEEE-754 subnormal floats #328

Closed

3 tasks

ashvardanian added 3 commits March 23, 2026 12:46

Merge branch 'main-dev' into lut-free-casts

659b66b

ashvardanian wants to merge 4 commits into main-dev from lut-free-casts
