Conversation

@lantudou lantudou commented Dec 17, 2025

Motivation

Floating-point atomic add operations (atomicAdd / red.add.f32) on GPUs are non-deterministic: floating-point addition is not associative, so the unpredictable order in which threads perform the atomic adds causes slight run-to-run variations in the result. This is problematic for scenarios requiring strict reproducibility, such as debugging, testing, and scientific computing.

This PR enables deterministic LoRA activation reduction by replacing floating-point reduction with fixed-point integer arithmetic, ensuring identical outputs for the same inputs across runs.

Several existing issues relate to this feature: #546 #229 #294
spooknik/nunchaku-chroma#6

Modifications

  • gemm_utils.cuh: Added an int overload of reduce_add_pred using red.relaxed.gpu.global.add.s32 for deterministic integer atomic addition
  • lora.cuh:
    • Changed lora_act data type from float* to int*
    • Implemented a 16-bit fractional-precision (FRAC_BITS = 16) fixed-point representation in reduce_lora_act, scaling float values to integers before the reduction (see the sketch after this list)
    • In apply_lora_up, dequantized the fixed-point integers back to floats, with the inverse scale factor fused into scales to avoid extra computation overhead
  • gemm_w4a4.cuh / gemm_w4a4_launch_impl.cuh: Updated relevant type declarations and pointer casts
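Below is a minimal sketch of the fixed-point round trip described above, assuming nvcc with -arch=sm_70 or newer (red.relaxed requires sm_70+). FRAC_BITS = 16 and the red.relaxed.gpu.global.add.s32 instruction come from the PR description; the reduce_add_pred signature, the accumulate_fixed / dequantize_fixed helpers, the demo_reduce kernel, and the host driver are illustrative assumptions and do not mirror the actual reduce_lora_act / apply_lora_up kernels in lora.cuh.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Fixed-point format: 16 fractional bits, as in the PR. A 32-bit accumulator
// then leaves about 15 bits of integer headroom before overflow.
constexpr int FRAC_BITS = 16;

// Assumed int overload of reduce_add_pred: a predicated, relaxed atomic add on
// a global int via the PTX instruction named in the PR. Integer addition is
// associative, so the accumulation order cannot change the result.
__device__ __forceinline__ void reduce_add_pred(int *addr, int val, bool pred) {
    if (pred) {
        asm volatile("red.relaxed.gpu.global.add.s32 [%0], %1;"
                     :
                     : "l"(addr), "r"(val)
                     : "memory");
    }
}

// reduce_lora_act side: quantize a float partial sum to fixed point
// (round to nearest) and accumulate deterministically.
__device__ __forceinline__ void accumulate_fixed(int *acc, float partial, bool pred) {
    reduce_add_pred(acc, __float2int_rn(partial * float(1 << FRAC_BITS)), pred);
}

// apply_lora_up side: dequantize back to float. In the PR the 1 / 2^16 factor
// is fused into the LoRA-up scales; here it is applied explicitly for clarity.
__device__ __forceinline__ float dequantize_fixed(int acc) {
    return float(acc) * (1.0f / float(1 << FRAC_BITS));
}

// Toy kernel: every thread contributes one element to a single accumulator.
__global__ void demo_reduce(const float *in, int *acc, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    accumulate_fixed(acc, i < n ? in[i] : 0.0f, i < n);
}

int main() {
    const int n = 1 << 20;
    std::vector<float> h(n);
    for (int i = 0; i < n; ++i) h[i] = 0.001f * float(i % 97) - 0.05f;

    float *d_in = nullptr;
    int *d_acc = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_acc, sizeof(int));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Run the same reduction twice: the fixed-point accumulator is
    // bit-identical across runs, unlike a float atomicAdd reduction.
    for (int run = 0; run < 2; ++run) {
        cudaMemset(d_acc, 0, sizeof(int));
        demo_reduce<<<(n + 255) / 256, 256>>>(d_in, d_acc, n);
        int acc = 0;
        cudaMemcpy(&acc, d_acc, sizeof(int), cudaMemcpyDeviceToHost);
        printf("run %d: fixed = %d  float = %f\n",
               run, acc, double(acc) / double(1 << FRAC_BITS));
    }
    cudaFree(d_in);
    cudaFree(d_acc);
    return 0;
}
```

With 16 fractional bits the quantization step is 2^-16 ≈ 1.5e-5 and a 32-bit accumulator retains about 15 bits of integer range, which presumably gives enough headroom for the LoRA activation sums; folding 1 / 2^16 into the existing scales keeps the dequantization free of extra arithmetic.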

@lantudou lantudou closed this Dec 17, 2025
@lantudou lantudou changed the title from "Ensure deterministic LoRA activation reduction with fixed-point arithmetic" to "feat: Enable deterministic LoRA activation reduction using fixed-point arithmetic" Dec 17, 2025
@lantudou lantudou reopened this Dec 17, 2025