First and Second Moment Quantization by shadygm · Pull Request #1229 · MrNeRF/LichtFeld-Studio

shadygm · 2026-05-19T20:59:37Z

No description provided.

Copilot

Pull request overview

This PR adds quantized Adam first/second moment storage (uint8 + per-row scales) and updates the FastGS fused-Adam CUDA path to consume/update those quantized moments, reducing optimizer-state memory bandwidth/footprint while keeping the parameter updates on GPU.
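To make the storage scheme concrete, here is a minimal host-side sketch of row-wise uint8 quantization with one float scale per row. The PR text does not spell out the encoding, so the absmax scaling, the zero point of 128 for the signed first moment, and the names `quantize_row_signed` and `dequantize_signed` are all assumptions for illustration, not the PR's kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR): quantize one row of a signed moment
// to uint8 and return the row's scale.
// Assumed encoding: q = round(x / scale) + 128, scale = max|x| / 127,
// so the code 128 represents exactly zero.
inline float quantize_row_signed(const float* row, std::size_t cols, std::uint8_t* q_out) {
    assert(cols > 0);
    float absmax = 0.0f;
    for (std::size_t i = 0; i < cols; ++i) {
        absmax = std::max(absmax, std::fabs(row[i]));
    }
    // An all-zero row gets scale 1 to avoid dividing by zero.
    const float scale = absmax > 0.0f ? absmax / 127.0f : 1.0f;
    for (std::size_t i = 0; i < cols; ++i) {
        const long q = std::lround(row[i] / scale) + 128;
        q_out[i] = static_cast<std::uint8_t>(std::min(255L, std::max(0L, q)));
    }
    return scale;
}

// Hypothetical inverse: recover an approximate float from a code and its row scale.
inline float dequantize_signed(std::uint8_t q, float scale) {
    return static_cast<float>(static_cast<int>(q) - 128) * scale;
}
```

Under these assumptions the round-trip error per element is at most about half a quantization step, that is, `0.5 * scale`, which is why a per-row scale (rather than one global scale) helps when rows differ widely in magnitude.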

Changes:

- Introduces quantized-moment Adam kernels/APIs (row-wise uint8 moments + per-row scale factors), plus utilities to quantize existing float moments and to zero quantized rows.
- Updates FastGS rasterization backward kernels to perform fused Adam updates via a row-wise dynamic update helper that supports quantized moments.
- Extends optimizer state serialization format (version bump) to persist quantized moments + scales and supports loading legacy (v1) float moments by quantizing on load.
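The row-wise update described above can be sketched on the host as follows. This is not the PR's `adam_step_row_dynamic`; the function name `adam_step_row_quantized`, the `AdamHyper` struct, and the encoding (first moment stored with a zero point of 128, second moment stored as an unsigned multiple of its scale) are assumptions made for illustration. The idea is to dequantize a row, apply the standard Adam update in float, then requantize with a fresh per-row scale.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical hyperparameter bundle; not a type from the PR.
struct AdamHyper {
    float lr, beta1, beta2, eps;
};

// One Adam step on a single row whose moments are stored as uint8.
// Assumed encoding (not confirmed by the PR):
//   m = (m_q - 128) * m_scale   (signed, 128 encodes zero)
//   v = v_q * v_scale           (non-negative)
inline void adam_step_row_quantized(float* param, const float* grad,
                                    std::uint8_t* m_q, std::uint8_t* v_q,
                                    float& m_scale, float& v_scale,
                                    std::size_t cols, const AdamHyper& h, int step) {
    assert(cols > 0 && step >= 1);
    std::vector<float> m(cols), v(cols);
    float m_absmax = 0.0f, v_max = 0.0f;

    // 1) Dequantize, then apply the usual exponential moving averages.
    for (std::size_t i = 0; i < cols; ++i) {
        const float m_old = static_cast<float>(static_cast<int>(m_q[i]) - 128) * m_scale;
        const float v_old = static_cast<float>(v_q[i]) * v_scale;
        m[i] = h.beta1 * m_old + (1.0f - h.beta1) * grad[i];
        v[i] = h.beta2 * v_old + (1.0f - h.beta2) * grad[i] * grad[i];
        m_absmax = std::max(m_absmax, std::fabs(m[i]));
        v_max = std::max(v_max, v[i]);
    }

    // 2) Bias-corrected parameter update using the float moments.
    const float bc1 = 1.0f - std::pow(h.beta1, static_cast<float>(step));
    const float bc2 = 1.0f - std::pow(h.beta2, static_cast<float>(step));
    for (std::size_t i = 0; i < cols; ++i) {
        param[i] -= h.lr * (m[i] / bc1) / (std::sqrt(v[i] / bc2) + h.eps);
    }

    // 3) Requantize with fresh per-row scales (scale 1 for an all-zero row).
    m_scale = m_absmax > 0.0f ? m_absmax / 127.0f : 1.0f;
    v_scale = v_max > 0.0f ? v_max / 255.0f : 1.0f;
    for (std::size_t i = 0; i < cols; ++i) {
        const long mq = std::lround(m[i] / m_scale) + 128;
        const long vq = std::lround(v[i] / v_scale);
        m_q[i] = static_cast<std::uint8_t>(std::min(255L, std::max(0L, mq)));
        v_q[i] = static_cast<std::uint8_t>(std::min(255L, std::max(0L, vq)));
    }
}
```

In this sketch a freshly zeroed row corresponds to `m_q = 128` and `v_q = 0`; a real implementation that zeroes quantized rows would have to match whatever zero point its encoding uses.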

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.


| File | Description |
| --- | --- |
| src/training/rasterization/fastgs/rasterization/src/backward.cu | Fixes invisible-Adam launch dimensions to operate per “row” (primitive) instead of per-element for multi-attribute params. |
| src/training/rasterization/fastgs/rasterization/include/kernels_backward.cuh | Switches multiple per-component Adam updates to row-wise updates and refactors invisible-step kernel to update entire rows (supports quantized path). |
| src/training/rasterization/fastgs/rasterization/include/kernel_utils.cuh | Adds quantize/dequant helpers and `adam_step_row_dynamic` supporting float and quantized moments; batches SH gradient updates into a row-wise call. |
| src/training/rasterization/fastgs/rasterization/include/fused_adam_types.h | Extends `FusedAdamParam` with quantized moment buffers + per-row scales and a `quantized` flag. |
| src/training/rasterization/fastgs/optimizer/src/adam.cu | Adds a quantized Adam entrypoint and kernel launch error reporting. |
| src/training/rasterization/fastgs/optimizer/src/adam_api.cu | Adds raw CUDA API for quantized Adam, moment quantization, and zeroing quantized rows. |
| src/training/rasterization/fastgs/optimizer/include/adam.h | Declares the quantized Adam API. |
| src/training/rasterization/fastgs/optimizer/include/adam_kernels.cuh | Implements quantized Adam step, float→quantized moment conversion, and zeroing quantized rows kernels. |
| src/training/rasterization/fastgs/optimizer/include/adam_api.h | Exposes raw APIs for quantized step, quantization, and zeroing quantized rows. |
| src/training/rasterization/fast_rasterizer.cpp | Plumbs new fused-Adam quantized pointers/scales into rasterization backward. |
| src/training/optimizer/adam_optimizer.hpp | Changes optimizer state moment tensors to quantized (UInt8) + adds scale tensors; extends fused param struct for quantized buffers. |
| src/training/optimizer/adam_optimizer.cpp | Allocates/maintains quantized moment state, performs quantized step, adds checkpoint v2 format with backward-compat conversion from v1 float moments. |
| src/core/tensor/tensor_masking_ops.cpp | Extends `append_gather` CUDA path to support UInt8/Bool tensors (needed by quantized optimizer state growth/gather paths). |


+        lfs::core::Tensor exp_avg;    // Quantized first moment (m), uint8
+        lfs::core::Tensor exp_avg_sq; // Quantized second moment (v), uint8
+        lfs::core::Tensor exp_avg_scale;
+        lfs::core::Tensor exp_avg_sq_scale;
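As a rough worked example of what this state layout buys, the sketch below compares the bytes needed for both moments of an N x C parameter stored as float32 against uint8 moments plus one scale per row per moment. The float32 dtype for the scale tensors and the helper names are assumptions; the diff shown here does not state the scale dtype.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes for first + second moment when both are stored as float32.
inline std::size_t float_moment_bytes(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    return 2 * rows * cols * sizeof(float);
}

// Bytes for first + second moment stored as uint8, plus one scale per row
// for each moment (assumed float32; hypothetical, see the note above).
inline std::size_t quantized_moment_bytes(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    const std::size_t codes = rows * cols * sizeof(std::uint8_t);
    const std::size_t scales = rows * sizeof(float);
    return 2 * (codes + scales);
}
```

For 1000 rows of 48 values each, this gives 384000 bytes in float32 versus 104000 bytes quantized. The per-row scale overhead matters most for narrow rows: with only 3 values per row, the quantized form needs 7 bytes per row per moment instead of 12.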


shadygm · 2026-05-27T20:33:48Z

Closing since the new swizzle SH layout saved a bit more VRAM; I will look into this in the future.

Feat: 8-bit quant

7203f61

Copilot AI review requested due to automatic review settings May 19, 2026 20:59

Copilot started reviewing on behalf of shadygm May 19, 2026 20:59

Copilot AI reviewed May 19, 2026


Comment thread src/training/optimizer/adam_optimizer.hpp

Comment on lines +45 to +48

shadygm closed this May 27, 2026

MrNeRF deleted the feat/quantization-8 branch May 31, 2026 14:29

shadygm wants to merge 1 commit into master from feat/quantization-8
