Quantization using distribution of embeddings on pre-training dataset #477
Conversation
It generates the HDF5 file containing the quantiles in the format expected by the load_quantiles_config function.
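A minimal sketch of how such a file could be written with h5py, assuming load_quantiles_config reads named datasets; the dataset names "quantiles" and "midpoints" below are guesses, not necessarily the format the function actually expects:

```python
import h5py
import numpy as np

def save_quantiles_config(path: str, quantiles: np.ndarray, midpoints: np.ndarray) -> None:
    # quantiles: (dim, num_buckets + 1) bucket edges per embedding dimension
    # midpoints: (dim, num_buckets) representative value per bucket
    with h5py.File(path, "w") as f:
        f.create_dataset("quantiles", data=quantiles)
        f.create_dataset("midpoints", data=midpoints)
```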
```python
    return config


def quantize_embeddings_percentile(
```
Did this percentile-based bucketing approach behave substantially differently from the statistics-naive quantization scheme in https://github.com/allenai/olmoearth_run/blob/006496243c8f00ada3b74a77874e87a93bfa661e/src/olmoearth_run/runner/tools/postprocessors/combine_geotiff.py#L48?
I imagine it's probably pretty important with the very low-bit quantizations, but do you have a sense at int8?
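For reference, a rough sketch of the two schemes being compared, as I understand them; this is illustrative only, not the actual code in either repository:

```python
import torch

def quantize_fixed(x: torch.Tensor, lo: float = -1.0, hi: float = 1.0) -> torch.Tensor:
    # Statistics-naive scheme: clip to a fixed range and map it linearly onto 0..255.
    x = x.clamp(lo, hi)
    return ((x - lo) / (hi - lo) * 255).round().to(torch.uint8)

def quantize_percentile(x: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
    # Percentile-based scheme: edges are per-dimension quantiles of the
    # pre-training embedding distribution, shape (dim, 257) for 256 buckets.
    out = torch.empty(x.shape, dtype=torch.uint8)
    for d in range(x.shape[-1]):
        # Use interior edges only, so bucket indices land in 0..255.
        out[..., d] = torch.bucketize(x[..., d], edges[d, 1:-1]).to(torch.uint8)
    return out
```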
AFAIK neither approach showed any drop in performance at int8; here are Mike's results:
https://github.com/allenai/olmoearth_pretrain/blob/main/scripts/archived/2026-01-024_embedding_analysis/quant_comparison_rounded.csv
For the platform I think you should just go with the simple fixed quantization.
Also, I computed the per-band distribution here, but I found that all of the bands follow almost the same distribution.
A normal distribution centered at 0.0? That would be convenient for int1 😅
Guess it doesn't need to be a normal distribution when there are only two quantiles.
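A quick sketch of that point: with only two buckets, the single threshold that matters is the per-dimension median, whatever the shape of the distribution (the function below is illustrative, not from the PR):

```python
import torch

def quantize_1bit(x: torch.Tensor, median: torch.Tensor) -> torch.Tensor:
    # median: (dim,) per-dimension 50th percentile of the pre-training embeddings
    return (x >= median).to(torch.uint8)
```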
yawenzzzz left a comment
LGTM! Just a small question about int8 vs. uint8.
| - "quantiles": torch.Tensor of shape (dim, num_buckets+1) | ||
| - "midpoints": torch.Tensor of shape (dim, num_buckets) |
Suggestion: add a note saying that quantiles are for quantization and midpoints are for dequantization.
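A minimal sketch of what that note could describe, with shapes taken from the docstring above; the gather-based lookup is my assumption about how midpoints are used, not necessarily how the PR does it:

```python
import torch

def dequantize(codes: torch.Tensor, midpoints: torch.Tensor) -> torch.Tensor:
    # codes: (..., dim) integer bucket indices in [0, num_buckets)
    # midpoints: (dim, num_buckets) representative value of each bucket
    dim = codes.shape[-1]
    flat = codes.reshape(-1, dim).long()       # (N_total, dim)
    # Look up each value's bucket midpoint, per embedding dimension.
    recon = midpoints.gather(1, flat.t()).t()  # (N_total, dim)
    return recon.reshape(codes.shape)
```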
```python
# Flatten to (N_total, dim)
# Convert to uint8 first to handle int8 wrap-around (128-255 stored as -128 to -1)
flat = quantized.reshape(-1, dim).to(torch.uint8).long()
```
I'm wondering why we don't just quantize to uint8; with that, there's no need to convert to uint8 in the dequantization.
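A toy example of the wrap-around being worked around here (not the PR's code):

```python
import torch

codes = torch.tensor([0, 127, 128, 255])
as_int8 = codes.to(torch.uint8).view(torch.int8)  # bit reinterpretation: [0, 127, -128, -1]
restored = as_int8.view(torch.uint8).long()       # back to [0, 127, 128, 255]
assert torch.equal(restored, codes)
# Storing the codes as uint8 from the start would make this round-trip cast unnecessary.
```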

Try quantizing to 8/4/2/1-bit using the distribution of embeddings on the pre-training dataset.
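A rough sketch of how the per-dimension bucket edges could be computed from a sample of pre-training embeddings; the function name and the random stand-in sample are illustrative, not the PR's code:

```python
import torch

def compute_quantile_edges(sample: torch.Tensor, bits: int) -> torch.Tensor:
    # sample: (N, dim) embeddings drawn from the pre-training dataset
    num_buckets = 2 ** bits
    probs = torch.linspace(0.0, 1.0, num_buckets + 1)
    # One set of quantile edges per embedding dimension: (dim, num_buckets + 1)
    return torch.quantile(sample, probs, dim=0).t()

# Example: edges for 8-, 4-, 2-, and 1-bit quantization from a random stand-in sample.
sample = torch.randn(10_000, 768)
edges = {bits: compute_quantile_edges(sample, bits) for bits in (8, 4, 2, 1)}
```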