[Compressors] Remove sparse compression#2452

Open
kylesayrs wants to merge 3 commits into main from kylesayrs/remove-sparse-compression

Conversation

@kylesayrs
Collaborator

@kylesayrs kylesayrs commented Mar 7, 2026

Purpose

  • Remove sparsity tests which fail now that sparse compression is no longer supported

Corequisites

Testing

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@github-actions

github-actions bot commented Mar 7, 2026

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request deprecates and removes all support for sparse24 compression within the llmcompressor library. The changes involve removing the sparse24 compression logic, refactoring the compress_module function to a more centralized location, and updating the model compression utilities to reflect that sparse compression is no longer supported. This streamlines the codebase by removing an unsupported feature and simplifies the compression pipeline.

Highlights

  • Removed Sparse24 Compression: The sparse24 compression scheme and its related logic have been removed from llmcompressor, including the local definition of the compress_module function (see the next item for its new location).
  • Refactored Compression Utility: The compress_module function was moved from llmcompressor.entrypoints.model_free.lifecycle to compressed_tensors.compressors, centralizing its definition.
  • Updated Model Compressor: The get_model_compressor utility now explicitly warns that sparse compression is no longer supported by compressed-tensors and has removed all logic for handling sparsity configurations.
  • Streamlined Testing: Extensive tests related to sparse24 compression, sparse model reloading, and compressor stacking have been removed or simplified to reflect the deprecation of sparse compression.
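
The "Updated Model Compressor" highlight describes a warn-and-ignore pattern. A minimal sketch of that pattern follows, assuming the warning fires only when a sparsity configuration is actually supplied (the summary does not say when it fires). The name `resolve_compression_format` and its parameters are hypothetical and are not part of llm-compressor or compressed-tensors.

```python
import warnings
from typing import Optional


def resolve_compression_format(
    quant_format: str, sparsity_config: Optional[dict] = None
) -> str:
    """Hypothetical helper: return the quantization format and ignore sparsity."""
    if sparsity_config is not None:
        # Assumption: warn instead of raising so existing callers keep working.
        warnings.warn(
            "Sparse compression is no longer supported; "
            "the sparsity configuration will be ignored.",
            UserWarning,
            stacklevel=2,
        )
    return quant_format


# Usage: a caller that still passes a sparsity config gets a warning,
# and the quantization format comes back unchanged.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fmt = resolve_compression_format(
        "pack-quantized", {"format": "sparse-24-bitmask"}
    )
```

Warning rather than raising is one possible design choice here; it keeps old call sites running while making the dropped feature visible.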

Changelog
  • src/llmcompressor/entrypoints/model_free/lifecycle.py
    • Removed BaseCompressor and _get_quant_compression_format imports.
    • Removed compress_module from the module's __all__ export list.
    • Deleted the compress_module function definition.
  • src/llmcompressor/entrypoints/model_free/process.py
    • Added compress_module import from compressed_tensors.compressors.
    • Removed compress_module import from llmcompressor.entrypoints.model_free.lifecycle.
  • src/llmcompressor/transformers/compression/compressed_tensors_utils.py
    • Removed get_state_dict_offloaded_model and SparsityConfigMetadata imports.
    • Modified get_model_compressor to remove all sparse compression handling logic and issue a warning.
    • Removed sparsity_config_or_format argument from ModelCompressor.from_pretrained_model calls.
  • tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py
    • Removed imports related to sparse compression (math, ModelCompressor, CompressionFormat, BitmaskConfig, DenseSparsityConfig).
    • Added infer_set_module_formats import.
    • Deleted test_sparse_model_reload, test_dense_model_save, test_compressor_stacking, test_sparse_24_compressor_is_lossless, and test_disable_sparse_compression_flag test functions.
    • Simplified test_quant_model_reload parametrization and logic.
    • Removed _make_24_sparse helper function.
    • Updated DummyLinearModel initialization for weight_scale and zero_point.
    • Refactored test_correct_compressor_inferred to remove sparse compression checks and use infer_set_module_formats.
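
For downstream code that imported compress_module from the old location, the move listed in the changelog could be handled with a lookup along these lines. This is only a sketch: the two module paths are taken from the changelog above, while the helper name `locate_compress_module` is hypothetical, and whether any given installed version exposes the function at either path is an assumption.

```python
import importlib
from typing import Callable, Optional, Tuple

# Module paths as named in the changelog; the new location is tried first.
CANDIDATE_MODULES = (
    "compressed_tensors.compressors",
    "llmcompressor.entrypoints.model_free.lifecycle",
)


def locate_compress_module() -> Tuple[Optional[str], Optional[Callable]]:
    """Hypothetical helper: find compress_module in the first module that has it."""
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Package missing, or this version does not ship the module.
            continue
        fn = getattr(module, "compress_module", None)
        if fn is not None:
            return module_name, fn
    # Neither location is available in this environment.
    return None, None


found_in, compress_module = locate_compress_module()
```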
Activity
  • No activity has been recorded for this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request removes support for sparse24 compression, which is a significant change. The modifications are consistent across the codebase, including removing logic from the main library and updating/removing relevant tests. The changes are well-aligned with the goal of deprecating this feature. I've found a minor issue in the test suite that could be improved for clarity.

Comment on lines 241 to 257
 @pytest.mark.parametrize(
-    "quant_style, quant_type, is_24, expected_quant_compressor, "
-    "expected_sparsity_compressor",
+    "quant_style,quant_type,is_24,expected_format",
     [
-        ("W8A8", "int", False, "int-quantized", "dense"),
-        ("W4A16", "int", False, "pack-quantized", "dense"),
-        ("W8A16", "int", False, "pack-quantized", "dense"),
-        ("W8A8", "int", True, "int-quantized", "sparse-24-bitmask"),
-        ("W4A16", "int", True, "marlin-24", "dense"),
-        ("W8A16", "int", True, "marlin-24", "dense"),
-        ("W8A8", "float", False, "float-quantized", "dense"),
-        ("W8A16", "float", False, "naive-quantized", "dense"),
-        ("W8A8", "float", True, "float-quantized", "sparse-24-bitmask"),
-        ("W8A16", "float", True, "naive-quantized", "dense"),
+        ("W8A8", "int", False, "int-quantized"),
+        ("W4A16", "int", False, "pack-quantized"),
+        ("W8A16", "int", False, "pack-quantized"),
+        ("W8A8", "float", False, "float-quantized"),
+        ("W8A16", "float", False, "naive-quantized"),
+        ("W8A16", "float", True, "naive-quantized"),
     ],
 )
 def test_correct_compressor_inferred(
     quant_style,
     quant_type,
     is_24,
-    expected_quant_compressor,
-    expected_sparsity_compressor,
+    expected_format,
 ):
Contributor

medium

The is_24 parameter appears to be a leftover from refactoring and is no longer used within the test body. This can be confusing for future readers of the code. To improve clarity and maintainability, it should be removed from the test function signature and the pytest.mark.parametrize decorator. The test cases should also be updated to remove the now-redundant parameter and a duplicate test case.

@pytest.mark.parametrize(
    "quant_style,quant_type,expected_format",
    [
        ("W8A8", "int", "int-quantized"),
        ("W4A16", "int", "pack-quantized"),
        ("W8A16", "int", "pack-quantized"),
        ("W8A8", "float", "float-quantized"),
        ("W8A16", "float", "naive-quantized"),
    ],
)
def test_correct_compressor_inferred(
    quant_style,
    quant_type,
    expected_format,
):
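
The duplicate mentioned in the comment above can be checked mechanically. The sketch below copies the six updated parametrize rows from the diff under review, drops the is_24 column, and reports which rows collide; the helper names `drop_is_24` and `find_duplicates` are illustrative only and do not appear in the PR.

```python
from collections import Counter

# The six rows of the updated parametrization, copied from the diff above.
cases = [
    ("W8A8", "int", False, "int-quantized"),
    ("W4A16", "int", False, "pack-quantized"),
    ("W8A16", "int", False, "pack-quantized"),
    ("W8A8", "float", False, "float-quantized"),
    ("W8A16", "float", False, "naive-quantized"),
    ("W8A16", "float", True, "naive-quantized"),
]


def drop_is_24(rows):
    """Remove the third column (is_24) from every row."""
    return [(style, qtype, fmt) for style, qtype, _is_24, fmt in rows]


def find_duplicates(rows):
    """Return each row that occurs more than once, in first-seen order."""
    counts = Counter(rows)
    return [row for row, n in counts.items() if n > 1]


reduced = drop_is_24(cases)
duplicates = find_duplicates(reduced)
# dict.fromkeys keeps the first occurrence of each row and preserves order.
deduplicated = list(dict.fromkeys(reduced))
```

With is_24 gone, the last two rows become identical, so `deduplicated` holds the same five cases as the suggested parametrization.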

@kylesayrs kylesayrs changed the title [Compressors] Remove sparse24 compression [Compressors] Remove sparse compression Mar 7, 2026
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Collaborator

@brian-dellabetta brian-dellabetta left a comment

🧹

@mergify
Contributor

mergify bot commented Mar 10, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @kylesayrs.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 10, 2026
 test_type: "regression"
-compressed_model_stub: "nm-testing/tinyllama-fp8-dynamic-compressed"
-skeleton_model_stub: "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
\ No newline at end of file
+compressed_model_stub: "nm-testing/tinyllama-fp8-dynamic-compressed"
\ No newline at end of file
Collaborator

Why are these all showing up as line changes? Did you remove the new line?

Collaborator

@HDCharles HDCharles left a comment

Seems fine though you should fix those newlines
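
One way to restore the missing final newlines is a small script like the one below. It is a sketch rather than part of the PR: the helper names `ensure_trailing_newline` and `fix_file` are hypothetical, and it assumes the affected config files are UTF-8 text.

```python
from pathlib import Path


def ensure_trailing_newline(text: str) -> str:
    """Append a final newline if the text is non-empty and lacks one."""
    if text and not text.endswith("\n"):
        return text + "\n"
    return text


def fix_file(path: Path) -> bool:
    """Rewrite the file if it lacks a final newline; return True if it changed."""
    original = path.read_text(encoding="utf-8")
    fixed = ensure_trailing_newline(original)
    if fixed != original:
        path.write_text(fixed, encoding="utf-8")
        return True
    return False


# Usage on an in-memory string shaped like the config line in the diff above.
line = 'compressed_model_stub: "nm-testing/tinyllama-fp8-dynamic-compressed"'
repaired = ensure_trailing_newline(line)
```

Running `fix_file` over the touched config files before pushing should make the "No newline at end of file" markers disappear from the diff.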


4 participants