[Refactor] Refactor splits to only use the "calibration" split (#2551) by arpitkh101 · Pull Request #2589 · vllm-project/llm-compressor

arpitkh101 · 2026-04-08T18:29:57Z

Summary

Closes #2551
Simplifies the splits interface in get_processed_dataset by removing
multi-split dict handling in favour of a plain string argument.

Examples & Tests

Updated all examples and tests to use splits="train[:N]" string format.
Deleted test_dataset_helpers.py (helpers no longer exist).
Added new unit tests to test_dataset_loading.py:
- {"calibration": ...} dict backward compat
- Deprecation warning is emitted for dict input
- splits=None returns None (data-free flow)
- Invalid type raises ValueError

Before / After

# Before (deprecated, still works with warning)
oneshot(model, dataset="ultrachat", splits={"calibration": "train_sft[:512]"})
# After (recommended)
oneshot(model, dataset="ultrachat", splits="train_sft[:512]")
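
The contract summarized above (a plain string is preferred, `{"calibration": ...}` is still accepted with a deprecation warning, `None` selects the data-free flow, and other types raise `ValueError`) can be sketched without the library. This is only an illustration: `normalize_splits` is a hypothetical helper name, not a function in llm-compressor, and using `warnings.warn` is an assumption, since the real logic lives inside `get_processed_dataset` and may log the warning instead.

```python
import warnings
from typing import Optional, Union


def normalize_splits(splits: Union[None, str, dict]) -> Optional[str]:
    """Hypothetical helper: reduce a `splits` argument to one split string."""
    if splits is None:
        # Data-free flow: no calibration dataset is requested.
        return None
    if isinstance(splits, str):
        # Recommended form, e.g. "train_sft[:512]".
        return splits
    if isinstance(splits, dict) and "calibration" in splits:
        # Legacy form: still accepted, but flagged as deprecated.
        warnings.warn(
            'Passing splits as {"calibration": ...} is deprecated; '
            "pass the split string directly.",
            DeprecationWarning,
            stacklevel=2,
        )
        return splits["calibration"]
    # Assumption: every other shape is rejected rather than guessed at.
    raise ValueError(f"Invalid splits type: {type(splits)}. Expected string.")
```

Under these assumptions, both calls in the Before / After snippet resolve to the same split string, `"train_sft[:512]"`; only the dict form triggers the warning.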

coderabbitai · 2026-04-08T18:30:13Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cb2a5bf0-9558-486a-aeab-73a0137395ca

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Refactors dataset split handling from a multi-split/dict-based API (e.g., {"calibration": "..."}) to a single split string form (e.g., "train[:100]"), updating examples, tests, CLI help text, and core dataset utilities to accept and validate the new shape while preserving deprecated dict/list compatibility with warnings.

Changes

Cohort / File(s)	Summary
Core dataset API & utils `src/llmcompressor/args/dataset_arguments.py`, `src/llmcompressor/datasets/__init__.py`, `src/llmcompressor/datasets/utils.py`	Changed `DatasetArguments.splits` help text to recommend string selectors; removed re-export of `make_dataset_splits`; refactored `get_processed_dataset()` to return a single `Dataset
Tracing/CLI usage `src/llmcompressor/transformers/tracing/debug.py`	Adjusted dataset split usage in trace flow to expect a single split string (changed from `dataset_args.splits["calibration"]` to `dataset_args.splits`) and reorganized imports/argparse formatting (no semantic CLI behavior changes).
Example callsites `examples/disk_offloading/kimi_k2_example.py`, `examples/disk_offloading/qwen3_example.py`, `examples/imatrix/llama3_imatrix_example.py`, `examples/multimodal_vision/llava_example.py`, `examples/multimodal_vision/mistral3_example.py`, `examples/multimodal_vision/mllama_example.py`, `examples/multimodal_vision/pixtral_example.py`	Replaced `oneshot(..., splits={"calibration": ...})` with `oneshot(..., splits="...")` across example scripts to pass a single split string.
Test callsites updated to single-split `tests/llmcompressor/modifiers/transform/imatrix/test_e2e_integration.py`, `tests/llmcompressor/modifiers/transform/smoothquant/test_base.py`, `tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py`, `tests/llmcompressor/transformers/compression/test_quantization.py`, `tests/llmcompressor/transformers/compression/test_recipe_parsing.py`, `tests/llmcompressor/transformers/gptq/test_gptq_oneshot.py`, `tests/llmcompressor/transformers/kv_cache/test_kv_cache.py`, `tests/llmcompressor/transformers/sparsegpt/test_oneshot_with_modifier.py`, `tests/llmcompressor/transformers/sparsegpt/test_sparsegpt_completion.py`	Updated tests and fixtures to pass `splits` as a string instead of a dict keyed by `"calibration"`.
Dataset tests & helpers `tests/llmcompressor/transformers/data/test_dataset_loading.py`, `tests/llmcompressor/transformers/data/test_dataset_helpers.py`	Expanded `test_dataset_loading.py` to include string and deprecated dict/list split variants, added warnings/error case tests and adjusted expectations to `datasets.Dataset` return. Deleted `test_dataset_helpers.py` which validated removed `make_dataset_splits` behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 52.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main refactoring: simplifying the splits interface by removing multi-split dict handling in favor of a plain string argument, which is the primary change throughout the PR.
Description check	✅ Passed	The description is directly related to the changeset, providing a clear summary of the refactoring, examples of before/after usage, and listing the specific changes made (updated examples, tests, backward compatibility).
Linked Issues check	✅ Passed	The PR successfully addresses all objectives from issue `#2551`: removes multi-split logic, simplifies splits to accept strings, maintains backward compatibility for dict with deprecation warning, and updates all examples and tests.
Out of Scope Changes check	✅ Passed	All changes are directly related to the splits refactoring objective. No unrelated modifications were introduced; import reorganization in debug.py is minimal and directly supports the splits interface changes.

github-actions · 2026-04-08T18:30:17Z

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

gemini-code-assist

Code Review

This pull request simplifies dataset split handling by deprecating dictionary-based split configurations in favor of string-based formats. It updates the get_processed_dataset function, removes the now-redundant make_dataset_splits helper, and updates numerous examples and tests to reflect these changes. I have included a suggestion to improve the error message for invalid split types to provide better guidance to users.

gemini-code-assist · 2026-04-08T18:34:54Z

+                )
+                split_str = splits[0] if len(splits) > 0 else None
+            else:
+                raise ValueError(f"Invalid splits type: {type(splits)}. Expected string.")


The error message for invalid split types should be more descriptive to help users understand what types are supported, especially since dicts are now deprecated.

raise ValueError(f"Invalid splits type: {type(splits)}. Expected string (recommended) or dict (deprecated).")
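
To see what a user would read with the suggested wording, the message can be built in isolation. `invalid_splits_error` is a hypothetical helper introduced only for this illustration; the PR raises the error inline.

```python
def invalid_splits_error(splits: object) -> ValueError:
    """Hypothetical helper: build the suggested error for a bad `splits` value."""
    return ValueError(
        f"Invalid splits type: {type(splits)}. "
        "Expected string (recommended) or dict (deprecated)."
    )


# Example: an integer passed by mistake.
print(invalid_splits_error(12345))
# Invalid splits type: <class 'int'>. Expected string (recommended) or dict (deprecated).
```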

Copilot

Pull request overview

Refactors dataset loading to simplify the splits interface for oneshot/calibration workflows by preferring a single split string (e.g., "train[:N]") and removing the multi-split dict output shape from get_processed_dataset.

Changes:

Refactored get_processed_dataset to return a single processed dataset (or None) and added deprecated dict-handling to extract a split string.
Updated tests and examples to pass splits as a string rather than {"calibration": ...}.
Removed now-obsolete dataset split helper test coverage (test_dataset_helpers.py) and added new split-focused unit tests.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
src/llmcompressor/datasets/utils.py	Refactors dataset processing to a single-split flow; updates calibration dataloader usage accordingly.
src/llmcompressor/datasets/__init__.py	Drops `make_dataset_splits` from public exports after refactor.
src/llmcompressor/args/dataset_arguments.py	Updates CLI/help text to recommend string `splits` and document backward compatibility.
tests/llmcompressor/transformers/data/test_dataset_loading.py	Updates split-loading assertions for new return type and adds coverage for deprecated dict inputs and invalid types.
tests/llmcompressor/transformers/data/test_dataset_helpers.py	Removes tests for helpers that no longer exist after refactor.
tests/llmcompressor/transformers/sparsegpt/test_sparsegpt_completion.py	Updates oneshot invocation to pass `splits` as a string.
tests/llmcompressor/transformers/sparsegpt/test_oneshot_with_modifier.py	Updates modifier-based oneshot test to pass `splits` as a string.
tests/llmcompressor/transformers/kv_cache/test_kv_cache.py	Updates kv-cache oneshot fixture to use string `splits`.
tests/llmcompressor/transformers/gptq/test_gptq_oneshot.py	Updates GPTQ oneshot test to use string `splits`.
tests/llmcompressor/transformers/compression/test_recipe_parsing.py	Updates recipe parsing config to use string `splits`.
tests/llmcompressor/transformers/compression/test_quantization.py	Updates quantization test setup to use string `splits`.
tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py	Updates compression tensor utils test to use string `splits`.
tests/llmcompressor/modifiers/transform/smoothquant/test_base.py	Updates SmoothQuant e2e test to use string `splits`.
tests/llmcompressor/modifiers/transform/imatrix/test_e2e_integration.py	Updates iMatrix integration tests to use string `splits`.
examples/multimodal_vision/pixtral_example.py	Updates example to use string `splits`.
examples/multimodal_vision/mllama_example.py	Updates example to use string `splits`.
examples/multimodal_vision/mistral3_example.py	Updates example to use string `splits`.
examples/multimodal_vision/llava_example.py	Updates example to use string `splits`.
examples/imatrix/llama3_imatrix_example.py	Updates example to use string `splits`.
examples/disk_offloading/qwen3_example.py	Updates example to use string `splits`.
examples/disk_offloading/kimi_k2_example.py	Updates example to use string `splits`.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/llmcompressor/transformers/data/test_dataset_loading.py (1)

199-265: ⚠️ Potential issue | 🟠 Major

Tighten these tests to the new splits contract.

These cases currently bless {"train": ...} as a valid deprecated input and allow TypeError, but this PR only keeps {"calibration": ...} on the compatibility path and says unsupported splits types should raise ValueError. As written, the tests will lock in the permissive fallback from src/llmcompressor/datasets/utils.py instead of guarding the stricter API.

🧪 Suggested test updates

 @pytest.mark.parametrize(
     "split_def",
     [
         "train[95%:]",
-        {"train": "train[:5%]"},                  # old dict (non-calibration key)
         {"calibration": "train[:5%]"},            # old dict (calibration key - main old format)
     ],
 )

 @pytest.mark.unit
 @pytest.mark.parametrize(
     "split_def",
     [
         {"calibration": "train[:5%]"},
-        {"train": "train[:5%]"},
     ],
 )
 def test_split_dict_emits_deprecation_warning(split_def, tiny_llama_tokenizer):

-@pytest.mark.unit
-def test_split_invalid_type_raises_value_error():
+@pytest.mark.unit
+@pytest.mark.parametrize(
+    "split_def",
+    [
+        12345,
+        {"train": "train[:5%]"},
+        ["train[:5%]"],
+    ],
+)
+def test_split_invalid_type_raises_value_error(split_def):
     """An unsupported splits type should raise ValueError."""
-    dataset_args = DatasetArguments(dataset="open_platypus", splits=12345)
-    with pytest.raises((ValueError, TypeError)):
+    dataset_args = DatasetArguments(dataset="open_platypus", splits=split_def)
+    with pytest.raises(ValueError):
         get_processed_dataset(dataset_args=dataset_args, processor=None)

As per coding guidelines, tests/**/*.py: Ensure PyTest tests are clear, comprehensive, and cover edge cases for quantization scenarios. Verify proper mocking and test isolation.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/llmcompressor/transformers/data/test_dataset_loading.py` around lines
199 - 265, The tests currently accept {"train": ...} as a deprecated splits form
and allow TypeError, which conflicts with the tightened contract that only
{"calibration": ...} is supported on the compatibility path and unsupported
splits should raise ValueError; update test_split_loading to remove the
{"train": "train[:5%]"} case and only parametrize the new string form and the
{"calibration": "train[:5%]"} dict, change
test_split_dict_emits_deprecation_warning to only parametrize {"calibration":
"train[:5%]"} (remove {"train": ...}), and change
test_split_invalid_type_raises_value_error to assert only ValueError (remove
TypeError) when calling get_processed_dataset with an invalid splits type;
reference DatasetArguments and get_processed_dataset in these tests to locate
and modify the failing cases.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/llmcompressor/args/dataset_arguments.py`:
- Around line 143-145: Update the help text for the dataset split argument in
src/llmcompressor/args/dataset_arguments.py to replace the vague "dictionary or
a list" phrasing with the exact legacy compatibility shape; specifically state
the deprecated form as {"calibration": "<split-spec>"} (or a list of such dicts)
and mark it as legacy, and recommend using a string like 'train' or
'train[:50%]' instead—modify the metadata["help"] string where this argument is
defined to include that precise example and the deprecation note.

In `@src/llmcompressor/datasets/utils.py`:
- Around line 49-76: The current match on the variable splits accepts arbitrary
dicts and iterables and silently picks the first element; change it to only
accept None, str, or dicts that contain the "calibration" key and fail fast
otherwise: keep the None and str branches as-is, modify the dict() branch to
only extract splits["calibration"] and emit the deprecation logger.warning for
that case, and for any other dict or any non-str iterable (the previous case _
fallback) raise ValueError(f"Invalid splits shape: {type(splits)}. Expected
None, str, or dict with 'calibration' key.") instead of attempting to extract
the first element; update references to split_str and logger.warning accordingly
so only the allowed deprecation path is logged.

---

Outside diff comments:
In `@tests/llmcompressor/transformers/data/test_dataset_loading.py`:
- Around line 199-265: The tests currently accept {"train": ...} as a deprecated
splits form and allow TypeError, which conflicts with the tightened contract
that only {"calibration": ...} is supported on the compatibility path and
unsupported splits should raise ValueError; update test_split_loading to remove
the {"train": "train[:5%]"} case and only parametrize the new string form and
the {"calibration": "train[:5%]"} dict, change
test_split_dict_emits_deprecation_warning to only parametrize {"calibration":
"train[:5%]"} (remove {"train": ...}), and change
test_split_invalid_type_raises_value_error to assert only ValueError (remove
TypeError) when calling get_processed_dataset with an invalid splits type;
reference DatasetArguments and get_processed_dataset in these tests to locate
and modify the failing cases.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 34b76517-75d6-4345-8a30-0cae0c3fa6e3

📥 Commits

Reviewing files that changed from the base of the PR and between c18e9fd and 32117f2.

📒 Files selected for processing (21)

examples/disk_offloading/kimi_k2_example.py
examples/disk_offloading/qwen3_example.py
examples/imatrix/llama3_imatrix_example.py
examples/multimodal_vision/llava_example.py
examples/multimodal_vision/mistral3_example.py
examples/multimodal_vision/mllama_example.py
examples/multimodal_vision/pixtral_example.py
src/llmcompressor/args/dataset_arguments.py
src/llmcompressor/datasets/__init__.py
src/llmcompressor/datasets/utils.py
tests/llmcompressor/modifiers/transform/imatrix/test_e2e_integration.py
tests/llmcompressor/modifiers/transform/smoothquant/test_base.py
tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py
tests/llmcompressor/transformers/compression/test_quantization.py
tests/llmcompressor/transformers/compression/test_recipe_parsing.py
tests/llmcompressor/transformers/data/test_dataset_helpers.py
tests/llmcompressor/transformers/data/test_dataset_loading.py
tests/llmcompressor/transformers/gptq/test_gptq_oneshot.py
tests/llmcompressor/transformers/kv_cache/test_kv_cache.py
tests/llmcompressor/transformers/sparsegpt/test_oneshot_with_modifier.py
tests/llmcompressor/transformers/sparsegpt/test_sparsegpt_completion.py

💤 Files with no reviewable changes (2)

src/llmcompressor/datasets/__init__.py
tests/llmcompressor/transformers/data/test_dataset_helpers.py

HDCharles

see comments and bot comments

arpitkh101 · 2026-04-08T21:31:45Z

Addressed the requested changes, thanks!

HDCharles

looks good

HDCharles · 2026-04-09T19:24:02Z

@coderabbitai re-review

coderabbitai · 2026-04-09T19:25:30Z

@HDCharles Sure, I'll re-review the PR now!

✅ Actions performed

Full review triggered.

HDCharles · 2026-04-13T14:03:19Z

Looks like the sparse test is failing? @arpitkh101

mergify · 2026-04-13T14:17:16Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @arpitkh101.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2026-04-22T21:56:04Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviews

Waiting for:

#approved-reviews-by >= 2

This rule is failing.

PRs labelled "two-reviews" must have at least two approving reviews before merging.

#approved-reviews-by >= 2
#changes-requested-reviews-by = 0

mergify · 2026-04-22T21:57:13Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @arpitkh101.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

HDCharles · 2026-04-23T13:45:42Z

@arpitkh101 can you resolve the merge conflicts?

arpitkh101 · 2026-04-24T09:27:03Z

I've rebased on latest main, Thanks!

arpitkh101 requested review from HDCharles, brian-dellabetta, dsikka and kylesayrs as code owners April 8, 2026 18:29

Copilot AI review requested due to automatic review settings April 8, 2026 18:29

Copilot started reviewing on behalf of arpitkh101 April 8, 2026 18:30

arpitkh101 force-pushed the refactor-split branch from 32117f2 to 766d44c on April 8, 2026 18:33

gemini-code-assist Bot reviewed Apr 8, 2026

Copilot AI reviewed Apr 8, 2026

coderabbitai Bot reviewed Apr 8, 2026

Comment thread src/llmcompressor/args/dataset_arguments.py

Comment thread src/llmcompressor/datasets/utils.py Outdated

HDCharles reviewed Apr 8, 2026

Comment thread src/llmcompressor/datasets/utils.py Outdated

HDCharles reviewed Apr 8, 2026

Comment thread src/llmcompressor/datasets/utils.py

HDCharles requested changes Apr 8, 2026

arpitkh101 force-pushed the refactor-split branch from ee23c82 to ee01de7 on April 8, 2026 21:27

HDCharles added the ready (When a PR is ready for review) and Refactor (Code cleanup and/or improvements to existing features) labels Apr 9, 2026

HDCharles approved these changes Apr 9, 2026

mergify Bot added the needs-rebase label Apr 13, 2026

arpitkh101 force-pushed the refactor-split branch 2 times, most recently from 05aec86 to c8b64e5 on April 13, 2026 15:42

mergify Bot removed the needs-rebase label Apr 13, 2026

mergify Bot added the two-reviews (When a PR requires two reviews) label Apr 22, 2026

mergify Bot added the needs-rebase label Apr 22, 2026

Arpit added 2 commits April 24, 2026 05:02

Refactored splits to only use the calibration split

33b1d61

Signed-off-by: Arpit <arpit@example.com>

Refactor dataset splits to use string format and remove obsolete multi-split handling

0a49b81

Signed-off-by: Arpit <arpit@example.com>

arpitkh101 force-pushed the refactor-split branch from 04ee4a7 to 0a49b81 on April 24, 2026 09:26

mergify Bot removed the needs-rebase label Apr 24, 2026

3 participants
