[Refactor] Refactor splits to only use the "calibration" split (#2551)#2589
[Refactor] Refactor splits to only use the "calibration" split (#2551)#2589arpitkh101 wants to merge 2 commits intovllm-project:mainfrom
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughRefactors dataset split handling from a multi-split/dict-based API (e.g., Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. 📋 Issue PlannerBuilt with CodeRabbit's Coding Plans for faster development and fewer bugs. View plan used: ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed. |
32117f2 to
766d44c
Compare
There was a problem hiding this comment.
Code Review
This pull request simplifies dataset split handling by deprecating dictionary-based split configurations in favor of string-based formats. It updates the get_processed_dataset function, removes the now-redundant make_dataset_splits helper, and updates numerous examples and tests to reflect these changes. I have included a suggestion to improve the error message for invalid split types to provide better guidance to users.
| ) | ||
| split_str = splits[0] if len(splits) > 0 else None | ||
| else: | ||
| raise ValueError(f"Invalid splits type: {type(splits)}. Expected string.") |
There was a problem hiding this comment.
There was a problem hiding this comment.
Pull request overview
Refactors dataset loading to simplify the splits interface for oneshot/calibration workflows by preferring a single split string (e.g., "train[:N]") and removing the multi-split dict output shape from get_processed_dataset.
Changes:
- Refactored
get_processed_datasetto return a single processed dataset (orNone) and added deprecated dict-handling to extract a split string. - Updated tests and examples to pass
splitsas a string rather than{"calibration": ...}. - Removed now-obsolete dataset split helper test coverage (
test_dataset_helpers.py) and added new split-focused unit tests.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/llmcompressor/datasets/utils.py | Refactors dataset processing to a single-split flow; updates calibration dataloader usage accordingly. |
| src/llmcompressor/datasets/init.py | Drops make_dataset_splits from public exports after refactor. |
| src/llmcompressor/args/dataset_arguments.py | Updates CLI/help text to recommend string splits and document backward compatibility. |
| tests/llmcompressor/transformers/data/test_dataset_loading.py | Updates split-loading assertions for new return type and adds coverage for deprecated dict inputs and invalid types. |
| tests/llmcompressor/transformers/data/test_dataset_helpers.py | Removes tests for helpers that no longer exist after refactor. |
| tests/llmcompressor/transformers/sparsegpt/test_sparsegpt_completion.py | Updates oneshot invocation to pass splits as a string. |
| tests/llmcompressor/transformers/sparsegpt/test_oneshot_with_modifier.py | Updates modifier-based oneshot test to pass splits as a string. |
| tests/llmcompressor/transformers/kv_cache/test_kv_cache.py | Updates kv-cache oneshot fixture to use string splits. |
| tests/llmcompressor/transformers/gptq/test_gptq_oneshot.py | Updates GPTQ oneshot test to use string splits. |
| tests/llmcompressor/transformers/compression/test_recipe_parsing.py | Updates recipe parsing config to use string splits. |
| tests/llmcompressor/transformers/compression/test_quantization.py | Updates quantization test setup to use string splits. |
| tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py | Updates compression tensor utils test to use string splits. |
| tests/llmcompressor/modifiers/transform/smoothquant/test_base.py | Updates SmoothQuant e2e test to use string splits. |
| tests/llmcompressor/modifiers/transform/imatrix/test_e2e_integration.py | Updates iMatrix integration tests to use string splits. |
| examples/multimodal_vision/pixtral_example.py | Updates example to use string splits. |
| examples/multimodal_vision/mllama_example.py | Updates example to use string splits. |
| examples/multimodal_vision/mistral3_example.py | Updates example to use string splits. |
| examples/multimodal_vision/llava_example.py | Updates example to use string splits. |
| examples/imatrix/llama3_imatrix_example.py | Updates example to use string splits. |
| examples/disk_offloading/qwen3_example.py | Updates example to use string splits. |
| examples/disk_offloading/kimi_k2_example.py | Updates example to use string splits. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/llmcompressor/transformers/data/test_dataset_loading.py (1)
199-265:⚠️ Potential issue | 🟠 MajorTighten these tests to the new
splitscontract.These cases currently bless
{"train": ...}as a valid deprecated input and allowTypeError, but this PR only keeps{"calibration": ...}on the compatibility path and says unsupportedsplitstypes should raiseValueError. As written, the tests will lock in the permissive fallback fromsrc/llmcompressor/datasets/utils.pyinstead of guarding the stricter API.🧪 Suggested test updates
`@pytest.mark.parametrize`( "split_def", [ "train[95%:]", - {"train": "train[:5%]"}, # old dict (non-calibration key) {"calibration": "train[:5%]"}, # old dict (calibration key - main old format) ], )`@pytest.mark.unit` `@pytest.mark.parametrize`( "split_def", [ {"calibration": "train[:5%]"}, - {"train": "train[:5%]"}, ], ) def test_split_dict_emits_deprecation_warning(split_def, tiny_llama_tokenizer):-@pytest.mark.unit -def test_split_invalid_type_raises_value_error(): +@pytest.mark.unit +@pytest.mark.parametrize( + "split_def", + [ + 12345, + {"train": "train[:5%]"}, + ["train[:5%]"], + ], +) +def test_split_invalid_type_raises_value_error(split_def): """An unsupported splits type should raise ValueError.""" - dataset_args = DatasetArguments(dataset="open_platypus", splits=12345) - with pytest.raises((ValueError, TypeError)): + dataset_args = DatasetArguments(dataset="open_platypus", splits=split_def) + with pytest.raises(ValueError): get_processed_dataset(dataset_args=dataset_args, processor=None)As per coding guidelines,
tests/**/*.py: Ensure PyTest tests are clear, comprehensive, and cover edge cases for quantization scenarios. Verify proper mocking and test isolation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/llmcompressor/transformers/data/test_dataset_loading.py` around lines 199 - 265, The tests currently accept {"train": ...} as a deprecated splits form and allow TypeError, which conflicts with the tightened contract that only {"calibration": ...} is supported on the compatibility path and unsupported splits should raise ValueError; update test_split_loading to remove the {"train": "train[:5%]"} case and only parametrize the new string form and the {"calibration": "train[:5%]"} dict, change test_split_dict_emits_deprecation_warning to only parametrize {"calibration": "train[:5%]"} (remove {"train": ...}), and change test_split_invalid_type_raises_value_error to assert only ValueError (remove TypeError) when calling get_processed_dataset with an invalid splits type; reference DatasetArguments and get_processed_dataset in these tests to locate and modify the failing cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llmcompressor/args/dataset_arguments.py`:
- Around line 143-145: Update the help text for the dataset split argument in
src/llmcompressor/args/dataset_arguments.py to replace the vague "dictionary or
a list" phrasing with the exact legacy compatibility shape; specifically state
the deprecated form as {"calibration": "<split-spec>"} (or a list of such dicts)
and mark it as legacy, and recommend using a string like 'train' or
'train[:50%]' instead—modify the metadata["help"] string where this argument is
defined to include that precise example and the deprecation note.
In `@src/llmcompressor/datasets/utils.py`:
- Around line 49-76: The current match on the variable splits accepts arbitrary
dicts and iterables and silently picks the first element; change it to only
accept None, str, or dicts that contain the "calibration" key and fail fast
otherwise: keep the None and str branches as-is, modify the dict() branch to
only extract splits["calibration"] and emit the deprecation logger.warning for
that case, and for any other dict or any non-str iterable (the previous case _
fallback) raise ValueError(f"Invalid splits shape: {type(splits)}. Expected
None, str, or dict with 'calibration' key.") instead of attempting to extract
the first element; update references to split_str and logger.warning accordingly
so only the allowed deprecation path is logged.
---
Outside diff comments:
In `@tests/llmcompressor/transformers/data/test_dataset_loading.py`:
- Around line 199-265: The tests currently accept {"train": ...} as a deprecated
splits form and allow TypeError, which conflicts with the tightened contract
that only {"calibration": ...} is supported on the compatibility path and
unsupported splits should raise ValueError; update test_split_loading to remove
the {"train": "train[:5%]"} case and only parametrize the new string form and
the {"calibration": "train[:5%]"} dict, change
test_split_dict_emits_deprecation_warning to only parametrize {"calibration":
"train[:5%]"} (remove {"train": ...}), and change
test_split_invalid_type_raises_value_error to assert only ValueError (remove
TypeError) when calling get_processed_dataset with an invalid splits type;
reference DatasetArguments and get_processed_dataset in these tests to locate
and modify the failing cases.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 34b76517-75d6-4345-8a30-0cae0c3fa6e3
📒 Files selected for processing (21)
examples/disk_offloading/kimi_k2_example.pyexamples/disk_offloading/qwen3_example.pyexamples/imatrix/llama3_imatrix_example.pyexamples/multimodal_vision/llava_example.pyexamples/multimodal_vision/mistral3_example.pyexamples/multimodal_vision/mllama_example.pyexamples/multimodal_vision/pixtral_example.pysrc/llmcompressor/args/dataset_arguments.pysrc/llmcompressor/datasets/__init__.pysrc/llmcompressor/datasets/utils.pytests/llmcompressor/modifiers/transform/imatrix/test_e2e_integration.pytests/llmcompressor/modifiers/transform/smoothquant/test_base.pytests/llmcompressor/transformers/compression/test_compress_tensor_utils.pytests/llmcompressor/transformers/compression/test_quantization.pytests/llmcompressor/transformers/compression/test_recipe_parsing.pytests/llmcompressor/transformers/data/test_dataset_helpers.pytests/llmcompressor/transformers/data/test_dataset_loading.pytests/llmcompressor/transformers/gptq/test_gptq_oneshot.pytests/llmcompressor/transformers/kv_cache/test_kv_cache.pytests/llmcompressor/transformers/sparsegpt/test_oneshot_with_modifier.pytests/llmcompressor/transformers/sparsegpt/test_sparsegpt_completion.py
💤 Files with no reviewable changes (2)
- src/llmcompressor/datasets/init.py
- tests/llmcompressor/transformers/data/test_dataset_helpers.py
HDCharles
left a comment
There was a problem hiding this comment.
see comments and bot comments
ee23c82 to
ee01de7
Compare
|
Addressed the requested changes, thanks! |
|
@coderabbitai re-review |
|
✅ Actions performedFull review triggered. |
|
Looks like the sparse test is failing? @arpitkh101 |
|
This pull request has merge conflicts that must be resolved before it can be |
05aec86 to
c8b64e5
Compare
Merge ProtectionsYour pull request matches the following merge protections and will not be merged until they are valid. 🔴 Require two reviewsWaiting for:
This rule is failing.PRs labelled "two-reviews" must have at least two approving reviews before merging.
|
|
This pull request has merge conflicts that must be resolved before it can be |
|
@arpitkh101 can you resolve the merge conflicts? |
Signed-off-by: Arpit <arpit@example.com>
…i-split handling Signed-off-by: Arpit <arpit@example.com>
04ee4a7 to
0a49b81
Compare
|
I've rebased on latest main, Thanks! |
Summary
Closes #2551
Simplifies the
splitsinterface in get_processed_dataset by removingmulti-split dict handling in favour of a plain string argument.
Examples & Tests
splits="train[:N]"string format.test_dataset_helpers.py(helpers no longer exist).{"calibration": ...}dict backward compatsplits=NonereturnsNone(data-free flow)ValueErrorBefore / After