add example of w8a8fp8 for qwen3.5 #2631
base: main
examples/quantization_w8a8_fp8/qwen3_5_example.py (new file, +45 lines):

```python
import torch
from compressed_tensors.utils import save_mtp_tensors_to_checkpoint
from datasets import load_dataset
from transformers import AutoProcessor, Qwen3_5MoeForConditionalGeneration

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

# NOTE: This example requires transformers >= v5

MODEL_ID = "Qwen/Qwen3.5-122B-A10B"

# Load model.
model = Qwen3_5MoeForConditionalGeneration.from_pretrained(MODEL_ID, dtype="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)

# No need to include MTP layers, as they are not loaded
# through Qwen3_5MoeForConditionalGeneration.
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=[
        "re:.*lm_head",
        "re:visual.*",
        "re:model.visual.*",
        "re:.*mlp.gate$",
        "re:.*embed_tokens$",
        "re:.*shared_expert_gate$",
        "re:.*linear_attn.*",
    ],
```
Comment on lines +22 to +30

Contributor: [comment truncated] The suggested `ignore` list:

```python
ignore=[
    "re:.*lm_head",
    "re:.*mlp.gate$",
    "re:.*embed_tokens$",
    "re:.*shared_expert_gate$",
],
```
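The `re:` prefix in these entries marks a Python regular expression matched against module names. As a standalone sketch of how such patterns filter layers (the module names below are hypothetical, and anchoring with `fullmatch` is an assumption, not necessarily how the library matches):

```python
import re

# Suggested ignore patterns, with the "re:" prefix stripped.
ignore = [
    r".*lm_head",
    r".*mlp.gate$",
    r".*embed_tokens$",
    r".*shared_expert_gate$",
]

def is_ignored(name: str) -> bool:
    # Assumption: a pattern must describe the whole module name.
    return any(re.fullmatch(p, name) for p in ignore)

# Hypothetical module names of the kind produced by model.named_modules():
print(is_ignored("lm_head"))                                 # True
print(is_ignored("model.layers.0.mlp.gate"))                 # True
print(is_ignored("model.layers.0.mlp.experts.0.gate_proj"))  # False
```

Note how the `$`-anchored patterns keep expert projection layers like `gate_proj` quantized while excluding the MoE router `gate` modules themselves.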
```python
)

# Apply quantization.
oneshot(model=model, recipe=recipe)
```
Comment on lines +34 to +36

Contributor: [suggested change truncated]

Comment on lines +34 to +36

Contributor: 🧩 Analysis chain (read-only verification scripts elided): the scripts traced the `recipe` argument from `oneshot` through `parse_args` in `src/llmcompressor/args/utils.py`, `RecipeArguments` in `src/llmcompressor/args/recipe_arguments.py`, and `Recipe.create_instance` in `src/llmcompressor/recipe/recipe.py`, confirming that `Modifier` objects such as `QuantizationModifier` are accepted at runtime even though `RecipeArguments.recipe` is annotated as `Optional[str]`. Update the type hints or documentation to support `Modifier` inputs.
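A minimal sketch of the widened annotation this comment asks for; the `Modifier` stand-in and the `normalize_recipe` helper below are hypothetical illustrations, not llmcompressor's actual definitions:

```python
from dataclasses import dataclass
from typing import List, Optional, Union

class Modifier:
    """Hypothetical stand-in for llmcompressor's Modifier base class."""

@dataclass
class RecipeArguments:
    # Widened from Optional[str] to document the inputs the runtime accepts:
    # a recipe path/string, a single modifier, or a list of modifiers.
    recipe: Optional[Union[str, Modifier, List[Modifier]]] = None

def normalize_recipe(recipe):
    """Coerce any accepted input into a path string or a list of modifiers."""
    if recipe is None or isinstance(recipe, str):
        return recipe
    if isinstance(recipe, Modifier):
        return [recipe]
    return list(recipe)

print(type(normalize_recipe(Modifier())))  # a single modifier becomes a list
```

Documenting the union in the dataclass (rather than silently accepting objects through a `str`-typed field) keeps IDEs and type checkers in agreement with the runtime behavior.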
```python

# Save to disk in compressed-tensors format.
SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-FP8-Dynamic"
model.save_pretrained(SAVE_DIR)
processor.save_pretrained(SAVE_DIR)

# MTP layers are excluded from the model through Qwen3_5MoeForConditionalGeneration.
# Save them as-is from the original checkpoint into the quantized output.
save_mtp_tensors_to_checkpoint(source_model=MODEL_ID, dest_dir=SAVE_DIR)
```
Contributor: The imports `torch` and `load_dataset` are not used in this script. Since `oneshot` can accept a dataset name as a string, `load_dataset` is unnecessary. These should be removed to keep the example clean.
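In practice a linter (e.g. flake8's F401 check) catches unused imports like these automatically. As a self-contained sketch of the idea, a stdlib `ast` pass over a snippet resembling the example's header (the source string below is illustrative, not the PR's exact file) finds names that are imported but never referenced:

```python
import ast

source = '''
import torch
from datasets import load_dataset
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("some/model")
'''

tree = ast.parse(source)

# Collect the names each import statement binds at module scope.
imported = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Import):
        imported.update(a.asname or a.name.split(".")[0] for a in node.names)
    elif isinstance(node, ast.ImportFrom):
        imported.update(a.asname or a.name for a in node.names)

# Collect every bare name actually referenced in the module body.
used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}

unused = sorted(imported - used)
print(unused)  # → ['load_dataset', 'torch']
```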