# [Docs] Add supported model tables to pretrain_sft advanced tutorial #1728
**New file** (symlink):

```diff
@@ -0,0 +1 @@
+../.claude/skills
```
**`SKILL.md`** (new file, 123 lines):

```yaml
---
name: xtuner-sync-supported-models
description: Synchronize xtuner's supported model documentation (docs/en/pretrain_sft/advanced_tutorial/model.md and docs/zh_cn/pretrain_sft/advanced_tutorial/model.md) with the actual Config classes defined under xtuner/v1/model/. Use when (1) new TransformerConfig, MoEConfig, or BaseComposeConfig subclasses are added, removed, or renamed in xtuner/v1/model/, (2) existing model configs change their inheritance hierarchy, scale, or HuggingFace counterpart, or (3) a code review or user request points out that model.md is out of sync with the codebase.
---
```

# Update XTuner Supported Model Docs

Keep the English and Chinese `model.md` files synchronized with the actual Config classes in `xtuner/v1/model/`.

## Scan the Codebase

Run the bundled scan script from the xtuner project root to discover all Config classes and their inheritance:

```bash
python3 .agents/skills/xtuner-sync-supported-models/scripts/scan_model_configs.py
```

The script outputs JSON with two keys:
- `configs`: list of every `*Config` class under `xtuner/v1/model/` with its parent classes and file path
- `children`: parent-to-children mapping for the hierarchy tree
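As an illustrative sketch of consuming that output (only the `configs`/`children` keys and the per-entry `class`/`parents`/`file` fields come from the description above; the concrete values and the file path below are hypothetical):

```python
import json

# Hypothetical sample of the scan output shape described above.
sample = json.loads("""
{
  "configs": [
    {"class": "Qwen3Dense8BConfig",
     "parents": ["Qwen3DenseConfig"],
     "file": "xtuner/v1/model/qwen3.py"}
  ],
  "children": {
    "Qwen3DenseConfig": ["Qwen3Dense8BConfig"]
  }
}
""")

# List every config together with its direct parents.
for cfg in sample["configs"]:
    print(f"{cfg['class']} <- {', '.join(cfg['parents'])}")

# Look up the children of a family base via the parent-to-children map.
print(sample["children"]["Qwen3DenseConfig"])
```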
|
|
## What to Update

Compare the script output against the two files:
- `docs/en/pretrain_sft/advanced_tutorial/model.md`
- `docs/zh_cn/pretrain_sft/advanced_tutorial/model.md`

Both files share the same structure and must stay in sync:

1. **Base Config Classes** — configs that directly inherit from `TransformerConfig` (or `MoEConfig`) and provide a `from_hf` classmethod for loading HuggingFace weights
2. **Concrete Model Configs** — fixed-scale subclasses of the base configs above
3. **Compose Models** — multimodal configs that inherit from `BaseComposeConfig`
4. **Inheritance Hierarchy** — a text tree showing the full `XTunerBaseModelConfig` hierarchy

### Rules for the Base Config table

Include these direct descendants of `TransformerConfig`/`MoEConfig`:
- `Qwen2DenseConfig`
- `Qwen3DenseConfig`
- `DeepSeekV3Config`
- `GptOssConfig`
- `Qwen3MoEConfig`

Exclude from the base table:
- `MoEConfig` — it is an intermediate base class, not a usable model family
- `Qwen3_5_VLTextMoEConfig` — it is an intermediate base with only one concrete child; its child `Qwen3_5_VLTextMoE35BA3BConfig` belongs under the MoE concrete table
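The include/exclude rules above can be sketched as a small filter over the scan output's `children` map (the map literal below is an abbreviated, hypothetical sample; the exclusion set comes from the two rules above):

```python
# Abbreviated, hypothetical children map as produced by the scan script.
children = {
    "TransformerConfig": ["Qwen2DenseConfig", "Qwen3DenseConfig", "MoEConfig"],
    "MoEConfig": ["DeepSeekV3Config", "GptOssConfig", "Qwen3MoEConfig",
                  "Qwen3_5_VLTextMoEConfig"],
    "Qwen3_5_VLTextMoEConfig": ["Qwen3_5_VLTextMoE35BA3BConfig"],
}

# Intermediate bases that must not appear in the Base Config table.
EXCLUDED = {"MoEConfig", "Qwen3_5_VLTextMoEConfig"}

base_table = sorted(
    cls
    for parent in ("TransformerConfig", "MoEConfig")
    for cls in children.get(parent, [])
    if cls not in EXCLUDED
)
# -> the five families listed above
print(base_table)
```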
|
|
### Rules for the Concrete Model table

Include every concrete subclass that has fixed parameter defaults. For each row note:
- `Config Class`
- `Base Class / Family`
- `Architecture Type`: `Dense`, `MoE`, `Dense (VL backbone)`, `MoE (VL backbone)`
- `Scale / Notes`: parameter count or total/activated size; for VL backbones note "for multimodal"

`DeepSeekV3Config` appears here even though it has no separate base entry (it is both base and concrete).

### Rules for the Compose Models section
Include two sub-tables:
1. **Compose Base Config Classes** — `Qwen3VLBaseConfig`, `InternVLBaseConfig`, `InternS1BaseConfig`
   - `Qwen3VLBaseConfig`: VL model based on the Qwen3 text backbone
   - `InternVLBaseConfig`: VL model based on InternViT + Qwen3
   - `InternS1BaseConfig`: science multimodal model based on InternViT + Qwen3
2. **Concrete Compose Model Configs** — every subclass of the above bases; for each row note the wrapped `Text Config` and scale
|
|
### Rules for the Inheritance Hierarchy tree

Rebuild the tree from `XTunerBaseModelConfig` with two top-level branches:

```text
XTunerBaseModelConfig
├── TransformerConfig
│   ├── Dense Models
│   │   ├── Qwen2DenseConfig
│   │   │   └── Qwen2Dense7BConfig
│   │   └── Qwen3DenseConfig
│   │       ├── Qwen3Dense8BConfig
│   │       ├── Qwen3Dense4BConfig
│   │       ├── Qwen3Dense0P6BConfig
│   │       ├── Qwen3VLTextDense4BConfig
│   │       └── Qwen3VLTextDense8BConfig
│   └── MoE Models (via MoEConfig)
│       ├── DeepSeekV3Config
│       ├── GptOssConfig
│       │   ├── GptOss21BA3P6Config
│       │   └── GptOss117BA5P8Config
│       ├── Qwen3MoEConfig
│       │   ├── Qwen3MoE30BA3Config
│       │   ├── Qwen3MoE235BA22Config
│       │   ├── Qwen3MoEFoPEConfig
│       │   ├── Qwen3VLTextMoE30BA3Config
│       │   └── Qwen3VLTextMoE235BA22Config
│       └── Qwen3_5_VLTextMoEConfig
│           └── Qwen3_5_VLTextMoE35BA3BConfig
└── BaseComposeConfig
    ├── Qwen3VLBaseConfig
    │   ├── Qwen3VLMoE30BA3Config
    │   ├── Qwen3VLMoE235BA22Config
    │   ├── Qwen3VLDense4BConfig
    │   ├── Qwen3VLDense8BConfig
    │   └── Qwen3_5_BaseConfig
    │       └── Qwen3_5_VLMoE35BA3Config
    ├── InternVLBaseConfig
    │   ├── InternVL3P5Dense8BConfig
    │   ├── InternVL3P5MoE30BA3Config
    │   └── InternVL3P5Dense1BConfig
    └── InternS1BaseConfig
        ├── InternS1Config
        └── InternS1MiniConfig
```

When new configs are added, insert them into the appropriate branch following the same indentation style.
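One way to keep that indentation consistent is to regenerate the branch glyphs programmatically. A minimal sketch (the `render_tree` helper is illustrative, not part of the skill; the `excerpt` map is a small slice of the tree above):

```python
def render_tree(name: str, children: dict[str, list[str]], prefix: str = "") -> list[str]:
    """Render a parent->children map as box-drawing lines like the tree above."""
    lines = [name] if prefix == "" else []
    kids = children.get(name, [])
    for i, kid in enumerate(kids):
        last = i == len(kids) - 1
        connector = "└── " if last else "├── "
        lines.append(prefix + connector + kid)
        # Children of a last sibling get plain spaces; otherwise a vertical bar.
        extension = "    " if last else "│   "
        lines.extend(render_tree(kid, children, prefix + extension))
    return lines

# Small excerpt of the hierarchy from this document.
excerpt = {
    "Dense Models": ["Qwen2DenseConfig"],
    "Qwen2DenseConfig": ["Qwen2Dense7BConfig"],
}
print("\n".join(render_tree("Dense Models", excerpt)))
```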
|
|
## Translation Notes

Keep the Chinese `model.md` (`docs/zh_cn/...`) structurally identical to the English one. Translate:
- Section headings
- Table header cells
- Description cells (e.g., "Image / Video + Text" → "图像/视频 + 文本")
- Scale descriptions (e.g., "~7B parameters" → "约 7B 参数", "FoPE variant" → "FoPE 变体")

Do **not** translate Config class names, file paths, or code identifiers.
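A tiny consistency check over those rules (the glossary uses only the example pairs cited above; the helper is illustrative and not part of the skill):

```python
# Example pairs from the translation rules above; code identifiers are
# deliberately absent from the glossary.
GLOSSARY = {
    "Image / Video + Text": "图像/视频 + 文本",
    "~7B parameters": "约 7B 参数",
    "FoPE variant": "FoPE 变体",
}

def translate_cell(cell: str) -> str:
    """Translate known description cells; leave backticked code identifiers alone."""
    if cell.startswith("`") and cell.endswith("`"):
        return cell  # never translate code identifiers
    return GLOSSARY.get(cell, cell)

print(translate_cell("~7B parameters"))    # -> 约 7B 参数
print(translate_cell("`Qwen3MoEConfig`"))  # unchanged
```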
**`scan_model_configs.py`** (new file, 71 lines):

```python
#!/usr/bin/env python3
"""Scan xtuner/v1/model for all Config classes and output model info as JSON."""

import json
import re
import sys
from pathlib import Path

# We care about configs that are part of the supported model hierarchy
RELEVANT_BASES = {
    "TransformerConfig",
    "MoEConfig",
    "BaseComposeConfig",
    "XTunerBaseModelConfig",
    # Known intermediate/family bases
    "Qwen2DenseConfig",
    "Qwen3DenseConfig",
    "Qwen3MoEConfig",
    "Qwen3_5_VLTextMoEConfig",
    "GptOssConfig",
    "DeepSeekV3Config",
    "Qwen3VLBaseConfig",
    "Qwen3_5_BaseConfig",
    "InternVLBaseConfig",
    "InternS1BaseConfig",
}


def scan_file(path: Path):
    text = path.read_text()
    # Match class definitions like: class FooConfig(BarConfig):
    pattern = r"^class\s+(\w+Config)\s*\(([^)]+)\):"
    results = []
    for m in re.finditer(pattern, text, re.MULTILINE):
        class_name = m.group(1)
        parents = [p.strip() for p in m.group(2).split(",")]
        results.append({"class": class_name, "parents": parents, "file": str(path)})
    return results


def main():
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    model_dir = root / "xtuner" / "v1" / "model"
    if not model_dir.exists():
        print(f"Model directory not found: {model_dir}", file=sys.stderr)
        sys.exit(1)

    all_configs = []
    for py_file in sorted(model_dir.rglob("*.py")):
        all_configs.extend(scan_file(py_file))

    # Build parent -> children map
    children: dict[str, list[str]] = {}
    for cfg in all_configs:
        for p in cfg["parents"]:
            if p in RELEVANT_BASES or p.endswith("Config"):
                children.setdefault(p, []).append(cfg["class"])

    # Deduplicate
    for k in children:
        children[k] = sorted(set(children[k]))

    output = {
        "configs": all_configs,
        "children": children,
    }
    print(json.dumps(output, indent=2, ensure_ascii=False))


if __name__ == "__main__":
    main()
```

> **Contributor review comment (on lines +53 to +55):** Claude: Nit — The condition …
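As a quick check that the class-definition pattern behaves as described (the sample source below is made up for illustration):

```python
import re

# Same pattern as in scan_model_configs.py above.
PATTERN = r"^class\s+(\w+Config)\s*\(([^)]+)\):"

sample_source = '''
class Qwen3DenseConfig(TransformerConfig):
    pass

class Qwen3Dense8BConfig(Qwen3DenseConfig):
    pass

class Trainer(object):  # name does not end in "Config", so it is skipped
    pass
'''

matches = [
    (m.group(1), [p.strip() for p in m.group(2).split(",")])
    for m in re.finditer(PATTERN, sample_source, re.MULTILINE)
]
print(matches)
```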
**`docs/en/pretrain_sft/advanced_tutorial/model.md`** (the `Coming soon...` placeholder is replaced with the content below):

# Model

XTuner v1's `TrainEngine` supports a variety of Transformer architectures through different `TransformerConfig` subclasses. The documentation below summarizes the currently supported models (RL-related configs are excluded).

## Base Config Classes

The following table lists the **base config classes** that define each model family. They provide the `from_hf` interface for loading pretrained weights from HuggingFace.

| Base Config Class | Model Family | Architecture Type | HuggingFace Counterpart |
|---|---|---|---|
| `Qwen2DenseConfig` | Qwen2 Dense | Dense | `Qwen2ForCausalLM` |
| `Qwen3DenseConfig` | Qwen3 Dense | Dense | `Qwen3ForCausalLM` |
| `DeepSeekV3Config` | DeepSeek-V3 | MoE | `DeepseekV3ForCausalLM` |
| `GptOssConfig` | GPT-OSS | MoE | `GptOssForCausalLM` |
| `Qwen3MoEConfig` | Qwen3 MoE | MoE | `Qwen3MoeForCausalLM` |
|
|
## Concrete Model Configs

The following table lists the **concrete model configs** that inherit from the base classes above. Each config corresponds to a specific model scale or variant.

| Config Class | Base Class / Family | Architecture Type | Scale / Notes |
|---|---|---|---|
| `Qwen2Dense7BConfig` | `Qwen2DenseConfig` | Dense | ~7B parameters |
| `Qwen3Dense8BConfig` | `Qwen3DenseConfig` | Dense | ~8B parameters |
| `Qwen3Dense4BConfig` | `Qwen3DenseConfig` | Dense | ~4B parameters |
| `Qwen3Dense0P6BConfig` | `Qwen3DenseConfig` | Dense | ~0.6B parameters |
| `Qwen3VLTextDense4BConfig` | `Qwen3DenseConfig` | Dense (VL backbone) | ~4B parameters, for multimodal |
| `Qwen3VLTextDense8BConfig` | `Qwen3DenseConfig` | Dense (VL backbone) | ~8B parameters, for multimodal |
| `DeepSeekV3Config` | — | MoE | ~671B total / ~37B activated |
| `GptOss21BA3P6Config` | `GptOssConfig` | MoE | ~21B total / ~3.6B activated |
| `GptOss117BA5P8Config` | `GptOssConfig` | MoE | ~117B total / ~5.8B activated |
| `Qwen3MoE30BA3Config` | `Qwen3MoEConfig` | MoE | ~30B total / ~3B activated |
| `Qwen3MoE235BA22Config` | `Qwen3MoEConfig` | MoE | ~235B total / ~22B activated |
| `Qwen3MoEFoPEConfig` | `Qwen3MoEConfig` | MoE | FoPE (Frequency-based Position Embedding) variant |
| `Qwen3VLTextMoE30BA3Config` | `Qwen3MoEConfig` | MoE (VL backbone) | ~30B total, for multimodal |
| `Qwen3VLTextMoE235BA22Config` | `Qwen3MoEConfig` | MoE (VL backbone) | ~235B total, for multimodal |
| `Qwen3_5_VLTextMoE35BA3BConfig` | `Qwen3_5_VLTextMoEConfig` | MoE (VL backbone) | ~35B total / ~3B activated, for multimodal |

> **Contributor review comment (on lines +27 to +32):** Claude: Critical — incorrect inheritance in table and tree. The "Base Class / Family" for these VL text backbone configs is listed as the family base, but they actually inherit from the concrete scale configs. The same error appears in the inheritance hierarchy tree at the bottom of this file (and in the Chinese version and the SKILL.md file).
>
> **Author reply:** @claude here we only find the base class of such config, and it can indicate the config family of the config; if you agree with me, please resolve this conversation.
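The scale column follows a naming convention visible in the class names themselves: `P` marks a decimal point, and `A` separates total size from activated size in MoE names. A decoding sketch (the convention is inferred from the table above, not documented in the codebase; the helper is illustrative):

```python
import re

def decode_scale(name: str) -> str:
    """Decode size tokens like '21BA3P6' or '0P6B' from a config class name.

    Convention inferred from the table above: 'P' is a decimal point and
    'A' separates total from activated size in MoE names.
    """
    m = re.search(r"(\d+(?:P\d+)?)B(?:A(\d+(?:P\d+)?))?B?", name)
    if not m:
        return "unknown"
    total = m.group(1).replace("P", ".")
    if m.group(2):
        activated = m.group(2).replace("P", ".")
        return f"~{total}B total / ~{activated}B activated"
    return f"~{total}B parameters"

print(decode_scale("GptOss21BA3P6Config"))   # -> ~21B total / ~3.6B activated
print(decode_scale("Qwen3Dense0P6BConfig"))  # -> ~0.6B parameters
```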
|
|
## Compose Models

In addition to pure text models, XTuner also supports **multimodal compose models** that combine a vision encoder, a projector, and a language model. These configs inherit from `BaseComposeConfig` rather than `TransformerConfig` directly, but they wrap the text configs listed above.

### Compose Base Config Classes

| Base Config Class | Model Family | Modality | Description |
|---|---|---|---|
| `Qwen3VLBaseConfig` | Qwen3-VL | Image / Video + Text | VL model based on Qwen3 text backbone |
| `InternVLBaseConfig` | InternVL | Image + Text | VL model based on InternViT + Qwen3 |
| `InternS1BaseConfig` | InternS1 | Image + Text | Science multimodal model based on InternViT + Qwen3 |

### Concrete Compose Model Configs

| Config Class | Compose Base / Family | Text Config | Scale / Notes |
|---|---|---|---|
| `Qwen3VLMoE30BA3Config` | `Qwen3VLBaseConfig` | `Qwen3VLTextMoE30BA3Config` | ~30B total, MoE VL |
| `Qwen3VLMoE235BA22Config` | `Qwen3VLBaseConfig` | `Qwen3VLTextMoE235BA22Config` | ~235B total, MoE VL |
| `Qwen3VLDense4BConfig` | `Qwen3VLBaseConfig` | `Qwen3VLTextDense4BConfig` | ~4B parameters, Dense VL |
| `Qwen3VLDense8BConfig` | `Qwen3VLBaseConfig` | `Qwen3VLTextDense8BConfig` | ~8B parameters, Dense VL |
| `Qwen3_5_VLMoE35BA3Config` | `Qwen3_5_BaseConfig` | `Qwen3_5_VLTextMoE35BA3BConfig` | ~35B total / ~3B activated, MoE VL |
| `InternVL3P5Dense8BConfig` | `InternVLBaseConfig` | `Qwen3Dense8BConfig` | ~8B parameters, Dense VL |
| `InternVL3P5MoE30BA3Config` | `InternVLBaseConfig` | `Qwen3MoE30BA3Config` | ~30B total, MoE VL |
| `InternVL3P5Dense1BConfig` | `InternVLBaseConfig` | `Qwen3Dense0P6BConfig` | ~1B parameters, Dense VL |
| `InternS1Config` | `InternS1BaseConfig` | `Qwen3MoE235BA22Config` | ~235B total, MoE multimodal |
| `InternS1MiniConfig` | `InternS1BaseConfig` | `Qwen3Dense8BConfig` | ~8B parameters, Dense multimodal |
## Inheritance Hierarchy

The following diagram shows the complete inheritance hierarchy of all config classes supported by `TrainEngine`, including both `TransformerConfig` and `BaseComposeConfig` branches.

```text
XTunerBaseModelConfig
├── TransformerConfig
│   ├── Dense Models
│   │   ├── Qwen2DenseConfig
│   │   │   └── Qwen2Dense7BConfig
│   │   └── Qwen3DenseConfig
│   │       ├── Qwen3Dense8BConfig
│   │       ├── Qwen3Dense4BConfig
│   │       ├── Qwen3Dense0P6BConfig
│   │       ├── Qwen3VLTextDense4BConfig
│   │       └── Qwen3VLTextDense8BConfig
│   └── MoE Models (via MoEConfig)
│       ├── DeepSeekV3Config
│       ├── GptOssConfig
│       │   ├── GptOss21BA3P6Config
│       │   └── GptOss117BA5P8Config
│       ├── Qwen3MoEConfig
│       │   ├── Qwen3MoE30BA3Config
│       │   ├── Qwen3MoE235BA22Config
│       │   ├── Qwen3MoEFoPEConfig
│       │   ├── Qwen3VLTextMoE30BA3Config
│       │   └── Qwen3VLTextMoE235BA22Config
│       └── Qwen3_5_VLTextMoEConfig
│           └── Qwen3_5_VLTextMoE35BA3BConfig
└── BaseComposeConfig
    ├── Qwen3VLBaseConfig
    │   ├── Qwen3VLMoE30BA3Config
    │   ├── Qwen3VLMoE235BA22Config
    │   ├── Qwen3VLDense4BConfig
    │   ├── Qwen3VLDense8BConfig
    │   └── Qwen3_5_BaseConfig
    │       └── Qwen3_5_VLMoE35BA3Config
    ├── InternVLBaseConfig
    │   ├── InternVL3P5Dense8BConfig
    │   ├── InternVL3P5MoE30BA3Config
    │   └── InternVL3P5Dense1BConfig
    └── InternS1BaseConfig
        ├── InternS1Config
        └── InternS1MiniConfig
```

> **Contributor review comment (on lines +72 to +82):** Claude: Critical — inheritance tree does not match actual code. The VL text backbone configs are shown as direct children of the family base, but they actually inherit from the concrete scale configs. The corrected tree should be: … Similarly for the MoE section below, …
> **Contributor review comment:** Claude: Warning — missing type hints on function signatures. Per CLAUDE.md: "All new code must include type hints for function signatures (parameters and return types)." `scan_file` and `main` are both missing return type annotations and parameter type hints, and `main` should have `-> None`.