Claude /skills exploration

Inspired by Liger's `liger-kernel-dev` Claude skill ([commit](https://github.com/linkedin/Liger-Kernel/commit/0169c4def22f4420615195c7706218552c7eb213)).
The reusable pattern: **analyze input → structured profile → generate known files → hard validation gate.**

**Underlying Logic:** Good skill candidates are tasks that (a) touch the same handful of files every time and (b) have a
correctness check expressible as concrete commands or a generated test.

---

## 1. Liger upstream-sync  

**Problem:** `integrations/liger/plugin.py` reconciles upstream `MODEL_TYPE_TO_APPLY_LIGER_FN` against a
hand-maintained override set (`axolotl_override_liger_fn`, line ~85) and a long `elif model_config_type ==`
chain of bespoke patches (qwen3_5, gemma4, jamba, deepseek_v2, llama4…). When Liger ships native support for
a type Axolotl still hand-patches, **CI stays green while the wrong kernel set is applied** — a silent
correctness/perf regression, not a version error. (Seen for real in the liger-0.8.0 bump: qwen3_5 /
qwen3_5_moe / gemma4_text override shadowing.)

**Skill:** diff installed `MODEL_TYPE_TO_APPLY_LIGER_FN` against the `elif` chain + override set; report
"upstream now natively supports X, Axolotl still hand-patches it" and "upstream changed the
`apply_liger_kernel_to_X` signature (stale `swiglu`/`glu` param check)."

**Validation gate:** for each covered `model_config_type`, assert which path fires (native vs override) and
that the expected modules (RMSNorm, SwiGLUMLP, FLCE forward) were actually swapped.
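A minimal sketch of the diff step, assuming the hand-patched types can be recovered by parsing the plugin source for comparisons against `model_config_type`. The helper names (`hand_patched_types`, `sync_report`) and the toy inputs are hypothetical; in practice the upstream set would come from the keys of the installed `MODEL_TYPE_TO_APPLY_LIGER_FN` and the override set from `axolotl_override_liger_fn`.

```python
import ast


def _is_target(node: ast.expr, var_name: str) -> bool:
    """True for `model_config_type` or `<anything>.model_config_type`."""
    return (isinstance(node, ast.Name) and node.id == var_name) or (
        isinstance(node, ast.Attribute) and node.attr == var_name
    )


def hand_patched_types(plugin_source: str, var_name: str = "model_config_type") -> set[str]:
    """Collect string literals compared to `var_name` with `==` or `in`."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(plugin_source)):
        if not (isinstance(node, ast.Compare) and _is_target(node.left, var_name)):
            continue
        for op, comp in zip(node.ops, node.comparators):
            if isinstance(op, ast.Eq):
                literals = [comp]
            elif isinstance(op, ast.In) and isinstance(comp, (ast.Tuple, ast.List, ast.Set)):
                literals = comp.elts
            else:
                continue
            found.update(
                lit.value
                for lit in literals
                if isinstance(lit, ast.Constant) and isinstance(lit.value, str)
            )
    return found


def sync_report(upstream_types, override_types, patched_types) -> dict[str, list[str]]:
    """Bucket model types by how upstream and Axolotl coverage overlap."""
    upstream, overrides, patched = set(upstream_types), set(override_types), set(patched_types)
    return {
        # Upstream has a native entry, but the override set routes around it.
        "override_shadows_upstream": sorted(upstream & overrides),
        # Upstream has a native entry, but a hand-written branch still exists.
        "upstream_native_but_hand_patched": sorted(upstream & patched),
        # Only Axolotl patches these; nothing to retire yet.
        "axolotl_only": sorted(patched - upstream),
    }


# Toy inputs for illustration only; not the real plugin source or upstream map.
toy_source = '''
if model_config_type == "jamba":
    pass
elif model_config_type in ("gemma4", "llama4"):
    pass
'''
report = sync_report(
    upstream_types={"llama", "gemma4", "qwen3_5"},
    override_types={"qwen3_5"},
    patched_types=hand_patched_types(toy_source),
)
```

The first two buckets are the ones a reviewer would need to act on after a Liger bump. The signature-drift check (stale `swiglu`/`glu` parameters) is not covered here.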

---

## 2. Prompt-strategy / dataset-format authoring  

**Problem:** The most repetitive contributor task. A dataset `type:` resolves to a module via importlib in
`prompt_strategies/__init__.py`, and every new format touches the same three places: a module in
`prompt_strategies/`, a golden test in `tests/prompt_strategies/`, and a doc page in `docs/dataset-formats/`.

**Skill:** scaffold the strategy module, register the `type:`, generate the doc page, and emit a golden
round-trip test.

**Validation gate:** `axolotl preprocess --debug` for label-masking inspection + a generated test asserting
exact token IDs **and** the masked-label span (the loss-masking bug reviewers can't eyeball).
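One possible shape for the generated golden check, assuming a tokenized example is a dict with `input_ids` and `labels`, masked positions carry `-100` (the default `ignore_index` of PyTorch's cross-entropy loss), and unmasked labels equal their input tokens. The helper names are hypothetical.

```python
IGNORE_INDEX = -100  # assumed mask value; default ignore_index of torch.nn.CrossEntropyLoss


def masked_spans(labels, ignore_index=IGNORE_INDEX):
    """Return half-open (start, end) index ranges where labels are masked."""
    spans, start = [], None
    for i, label in enumerate(labels):
        if label == ignore_index and start is None:
            start = i  # a masked run begins
        elif label != ignore_index and start is not None:
            spans.append((start, i))  # a masked run ends just before i
            start = None
    if start is not None:
        spans.append((start, len(labels)))  # run extends to the end
    return spans


def assert_golden(example, expected_input_ids, expected_masked_spans):
    """Golden check: exact token IDs, exact masked spans, label/ID agreement."""
    input_ids, labels = example["input_ids"], example["labels"]
    assert input_ids == expected_input_ids
    assert len(labels) == len(input_ids)
    assert masked_spans(labels) == expected_masked_spans
    # Assumption: every unmasked label equals the token at the same position.
    assert all(l == t for l, t in zip(labels, input_ids) if l != IGNORE_INDEX)
```

Asserting spans rather than the raw label list keeps the failure message readable: a reviewer sees which turn was masked or unmasked by mistake.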

---

## 3. Integration / plugin scaffold  

**Problem:** `BasePlugin` exposes ~20 lifecycle hooks (`pre_model_load`, `post_model_build`,
`get_input_args`, `get_trainer_cls`, `create_optimizer`…). Contributors silently skip hooks.

**Skill:** scaffold `integrations/<name>/{__init__.py, args.py, README.md}`, wire `get_input_args`, add a
config example + load/parse smoke test.

**Validation gate:** plugin loads, args parse, lifecycle hooks no-op cleanly.
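To make skipped hooks visible rather than silent, the smoke test could report which hooks a plugin overrides and which it inherits. A rough sketch: `StandInBasePlugin` is a hypothetical stand-in whose signatures are invented for illustration, and the real check would pass Axolotl's `BasePlugin` instead.

```python
import inspect


def hook_coverage(base_cls, plugin_cls):
    """Split the base class's public methods into overridden vs inherited."""
    hooks = [
        name
        for name, _ in inspect.getmembers(base_cls, inspect.isfunction)
        if not name.startswith("_")
    ]
    # An inherited method resolves to the very same function object as the base's.
    overridden = [h for h in hooks if getattr(plugin_cls, h) is not getattr(base_cls, h)]
    return {
        "overridden": overridden,
        "inherited": [h for h in hooks if h not in overridden],
    }


class StandInBasePlugin:
    """Hypothetical stand-in; hook names from this issue, signatures assumed."""

    def get_input_args(self):
        return None

    def pre_model_load(self, cfg):
        return None

    def post_model_build(self, cfg, model):
        return None


class MyPlugin(StandInBasePlugin):
    def get_input_args(self):
        return "my_plugin.args.MyPluginArgs"  # hypothetical dotted path


coverage = hook_coverage(StandInBasePlugin, MyPlugin)
```

The scaffold could print the `inherited` list in the generated README so the contributor confirms each omission is deliberate.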

---

## 4. Config-field addition

**Problem:** Bounded but easy to under-test in a config-driven project. `docs/config-reference` is
auto-generated (`docs/scripts/generate_config_docs.py`), so the surface is: Pydantic schema in
`utils/schemas/`, validator, regen docs, test.

**Skill:** add the field + validator, regenerate docs, scaffold a schema test.

**Validation gate:** schema round-trips, validator rejects bad combos, docs regenerate clean.
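A sketch of what the scaffolded schema test might look like, assuming Pydantic v2. `ExampleConfig` and `example_new_flag` are hypothetical and do not mirror the real classes in `utils/schemas/`.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class ExampleConfig(BaseModel):
    # Hypothetical fields, for illustration only.
    sample_packing: Optional[bool] = None
    example_new_flag: Optional[bool] = None

    @model_validator(mode="after")
    def check_example_new_flag(self):
        # Invented rule: the new flag only makes sense with packing enabled.
        if self.example_new_flag and not self.sample_packing:
            raise ValueError("example_new_flag requires sample_packing: true")
        return self


def test_round_trip():
    """Dumping and re-validating must reproduce an equal model."""
    cfg = ExampleConfig(sample_packing=True, example_new_flag=True)
    assert ExampleConfig.model_validate(cfg.model_dump()) == cfg


def test_rejects_bad_combo():
    """The validator must reject the flag without its prerequisite."""
    try:
        ExampleConfig(example_new_flag=True)
    except ValidationError:
        return
    raise AssertionError("validator accepted an invalid combination")
```

The doc-regeneration half of the gate is not shown; it would amount to running `docs/scripts/generate_config_docs.py` and checking for an unexpected diff.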

---

## 5. "Patch-retirement" check  

**Problem:** A dependency bump is a one-line change plus CI, but CI won't tell you which entries in `monkeypatch/`
and which integration shims are now redundant because upstream transformers/trl/peft caught up.

**Skill:** scan `monkeypatch/` + shims, flag candidate-removable patches after a bump.

**Validation gate:** for each flagged patch, confirm upstream now provides equivalent behavior (test passes
with the patch disabled). Note: largely subsumed by #1 for the Liger `elif` chain.
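A first-pass scan could narrow the candidates to patch files that import the package that was just bumped. A rough sketch, assuming patch sources are available as a name-to-source mapping; the function names are hypothetical, and the real gate would still rerun the relevant test with each flagged patch disabled.

```python
import ast

UPSTREAMS = ("transformers", "trl", "peft")


def upstream_imports(source: str) -> set[str]:
    """Return which tracked upstream packages a module imports (absolute imports only)."""
    roots: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            roots.add(node.module.split(".")[0])
    return roots & set(UPSTREAMS)


def candidates_after_bump(sources: dict[str, str], bumped: str) -> list[str]:
    """List patch files worth re-checking after `bumped` was upgraded."""
    return sorted(name for name, src in sources.items() if bumped in upstream_imports(src))
```

This only shortlists files. Whether a patch is truly redundant is decided by the validation gate, not by the import scan.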

---

## 6. Chat-template generator

**Problem:** `tests/prompt_strategies/` already has a large chat-template suite (`test_chat_templates*.py` —
thinking, tool-calls, mistral…). Adding a new template repeats an established pattern.

**Skill:** add a new chat template + auto-generate its rendered + tokenized snapshot test.

**Validation gate:** snapshot matches; tool-call / thinking / system-prompt variants render correctly.
Low-risk, high-acceptance.
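The snapshot half of the gate can be small. A minimal sketch using only the standard library; `check_snapshot` is a hypothetical helper, and rendering and tokenization are left to the existing test utilities.

```python
from pathlib import Path


def check_snapshot(snapshot_path, rendered: str, update: bool = False) -> bool:
    """Compare `rendered` with the stored snapshot; write it when asked or missing.

    Returns True when the snapshot was written, False when it already matched.
    Raises AssertionError on mismatch.
    """
    path = Path(snapshot_path)
    if update or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(rendered, encoding="utf-8")
        return True
    expected = path.read_text(encoding="utf-8")
    assert rendered == expected, f"snapshot mismatch for {path.name}"
    return False
```

Token IDs can go through the same helper by serializing them first, for example with `json.dumps(ids)`. Writing a missing snapshot on first run is a design choice; a stricter variant would fail instead and require an explicit `update=True`.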

---

## 7. Model verification harness 

**Problem:** Threads the maintainer's needle — the bespoke part of model support (attention/RoPE/mask
patches) can't be one skill, but *verifying* a model can.

**Skill:** given a model, scaffold tiny LoRA + full configs against a tiny variant, run preprocess + a few
train steps.

**Validation gate:** loss is sane, sample-packing behaves, Liger/CCE compat flags resolve correctly. A
reproducible smoke gate for model PRs.
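"Loss is sane" needs a concrete definition before it can gate anything. One possible heuristic, with thresholds that are illustrative assumptions rather than tuned values:

```python
import math


def loss_is_sane(losses, max_initial=20.0, max_ratio=1.5):
    """Heuristic smoke check over a handful of training-step losses.

    Assumed rules (not from the issue): every value is finite, the first
    step starts below `max_initial`, and no later step exceeds
    `max_ratio` times the first step.
    """
    if not losses:
        return False  # no steps ran
    if not all(math.isfinite(x) for x in losses):
        return False  # NaN or inf appeared
    if losses[0] > max_initial:
        return False  # implausibly high starting loss
    return max(losses) <= max_ratio * losses[0]
```

A few steps on a tiny model will not show convergence, so the check only rules out divergence and obviously broken setups.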

---

### ✔️ Solution

Was chatting with @NanoCode012 about whether any skills are worth adding, taking inspiration from Liger.

### ❓ Alternatives

Other dev/testing-oriented skills recur across PyTorch, SGLang, NeMo, and others. They are valid but would be of lesser value:

**Training-run diagnostic.** The single most common pattern across all four repos: NeMo `debug-training-logs`, SGLang `debug-distributed-hang` + `debug-cuda-crash`, PyTorch `distributed-triage` + `pt2-bug-basher`. Axolotl already has the source material (`docs/training_stability.qmd`, `docs/debugging.qmd`); the skill turns that prose into a guided "loss is NaN / OOM / hang / loss not decreasing → diagnose" workflow. Glue over existing docs + commands, no new logic.

**Issue → minimal-repro triage.** PyTorch `triaging-issues`/`scrub-issue`/`fix-issue`, Megatron `respond-to-issue`. Take a user's failing config and produce a minimal repro (tiny model + tiny dataset, preserving the SFT/DPO/GRPO/LoRA/multimodal path).


### 📝 Additional Context

_No response_

### Acknowledgements

- [x] My issue title is concise, descriptive, and in title casing.
- [x] I have searched the existing issues to make sure this feature has not been requested yet.
- [x] I have provided enough information for the maintainers to understand and evaluate this request.

Claude /skills exploration #3749
