Inspired by Liger's liger-kernel-dev Claude skill (commit).
The reusable pattern: analyze input → structured profile → generate known files → hard validation gate.
Underlying Logic: Good skill candidates are tasks that (a) touch the same handful of files every time and (b) have a
correctness check expressible as concrete commands or a generated test.
1. Liger upstream-sync
Problem: integrations/liger/plugin.py reconciles upstream MODEL_TYPE_TO_APPLY_LIGER_FN against a
hand-maintained override set (axolotl_override_liger_fn, line ~85) and a long elif model_config_type ==
chain of bespoke patches (qwen3_5, gemma4, jamba, deepseek_v2, llama4…). When Liger ships native support for
a type Axolotl still hand-patches, CI stays green while the wrong kernel set is applied — a silent
correctness/perf regression, not a version error. (Seen for real in the liger-0.8.0 bump: qwen3_5 /
qwen3_5_moe / gemma4_text override shadowing.)
Skill: diff installed MODEL_TYPE_TO_APPLY_LIGER_FN against the elif chain + override set; report
"upstream now natively supports X, Axolotl still hand-patches it" and "upstream changed the
apply_liger_kernel_to_X signature (stale swiglu/glu param check)."
Validation gate: for each covered model_config_type, assert which path fires (native vs override) and
that the expected modules (RMSNorm, SwiGLUMLP, FLCE forward) were actually swapped.
2. Prompt-strategy / dataset-format authoring
Problem: Most repetitive contributor task. type: → module via importlib in
prompt_strategies/__init__.py. Every new format touches the same 3 places: a module in
prompt_strategies/, a golden test in tests/prompt_strategies/, a doc page in docs/dataset-formats/.
Skill: scaffold the strategy module, register the type:, generate the doc page, and emit a golden
round-trip test.
Validation gate: axolotl preprocess --debug for label-masking inspection + a generated test asserting
exact token IDs and the masked-label span (the loss-masking bug reviewers can't eyeball).
3. Integration / plugin scaffold
Problem: BasePlugin exposes ~20 lifecycle hooks (pre_model_load, post_model_build,
get_input_args, get_trainer_cls, create_optimizer…). Contributors silently skip hooks.
Skill: scaffold integrations/<name>/{__init__.py, args.py, README.md}, wire get_input_args, add a
config example + load/parse smoke test.
Validation gate: plugin loads, args parse, lifecycle hooks no-op cleanly.
4. Config-field addition
Problem: Bounded but easy to under-test in a config-driven project. docs/config-reference is
auto-generated (docs/scripts/generate_config_docs.py), so the surface is: Pydantic schema in
utils/schemas/, validator, regen docs, test.
Skill: add the field + validator, regenerate docs, scaffold a schema test.
Validation gate: schema round-trips, validator rejects bad combos, docs regenerate clean.
5. "Patch-retirement" check
Problem: Bumps are one-line + CI, but CI won't tell you which entries in monkeypatch/ and integration
shims are now redundant because upstream transformers/trl/peft caught up.
Skill: scan monkeypatch/ + shims, flag candidate-removable patches after a bump.
Validation gate: for each flagged patch, confirm upstream now provides equivalent behavior (test passes
with the patch disabled). Note: largely subsumed by #1 for the Liger elif chain.
6. Chat-template generator
Problem: tests/prompt_strategies/ already has a large chat-template suite (test_chat_templates*.py —
thinking, tool-calls, mistral…). Adding a new template repeats an established pattern.
Skill: add a new chat template + auto-generate its rendered + tokenized snapshot test.
Validation gate: snapshot matches; tool-call / thinking / system-prompt variants render correctly.
Low-risk, high-acceptance.
7. Model verification harness
Problem: Threads the maintainer's needle — the bespoke part of model support (attention/RoPE/mask
patches) can't be one skill, but verifying a model can.
Skill: given a model, scaffold tiny LoRA + full configs against a tiny variant, run preprocess + a few
train steps.
Validation gate: loss is sane, sample-packing behaves, Liger/CCE compat flags resolve correctly. A
reproducible smoke gate for model PRs.
---**
✔️ Solution
Was chatting with @NanoCode012 about whether any skills are worth adding getting inspiration from Liger
❓ Alternatives
There are other dev/testing oriented skills consistent across pytorch/sglang/nemo and others that are valid but they would be lesser value:
Training-run diagnostic . The single most common pattern across all four repos: NeMo debug-training-logs, SGLang debug-distributed-hang + debug-cuda-crash, PyTorch distributed-triage + pt2-bug-basher. Axolotl already has the source material (docs/training_stability.qmd, docs/debugging.qmd); the skill turns that prose into a guided "loss is NaN / OOM / hang / loss not decreasing → diagnose" workflow. Glue over existing docs + commands, no new logic.
Issue → minimal-repro triage. PyTorch triaging-issues/scrub-issue/fix-issue, Megatron respond-to-issue. Take a user's failing config, produce a minimal repro (tiny model + tiny dataset, preserving the SFT/DPO/GRPO/LoRA/multimodal path).
📝 Additional Context
No response
Acknowledgements
Inspired by Liger's
liger-kernel-devClaude skill (commit).The reusable pattern: analyze input → structured profile → generate known files → hard validation gate.
Underlying Logic: Good skill candidates are tasks that (a) touch the same handful of files every time and (b) have a
correctness check expressible as concrete commands or a generated test.
1. Liger upstream-sync
Problem:
integrations/liger/plugin.pyreconciles upstreamMODEL_TYPE_TO_APPLY_LIGER_FNagainst ahand-maintained override set (
axolotl_override_liger_fn, line ~85) and a longelif model_config_type ==chain of bespoke patches (qwen3_5, gemma4, jamba, deepseek_v2, llama4…). When Liger ships native support for
a type Axolotl still hand-patches, CI stays green while the wrong kernel set is applied — a silent
correctness/perf regression, not a version error. (Seen for real in the liger-0.8.0 bump: qwen3_5 /
qwen3_5_moe / gemma4_text override shadowing.)
Skill: diff installed
MODEL_TYPE_TO_APPLY_LIGER_FNagainst theelifchain + override set; report"upstream now natively supports X, Axolotl still hand-patches it" and "upstream changed the
apply_liger_kernel_to_Xsignature (staleswiglu/gluparam check)."Validation gate: for each covered
model_config_type, assert which path fires (native vs override) andthat the expected modules (RMSNorm, SwiGLUMLP, FLCE forward) were actually swapped.
2. Prompt-strategy / dataset-format authoring
Problem: Most repetitive contributor task.
type:→ module via importlib inprompt_strategies/__init__.py. Every new format touches the same 3 places: a module inprompt_strategies/, a golden test intests/prompt_strategies/, a doc page indocs/dataset-formats/.Skill: scaffold the strategy module, register the
type:, generate the doc page, and emit a goldenround-trip test.
Validation gate:
axolotl preprocess --debugfor label-masking inspection + a generated test assertingexact token IDs and the masked-label span (the loss-masking bug reviewers can't eyeball).
3. Integration / plugin scaffold
Problem:
BasePluginexposes ~20 lifecycle hooks (pre_model_load,post_model_build,get_input_args,get_trainer_cls,create_optimizer…). Contributors silently skip hooks.Skill: scaffold
integrations/<name>/{__init__.py, args.py, README.md}, wireget_input_args, add aconfig example + load/parse smoke test.
Validation gate: plugin loads, args parse, lifecycle hooks no-op cleanly.
4. Config-field addition
Problem: Bounded but easy to under-test in a config-driven project.
docs/config-referenceisauto-generated (
docs/scripts/generate_config_docs.py), so the surface is: Pydantic schema inutils/schemas/, validator, regen docs, test.Skill: add the field + validator, regenerate docs, scaffold a schema test.
Validation gate: schema round-trips, validator rejects bad combos, docs regenerate clean.
5. "Patch-retirement" check
Problem: Bumps are one-line + CI, but CI won't tell you which entries in
monkeypatch/and integrationshims are now redundant because upstream transformers/trl/peft caught up.
Skill: scan
monkeypatch/+ shims, flag candidate-removable patches after a bump.Validation gate: for each flagged patch, confirm upstream now provides equivalent behavior (test passes
with the patch disabled). Note: largely subsumed by #1 for the Liger
elifchain.6. Chat-template generator
Problem:
tests/prompt_strategies/already has a large chat-template suite (test_chat_templates*.py—thinking, tool-calls, mistral…). Adding a new template repeats an established pattern.
Skill: add a new chat template + auto-generate its rendered + tokenized snapshot test.
Validation gate: snapshot matches; tool-call / thinking / system-prompt variants render correctly.
Low-risk, high-acceptance.
7. Model verification harness
Problem: Threads the maintainer's needle — the bespoke part of model support (attention/RoPE/mask
patches) can't be one skill, but verifying a model can.
Skill: given a model, scaffold tiny LoRA + full configs against a tiny variant, run preprocess + a few
train steps.
Validation gate: loss is sane, sample-packing behaves, Liger/CCE compat flags resolve correctly. A
reproducible smoke gate for model PRs.
---**
✔️ Solution
Was chatting with @NanoCode012 about whether any skills are worth adding getting inspiration from Liger
❓ Alternatives
There are other dev/testing oriented skills consistent across pytorch/sglang/nemo and others that are valid but they would be lesser value:
Training-run diagnostic . The single most common pattern across all four repos: NeMo debug-training-logs, SGLang debug-distributed-hang + debug-cuda-crash, PyTorch distributed-triage + pt2-bug-basher. Axolotl already has the source material (docs/training_stability.qmd, docs/debugging.qmd); the skill turns that prose into a guided "loss is NaN / OOM / hang / loss not decreasing → diagnose" workflow. Glue over existing docs + commands, no new logic.
Issue → minimal-repro triage. PyTorch triaging-issues/scrub-issue/fix-issue, Megatron respond-to-issue. Take a user's failing config, produce a minimal repro (tiny model + tiny dataset, preserving the SFT/DPO/GRPO/LoRA/multimodal path).
📝 Additional Context
No response
Acknowledgements