Week 3: Supervised Fine-Tuning on the Hub

Fine-tune and share models on the Hub. Take a base model, train it on your data, and publish the result for the community to use.

Why This Matters

Fine-tuning is how we adapt foundation models to specific tasks. By sharing fine-tuned models—along with your training methodology—you're giving the community ready-to-use solutions and reproducible recipes they can learn from.

The Skill

Use hugging-face-model-trainer/ for this quest. Key capabilities:

SFT (Supervised Fine-Tuning) — Standard instruction tuning
DPO (Direct Preference Optimization) — Alignment from preference data
GRPO (Group Relative Policy Optimization) — Online RL training
Cloud GPU training on HF Jobs—no local setup required
Trackio integration for real-time monitoring
GGUF conversion for local deployment

Your coding agent uses hf_jobs() to submit training scripts directly to HF infrastructure.

XP Tiers

We'll announce the XP tiers for this quest soon.

Resources

SKILL.md — Full skill documentation
SFT Example — Production SFT template
DPO Example — Production DPO template
GRPO Example — Production GRPO template
Training Methods — Method selection guide
Hardware Guide — GPU selection

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Week 3: Supervised Fine-Tuning on the Hub

Why This Matters

The Skill

XP Tiers

Resources

FilesExpand file tree

04_sft-finetune-hub.md

Latest commit

History

04_sft-finetune-hub.md

File metadata and controls

Week 3: Supervised Fine-Tuning on the Hub

Why This Matters

The Skill

XP Tiers

Resources