Add inference-time-distillation-kl-clipping quickstart #3327

Open
ayirpown wants to merge 2 commits into Snowflake-Labs:master from ayirpown:inference-time-distillation-kl-clipping

@ayirpown
Contributor

This quickstart walks you through a single self-contained Snowflake Notebook that trains a small student LLM with two on-policy distillation methods, evaluates them rigorously, and exposes the gradient-geometry intuition behind why one of them needs a stability trick.

You will run, on a Snowflake GPU compute pool:

- **OPD (On-Policy Distillation)**: the student samples its own rollouts; a frozen Mixtral-8x7B teacher provides a per-token reverse-KL signal on those rollouts.
- **OPSD (On-Policy Self-Distillation)**: the same student plus a *self*-teacher, a frozen copy of the student that is given the gold answer in its system prompt (the "privileged information" trick).
- **KL clipping**: the small change that stops OPSD from collapsing.
- **Unified meta-algorithm**: one `unified_step` function that reproduces SFT-RS, RL-outcome, OPD, and OPSD by toggling three knobs (`alpha`, `lambda`, `pi_T`).
- **A calibrated PRM judge over the Cortex REST API**: a process reward model with deterministic step segmentation, a hash-keyed cache, and Spearman/AUROC calibration against verifier ground truth.
- **A statistically honest evaluation suite**: Wilson 95% confidence intervals, McNemar paired tests, bootstrap CIs on the gap, pass@k, and difficulty bucketing.
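
For reviewers unfamiliar with the per-token reverse-KL signal and the clipping idea mentioned in the list above, here is a minimal pure-Python sketch. It is not taken from the notebook: the function names, the `max_kl` threshold, the full-vocabulary form of the KL, and the choice to clip each token's KL with a simple `min` are all assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) at a single token position."""
    log_p = log_softmax(student_logits)   # student distribution
    log_q = log_softmax(teacher_logits)   # teacher distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def clipped_reverse_kl_loss(student_seq, teacher_seq, max_kl=1.0):
    """Mean per-token reverse KL, with each token's KL capped at max_kl.

    max_kl is a hypothetical hyperparameter, not a value from the quickstart.
    Returns (loss, unclipped per-token KL values).
    """
    per_token = [reverse_kl(s, t) for s, t in zip(student_seq, teacher_seq)]
    clipped = [min(kl, max_kl) for kl in per_token]
    return sum(clipped) / len(clipped), per_token

# Toy rollout: two positions, vocabulary of three tokens.
student = [[2.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
teacher = [[2.0, 0.0, 0.0], [0.0, 5.0, 0.0]]  # disagrees sharply at position 2
loss, per_token = clipped_reverse_kl_loss(student, teacher, max_kl=1.0)
```

In this toy case the second position has a large KL, so the cap limits how much a single token can contribute to the mean. In an autodiff framework, a hard `min` of this kind would also pass no gradient through the capped tokens; whether the notebook clips this way or uses a softer variant is not stated in this description.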

@github-actions
Contributor

👋 Helpful Information

💡 Non-Markdown Files in Quickstart Root

Only markdown files (.md) are typically placed in the root of quickstart folders. Other file types should be placed in the /assets folder for better organization.

  • site/sfguides/src/inference-time-distillation-kl-clipping/inference_time_distillation_kl_clipping.ipynb

These are informational messages and will not block your PR from being merged.

Collaborator

@sfc-gh-ridasafdar left a comment

@ayirpown all files other than the quickstart .md file need to be in an /assets subfolder.

@github-actions
Contributor

👋 Helpful Information

💡 Non-Image Files in Assets Folder

Non-image files found in /assets folders will NOT be uploaded to snowflake.com. If you are referencing them in your guide, you should link to them directly (e.g. using permanent GitHub links).

  • site/sfguides/src/inference-time-distillation-kl-clipping/assets/inference_time_distillation_kl_clipping.ipynb

These are informational messages and will not block your PR from being merged.
