Add inference-time-distillation-kl-clipping quickstart#3327
Open
ayirpown wants to merge 2 commits into
Contributor
👋 Helpful Information
💡 Non-Markdown Files in Quickstart Root
Only markdown files (.md) are typically placed in the root of quickstart folders. Other file types should be placed in the /assets folder for better organization.
These are informational messages and will not block your PR from being merged.
sfc-gh-ridasafdar
requested changes
May 29, 2026
Collaborator
sfc-gh-ridasafdar
left a comment
@ayirpown All non-quickstart .md files need to be in an /assets subfolder.
Contributor
👋 Helpful Information
💡 Non-Image Files in Assets Folder
Non-image files found in /assets folders will NOT be uploaded to snowflake.com. If you are referencing them in your guide, you should link to them directly (e.g. using permanent GitHub links).
These are informational messages and will not block your PR from being merged.
This quickstart walks you through a single self-contained Snowflake Notebook that trains a small student LLM via two on-policy distillation methods, evaluates them rigorously,
and exposes the gradient-geometry intuition behind why one of them needs a stability trick.
information" trick).
- KL clipping: the small change that stops OPSD from collapsing.
- Unified meta-algorithm: one `unified_step` function that reproduces SFT-RS, RL-outcome, OPD, and OPSD by toggling three knobs (`alpha`, `lambda`, `pi_T`).
- A calibrated PRM judge over the Cortex REST API: a process reward model with deterministic step segmentation, a hash-keyed cache, and Spearman/AUROC calibration against verifier ground truth.
- A statistically honest evaluation suite: Wilson 95% confidence intervals, McNemar paired tests, bootstrap CIs on the gap, pass@k, and difficulty bucketing.
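
Two of the evaluation statistics named in the last bullet are short enough to sketch with the standard library alone. The block below is a minimal illustration, not the notebook's actual code. The function names `wilson_interval` and `mcnemar_exact` are hypothetical, and using the exact (binomial) form of McNemar's test rather than the chi-square approximation is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%).

    Hypothetical helper: the name and signature are illustrative only.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: problems model A solved and model B failed
    c: problems model B solved and model A failed
    Under the null hypothesis each discordant pair is a fair coin flip,
    so the p-value is a two-sided Binomial(b + c, 0.5) tail probability.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    lo, hi = wilson_interval(50, 100)
    print(f"accuracy 50/100, Wilson 95% CI: [{lo:.4f}, {hi:.4f}]")
    print(f"McNemar exact p-value for b=10, c=0: {mcnemar_exact(10, 0):.6f}")
```

The Wilson interval stays inside [0, 1] and behaves sensibly at small `n`, which is presumably why a suite like this would choose it over the normal approximation. McNemar's test uses only the problems on which the two models disagree, which suits a paired comparison on a shared test set.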