Hello! I'd like to suggest adding CRMA Fine-Tuner to the LLM Fine-tuning section.
What it is:
A free fine-tuning tool (HuggingFace Space, no GPU required) for TinyLlama, Gemma, and Mistral that includes a built-in gradient stability layer on top of standard QLoRA/LoRA training.
Technical contribution:
We identified and documented a reproducible gradient-norm spike in QLoRA at step ~44 on Mistral-7B (gradient norm 15.28 vs. a typical ~1.0), caused by quantization error accumulating in the backward pass. The tool addresses this with:
- Adaptive gradient clipping via a rolling z-score over gradient-norm history (vs. a static `max_grad_norm`)
- Spectral norm constraint on LoRA weight updates
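To make the first item concrete, here is a minimal sketch of z-score-based adaptive clipping, assuming the z-score is computed over a rolling window of recent per-step gradient norms. The function name, window/threshold defaults, and the rescale-to-mean policy are illustrative assumptions, not the tool's actual API:

```python
def adaptive_clip_scale(norm_history, current_norm,
                        z_threshold=3.0, min_history=10):
    """Return a multiplicative scale for this step's gradients.

    Sketch only: if the current gradient norm is a z-score outlier
    relative to the rolling history, rescale gradients back down to the
    rolling mean; otherwise leave them untouched (scale 1.0).
    """
    if len(norm_history) < min_history:
        return 1.0  # not enough history yet to judge outliers
    mean = sum(norm_history) / len(norm_history)
    var = sum((g - mean) ** 2 for g in norm_history) / len(norm_history)
    std = var ** 0.5
    if std > 0 and (current_norm - mean) / std > z_threshold:
        return mean / current_norm  # pull the spike back to the mean
    return 1.0
```

In a training loop this scale would multiply every parameter's `.grad` before `optimizer.step()`, so the clipping threshold adapts to the run's own norm statistics instead of a fixed `max_grad_norm`.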
Ablation results vs. baseline (Mistral-7B, n=5 runs):
- QLoRA baseline: peak gradient norm 15.28, 20.5% output-quality degradation
- With adaptive clipping + spectral norm: peak gradient norm 1.9, 1.1% degradation, spikes in 0/5 runs
Links:
- Tool: https://huggingface.co/spaces/Fourwheels2512/crma-fine-tuner
- Technical writeup: https://dev.to/fourwheels2512/why-qlora-produces-a-gradient-norm-spike-at-step-44-on-mistral-7b-and-how-to-fix-it-141h
Suggested placement: Fine-tuning section (alongside Axolotl, Unsloth, LLaMA-Factory)
Happy to open a PR if that's preferred!