Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Updated Jun 1, 2026 - Python
Official Implementation of VideoDPO
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
ZYN: Zero-Shot Reward Models with Yes-No Questions
Synthetic data for fine-tuning LLMs
Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)
GAN-style self-improvement loop for any text artifact: mutate, grade with a SEPARATE model, keep only verified wins (pairwise-judged), revert the rest. The git history is the improvement log.
Distilled Self-Critique refines the outputs of an LLM using only synthetic data
RewardAnything: Generalizable Principle-Following Reward Models
Production-ready RLAIF trading system with multi-agent Claude AI that learns from market outcomes. Features 60+ indicators, foundation models, and serverless deployment.
Code for the paper "Improving Socratic Question Generation using Data Augmentation and Preference Optimization"
RankPO: Rank Preference Optimization
🧠 Enhance AI conversations with Cognio, a persistent memory server that retains context and enables meaningful semantic search across sessions.
(Stepwise controlled Understanding for Trajectories) -- "agent that learns to hunt"
RLAF: Reinforcement Learning from Agentic Feedback - A unified framework for training AI agents with multi-perspective critic ensembles
🤖 Train AI agents effectively with RLAF, utilizing multi-perspective critic ensembles for richer feedback and improved performance in reinforcement learning.
Production-grade CLI coding agent built from scratch — ReAct loop, 11 tools, RLAIF scoring via Grok 4, cross-session memory, JSONL tracing. 7.5× faster than LangChain baseline.
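One entry in this list describes a self-improvement loop: mutate a text artifact, grade the result with a separate model, keep only pairwise-judged wins, and revert the rest. The sketch below illustrates that general pattern only; it is not taken from any listed repository. The names `improve`, `toy_mutate`, and `toy_judge` are hypothetical, and the toy judge (shorter text wins) is a stand-in assumption for a call to a separate grading model.

```python
from typing import Callable, List, Tuple


def improve(
    text: str,
    mutate: Callable[[str], str],
    judge: Callable[[str, str], bool],
    rounds: int = 10,
) -> Tuple[str, List[str]]:
    """Run a mutate / judge / keep-or-revert loop.

    `judge(current, candidate)` returns True when the candidate wins the
    pairwise comparison. A losing candidate is discarded, which is the
    "revert" step. `history` records every accepted version, playing the
    role of an improvement log.
    """
    best = text
    history = [text]
    for _ in range(rounds):
        candidate = mutate(best)
        if judge(best, candidate):
            best = candidate
            history.append(candidate)
        # Otherwise keep `best` unchanged (revert the mutation).
    return best, history


def toy_mutate(text: str) -> str:
    # Hypothetical mutation: collapse the first run of two spaces into one.
    return text.replace("  ", " ", 1)


def toy_judge(current: str, candidate: str) -> bool:
    # Stand-in for a separate grading model: prefer strictly shorter text.
    return len(candidate) < len(current)


best, history = improve("a  b   c", toy_mutate, toy_judge, rounds=5)
print(best)
print(len(history))
```

In a real setup the judge would be a different model from the one proposing mutations, so the generator cannot grade its own work.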