Agent Traces Pipeline #565

Draft: wants to merge 37 commits into base: main

@baptistecolle baptistecolle commented Mar 31, 2025

What does this PR do?

This PR adds a pipeline to train an agent on agentic traces generated with smolagents.

How does it work?

Step 1) Generate traces with R1 on the open-r1/codeforces-cots dataset.
Step 2) Filter the traces, keeping only those that lead to solutions passing the Codeforces test cases.
Step 3) Train Qwen2.5-1.5B with SFT on the traces from R1.
Step 4) Evaluate how much the model trained on the traces improves over the base model.
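
The filtering in Step 2 could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the trace schema (`"solution"`, `"tests"`), the function names, and the whitespace-insensitive output comparison are all assumptions made for the example, and it assumes the extracted solutions are Python programs that read stdin and write stdout.

```python
import subprocess
import sys
from typing import Callable, Optional

# Hypothetical trace schema (an assumption, not taken from the PR):
#   "solution": source code extracted from the agent's final answer
#   "tests":    list of (stdin, expected_stdout) pairs
Runner = Callable[[str, str], Optional[str]]


def run_python(code: str, stdin: str, timeout: float = 5.0) -> Optional[str]:
    """Run `code` in a fresh interpreter; return stdout, or None on failure."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            input=stdin,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout if result.returncode == 0 else None


def passes_all_tests(trace: dict, run: Runner = run_python) -> bool:
    """True only if the trace has tests and its solution passes every one."""
    if not trace["tests"]:
        return False  # no tests means nothing was verified, so reject
    for stdin, expected in trace["tests"]:
        output = run(trace["solution"], stdin)
        # Compare token by token so trailing whitespace does not matter
        # (an assumed checker; real problems may need custom checkers).
        if output is None or output.split() != expected.split():
            return False
    return True


def filter_traces(traces: list, run: Runner = run_python) -> list:
    """Keep only the traces whose solution passes all of its test cases."""
    return [trace for trace in traces if passes_all_tests(trace, run)]
```

Passing the runner in as a parameter keeps the filter independent of how solutions are executed, so a sandboxed executor can replace `run_python` without touching the filtering logic.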

The dataset of traces and the trained model are uploaded to the hub at:

This is currently a WIP: some parts of the pipeline are not behaving correctly yet. Once it works, it could lead to a nice blog post.
