Agent Traces Pipeline #565

Draft
baptistecolle wants to merge 37 commits into main from agent-traces-v2

@baptistecolle baptistecolle commented Mar 31, 2025

What does this PR do?

This PR adds a pipeline for training an agent on agentic traces from smolagents.

How does it work?

Step 1) Generate traces with R1 on the open-r1/codeforces-cots dataset.
Step 2) Filter the traces, keeping only those that lead to solutions passing the Codeforces test cases.
Step 3) Train Qwen2.5-1.5B with SFT on the filtered R1 traces.
Step 4) Evaluate how much the model trained on the traces improves over the base model.
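
As a rough illustration of the filtering in Step 2, here is a minimal sketch and not the code in this PR. It assumes each trace ends in a standalone Python program and carries its test cases as (stdin, expected stdout) pairs. The `Trace` record, its field names, and the helpers `passes_all_tests`, `filter_traces`, and `pass_rate` are all hypothetical; the real dataset schema and the actual Codeforces checker may differ. It also runs solutions in a plain subprocess, so a real pipeline would want a sandbox.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical trace record; the real dataset schema may differ."""
    problem_id: str
    solution_code: str  # assumed: final Python program extracted from the trace
    tests: list[tuple[str, str]] = field(default_factory=list)  # (stdin, expected stdout)


def passes_all_tests(trace: Trace, timeout_s: float = 5.0) -> bool:
    """Run the solution on every test case and compare its stdout to the expected output."""
    for stdin_text, expected in trace.tests:
        try:
            result = subprocess.run(
                [sys.executable, "-c", trace.solution_code],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False  # treat a timeout as a failed test
        if result.returncode != 0:
            return False  # runtime error or non-zero exit
        # Compare token by token so trailing whitespace does not cause a mismatch.
        if result.stdout.split() != expected.split():
            return False
    return True


def filter_traces(traces: list[Trace]) -> list[Trace]:
    """Keep only the traces whose solution passes all of its test cases."""
    return [t for t in traces if passes_all_tests(t)]


def pass_rate(traces: list[Trace]) -> float:
    """Fraction of traces whose solution passes all tests (0.0 for an empty list)."""
    if not traces:
        return 0.0
    return sum(passes_all_tests(t) for t in traces) / len(traces)
```

Under the same assumptions, `pass_rate` could be computed once on the base model's outputs and once on the fine-tuned model's outputs to get the comparison in Step 4. Note that a trace with no test cases passes trivially in this sketch, so such traces should probably be dropped beforehand.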

The dataset of traces and the trained model are uploaded to the hub at:

This is currently a WIP: some parts of the pipeline are not behaving correctly yet. Once they are fixed, this could lead to a nice blog post.

2 participants