Agent Traces Pipeline by baptistecolle · Pull Request #565 · huggingface/open-r1

baptistecolle · 2025-03-31T11:39:00Z

What does this PR do?

This PR adds a pipeline to train an agent on agentic traces from smolagents.

How does it work?

Step 1) We generate traces from R1 on the open-r1/codeforces-cots dataset.
Step 2) The traces are then filtered to keep only those that lead to solutions passing the Codeforces test cases.
Step 3) Train Qwen2.5-1.5B with SFT on the traces from R1.
Step 4) Evaluate how much training on the new traces improves performance over the base model.
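
The filtering in Step 2 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `IOCase` type, the `problem_id` and `final_code` field names, and the whitespace-insensitive output comparison are all assumptions made for the example. It also runs candidate code in a plain subprocess, which is not a sandbox; a real pipeline would likely want stronger isolation.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class IOCase:
    """One stdin/stdout test case (hypothetical structure)."""
    stdin: str
    expected_stdout: str


def passes_all_tests(solution_code: str, cases: list[IOCase], timeout_s: float = 5.0) -> bool:
    """Return True only if the solution prints the expected output for every case."""
    for case in cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", solution_code],
                input=case.stdin,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False  # treat a timeout as a failed test
        if result.returncode != 0:
            return False  # runtime error or non-zero exit
        # Compare token by token so trailing whitespace does not matter.
        if result.stdout.split() != case.expected_stdout.split():
            return False
    return True


def filter_traces(traces: list[dict], cases_by_problem: dict[str, list[IOCase]]) -> list[dict]:
    """Keep only traces whose final solution passes all test cases of its problem."""
    kept = []
    for trace in traces:
        # "problem_id" and "final_code" are assumed field names.
        cases = cases_by_problem.get(trace["problem_id"], [])
        if cases and passes_all_tests(trace["final_code"], cases):
            kept.append(trace)
    return kept
```

Traces for problems without any known test cases are dropped here, on the assumption that an unverified solution should not end up in the SFT data.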

The dataset of traces and the trained model are uploaded to the hub at:

https://huggingface.co/datasets/baptistecolle/codeforces-agentic-generations
https://huggingface.co/baptistecolle/Qwen2.5-1.5B-Open-R1-Distill-Agentic-Trace
(TODO: move these to the open-r1 org)

This is currently a WIP: some parts of the pipeline are not yet behaving correctly. This could then lead to a nice blog post.

aymeric-roucher added 30 commits March 31, 2025 11:39

Start agent traces

e7df036

Working local version with o1

6d0963e

Update api addr

a6f5a15

Increase concurrent requests

38bfa93

Update sbatch params

7d9fc6e

Add conda activation

7a1fb98

Use local model

1a7becf

128 concurrent

f35337e

Log

28bc464

Add conda init

319ae52

Fix slurm script

69d55f6

Add await

c8aa2c4

Try fixing async func

6df6161

Add stop sequences

b402450

Add port

b2996c1

Make synchronous

f6f138b

Small adapts to script

23c2128

More detailed error logging

52ac4e2

Even more detailed request error logging

0adc082

Reduce context length

884c8e9

Add token counting

64ae551

Fix message roles an add token counting

2e7d1da

Add dummy completion

7bcb96e

Test

28afbef

Running with gpt-4o

5ed2005

Update timeouts

ce7d8bd

Adjust

6a9db1b

Flatten messages

e245aa0

Prompt more around testing the function

b6de9cb

Improve explanations in prompt

9cdf0d9

aymeric-roucher and others added 2 commits March 31, 2025 11:39

Also store final outputs

ef3f888

wip(generate + eda): working generation + add initial eda

91e4dc1

baptistecolle force-pushed the agent-traces-v2 branch from f798a19 to 91e4dc1 Compare March 31, 2025 11:39

baptistecolle added 5 commits March 31, 2025 12:48

feat(eda): uploaded dataset for training

5d7205d

feat(train): added training recipe for agentic traces

c1cea15

fix(deps): fix smolagent dep

8cc3983

fix(deps): fix smolagent dep

3b021de

fix: remove uncessary changes

2fbac03

baptistecolle wants to merge 37 commits into main from agent-traces-v2


2 participants
