Commit dbbbded
feat(slime): make Sample.metadata the agent payload verbatim
Collapses _sample_to_payload to return a shallow copy of
Sample.metadata. Previously it synthesized a hybrid payload shape
(sample.prompt -> payload["prompt"], sample.label -> payload["answer"],
sample.metadata nested under payload["metadata"], plus a fall-through
copy of Sample fields), which locked the slime backend into the math
agent's shape and forced other agents (appworld, migration,
officebench) into workarounds.
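
As a rough illustration of the collapsed behavior described above, here is a minimal sketch. The `Sample` dataclass is a hypothetical stand-in that carries only the fields named in this message, not slime's actual class, and treating missing metadata as an empty dict is an assumption made to match the "empty payload" note for legacy rows.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Sample:
    """Hypothetical stand-in for slime's Sample (illustration only)."""
    prompt: Any = None
    label: Any = None
    metadata: Optional[Dict[str, Any]] = None


def _sample_to_payload(sample: Sample) -> Dict[str, Any]:
    # Shallow copy: the payload's top-level keys are independent of
    # sample.metadata, but nested values are still shared objects.
    # Assumption: missing metadata is treated as an empty dict.
    return dict(sample.metadata or {})


# A row in the new shape: metadata carries everything the agent needs.
math_sample = Sample(
    prompt="What is 2 + 3?",
    metadata={"prompt": "What is 2 + 3?", "answer": "5"},
)
payload = _sample_to_payload(math_sample)

# A legacy row with only prompt and label has no metadata to forward.
legacy_sample = Sample(prompt="What is 2 + 3?", label="5")
legacy_payload = _sample_to_payload(legacy_sample)
```

Under these assumptions `payload` equals the metadata dict but is a distinct object, and `legacy_payload` is empty because neither `prompt` nor `label` is read.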
After this change, the JSONL row's metadata dict is the agent payload
exactly, so each agent declares whatever payload shape it wants by
choosing what keys to put in metadata. The JSONL top-level prompt
field still drives slime's tokenizer and length filter.
Breaking change for existing math JSONLs: rows using {prompt, label}
now produce an empty payload. Regenerate with the updated SETUP.md
data-prep snippet, which emits {prompt, metadata: {prompt, answer}}.
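
For existing files, a one-off conversion along these lines may be an alternative to regenerating. This is a sketch only and is not the SETUP.md snippet: the helper names `migrate_row` and `migrate_jsonl` are hypothetical, and it assumes the agent-facing prompt is the same value as the top-level prompt.

```python
import json
from typing import Any, Dict


def migrate_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a legacy {prompt, label} row to {prompt, metadata: {prompt, answer}}."""
    existing = row.get("metadata")
    if isinstance(existing, dict) and existing:
        return row  # already carries a non-empty metadata payload
    return {
        "prompt": row["prompt"],  # still read by the tokenizer and length filter
        "metadata": {"prompt": row["prompt"], "answer": row["label"]},
    }


def migrate_jsonl(src_path: str, dst_path: str) -> int:
    """Rewrite a JSONL file row by row; returns the number of rows written."""
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            new_row = migrate_row(json.loads(line))
            dst.write(json.dumps(new_row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Rows that already have a non-empty `metadata` dict pass through unchanged, so running the conversion twice should be harmless.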
Also drops --label-key from train.sh (nothing reads sample.label
under the new rule).
Verified end-to-end on Qwen2.5-3B-Instruct + GSM8K with NUM_ROLLOUT=10:
raw_reward climbed 0.27 -> 0.63, train/loss and grad_norm move as
expected, no rollout failures.
Plan: docs/roadmap/committed/slime-data-contract.md (committed on
docs/core-api-rename-roadmap in PR #59).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 53eaafe commit dbbbded
5 files changed
Lines changed: 147 additions & 48 deletions
File tree
- src/agentcore_rl_toolkit/backends/slime
- examples/math_agent
- integration
- tests
[Diff hunk: original lines 112-122, new lines 112-133; 12 additions, 1 deletion; body not captured]
Lines changed: 0 additions & 1 deletion
[Diff hunk: original lines 80-86, new lines 80-85; original line 83 deleted; body not captured]
Lines changed: 13 additions & 45 deletions
[Diff hunk: original lines 209-260, new lines 209-228; body not captured]
[Diff hunk: new lines 1-66, all additions; body not captured]