Commit 1b48219

lc5211 authored and The tunix Authors committed
[Tunix] Initialize policy version from global steps.
PiperOrigin-RevId: 877454612
1 parent 6a9ef6f commit 1b48219

1 file changed: +1 -1 lines changed

tunix/rl/experimental/agentic_rl_learner.py

Lines changed: 1 addition & 1 deletion
@@ -212,7 +212,7 @@ def __init__(

     self.chat_parser = chat_parser
     self.tokenizer = rl_cluster.tokenizer
-    self.policy_version = 0
+    self.policy_version = self.rl_cluster.global_steps
     self._rollout_sync_lock = agentic_utils.RolloutSyncLock()
     self._full_batch_size = 0

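
The effect of this one-line change can be sketched in isolation. The snippet below is a minimal illustration, not the Tunix implementation: `StubCluster` and `StubLearner` are hypothetical stand-ins, and the assumption that `global_steps` holds a step count restored from a checkpoint is inferred from the commit title rather than confirmed by the diff.

```python
from dataclasses import dataclass


@dataclass
class StubCluster:
  """Hypothetical stand-in for the RL cluster; only `global_steps` is modeled."""
  global_steps: int = 0


class StubLearner:
  """Hypothetical stand-in for the learner; not the Tunix class."""

  def __init__(self, rl_cluster: StubCluster, init_from_global_steps: bool):
    self.rl_cluster = rl_cluster
    # Before the commit: always start at 0.
    # After the commit: start at the cluster's current global step.
    self.policy_version = (
        self.rl_cluster.global_steps if init_from_global_steps else 0
    )


# Assumed scenario: the cluster was restored from a checkpoint at step 100.
restored = StubCluster(global_steps=100)

before = StubLearner(restored, init_from_global_steps=False)
after = StubLearner(restored, init_from_global_steps=True)

print(before.policy_version)  # 0
print(after.policy_version)   # 100
```

Under that assumption, a fresh run is unaffected (both variants start at 0), while a resumed run starts with a policy version that matches the restored step instead of resetting to 0.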
