Commit dfcf59e

Update actor.py
1 parent 102349e commit dfcf59e

1 file changed: +1 -1 lines changed

slime/backends/fsdp_utils/actor.py

Lines changed: 1 addition & 1 deletion
@@ -486,7 +486,7 @@ def train(self, rollout_id: int, rollout_data_ref: Box) -> None:
 pg_clipfrac = sum_of_sample_mean(pg_clipfrac, response_lengths, loss_masks)
 ppo_kl = sum_of_sample_mean(ppo_kl.abs(), response_lengths, loss_masks)
 
-train_rollout_logprob_diff = old_log_probs - rollout_log_probs
+train_rollout_logprob_diff = (old_log_probs - rollout_log_probs).abs()
 train_rollout_logprob_diff = sum_of_sample_mean(
     train_rollout_logprob_diff, response_lengths, loss_masks
 ).detach()
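
The commit has no description, so the following is only a sketch of the likely motivation, under stated assumptions. When signed per-token differences are averaged, positive and negative mismatches can cancel and the reported value can sit near zero even when the training and rollout log-probs disagree on every token. Taking `.abs()` first reports the magnitude of the mismatch instead, matching how `ppo_kl` is handled two lines earlier. The helper `masked_sample_mean_sum` below is a hypothetical pure-Python stand-in for `sum_of_sample_mean`, whose implementation is not shown in this diff, and the numbers are made up for illustration.

```python
# Hypothetical stand-in for sum_of_sample_mean: the real helper is not shown
# in this diff. Assumed behavior: take the masked mean of each sample's
# per-token values, then sum those means over samples.
def masked_sample_mean_sum(values, masks):
    total = 0.0
    for sample_values, sample_mask in zip(values, masks):
        kept = [v for v, m in zip(sample_values, sample_mask) if m]
        if kept:
            total += sum(kept) / len(kept)
    return total


# Made-up per-token values of old_log_probs - rollout_log_probs for two
# responses. Every token disagrees by 0.5, in alternating directions.
diffs = [[0.5, -0.5, 0.5, -0.5], [-0.5, 0.5]]
masks = [[1, 1, 1, 1], [1, 1]]

# Before the commit: signed differences are averaged and cancel out.
signed = masked_sample_mean_sum(diffs, masks)

# After the commit: absolute differences are averaged, so magnitudes add up.
absolute = masked_sample_mean_sum([[abs(v) for v in row] for row in diffs], masks)

print(signed)    # 0.0
print(absolute)  # 1.0
```

In this toy case the signed metric reads 0.0 although no token matches, while the absolute metric reads 1.0 (a mean mismatch of 0.5 per sample, summed over two samples).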
