
Commit 2759cfb

fix(exploration): typo self.ago -> self.algo in sample_func_logits
In MightyExplorationPolicy.sample_func_logits, the sac flag passed to sample_nondeterministic_logprobs was computed from self.ago instead of self.algo. Because self.ago does not exist, the comparison could never evaluate to True, so the tanh-squash correction was never applied for SAC in this code path.
1 parent 29ae632 commit 2759cfb

1 file changed

Lines changed: 1 addition & 1 deletion


mighty/mighty_exploration/mighty_exploration_policy.py

@@ -111,7 +111,7 @@ def sample_func_logits(self, state_array):
         elif isinstance(out, tuple) and len(out) == 4:
             action = out[0]  # [batch, action_dim]
             log_prob = sample_nondeterministic_logprobs(
-                z=out[1], mean=out[2], log_std=out[3], sac=self.ago == "sac"
+                z=out[1], mean=out[2], log_std=out[3], sac=self.algo == "sac"
             )
             return action.detach().cpu().numpy(), log_prob
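
For context, here is a minimal sketch of what a tanh-squash correction typically does to a Gaussian log-probability. It is not the repository's sample_nondeterministic_logprobs: the function names gaussian_logprob and squashed_logprob are hypothetical, NumPy stands in for whatever tensor library the project uses, and it assumes the action is a = tanh(z) with z drawn from a diagonal Gaussian. The sac argument only mirrors the flag seen in the diff.

```python
import numpy as np


def gaussian_logprob(z, mean, log_std):
    """Log-density of z under a diagonal Gaussian, summed over action dims."""
    std = np.exp(log_std)
    per_dim = -0.5 * ((z - mean) / std) ** 2 - log_std - 0.5 * np.log(2.0 * np.pi)
    return per_dim.sum(axis=-1)


def squashed_logprob(z, mean, log_std, sac=False):
    """Hypothetical stand-in: optionally apply the tanh change-of-variables term."""
    log_prob = gaussian_logprob(z, mean, log_std)
    if sac:
        # For a = tanh(z): log pi(a) = log N(z) - sum_i log(1 - tanh(z_i)^2).
        # Numerically stable identity:
        #   log(1 - tanh(z)^2) = 2 * (log 2 - z - softplus(-2z))
        log_det = 2.0 * (np.log(2.0) - z - np.logaddexp(0.0, -2.0 * z))
        log_prob = log_prob - log_det.sum(axis=-1)
    return log_prob


# One batch element with two action dimensions.
z = np.array([[0.5, -1.0]])
mean = np.zeros_like(z)
log_std = np.zeros_like(z)

plain = squashed_logprob(z, mean, log_std, sac=False)
corrected = squashed_logprob(z, mean, log_std, sac=True)
```

Under these assumptions, a flag that is never True leaves the log-probability at its unsquashed Gaussian value (plain above), which differs from the corrected value whenever z is nonzero.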
