Commit ec3778b
fix(exploration): typo self.ago -> self.algo in sample_func_logits
In MightyExplorationPolicy.sample_func_logits, the sac flag passed to
sample_nondeterministic_logprobs used self.ago instead of self.algo.
Because self.ago does not exist, this always resolved to False, meaning
the tanh-squash correction was never applied for SAC in this code path.
1 parent 61b937d commit ec3778b
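For context, here is a minimal sketch of what a tanh-squash correction typically does to a log-probability, and what is lost when the `sac` flag is False. It is not the repository's code: the function names, the scalar Gaussian, and the `1e-6` stabilizer are assumptions made for illustration only.

```python
import math


def gaussian_log_prob(u, mean, std):
    """Log-density of a scalar Gaussian N(mean, std^2) evaluated at u."""
    var = std * std
    return -0.5 * ((u - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def action_log_prob(u, mean, std, sac):
    """Log-prob of the squashed action a = tanh(u).

    Hypothetical helper: the change-of-variables term is applied
    only when `sac` is True, mirroring the flag described above.
    """
    log_prob = gaussian_log_prob(u, mean, std)
    if sac:
        # log|da/du| = log(1 - tanh(u)^2); subtracting it turns the
        # density of u into the density of a. The 1e-6 is an assumed
        # stabilizer to keep the log argument away from zero.
        log_prob -= math.log(1.0 - math.tanh(u) ** 2 + 1e-6)
    return log_prob


u = 0.5
uncorrected = action_log_prob(u, mean=0.0, std=1.0, sac=False)
corrected = action_log_prob(u, mean=0.0, std=1.0, sac=True)
```

With the flag stuck at False, the caller receives `uncorrected`, the log-density of the pre-squash sample rather than of the action actually taken.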
1 file changed
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 111 | 111 | |
| 112 | 112 | |
| 113 | 113 | |
| 114 | | - |
| | 114 | + |
| 115 | 115 | |
| 116 | 116 | |
| 117 | 117 | |