fix: restore parameters on TRPO line search failure #1287

Open
Mr-Neutr0n wants to merge 1 commit into thu-ml:master from
Mr-Neutr0n:fix/trpo-line-search-restore-params
@Mr-Neutr0n
Bug

When the TRPO line search fails to find a step that satisfies the KL constraint, the original policy parameters are not restored. This leaves the policy with an untested update that violates the trust region.

Fix

Added parameter restoration to the line search fallback path, so a failed search leaves the policy at its original parameters.
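
For illustration only, here is a minimal sketch of the pattern this fix describes. It is not the diff from this PR and not the library's API: `ToyPolicy`, `backtracking_line_search`, `kl_fn`, `loss_fn`, and the default values are all hypothetical, and the acceptance test (KL below the bound and a lower loss) is an assumption about how such a search is typically written.

```python
from typing import Callable, List


class ToyPolicy:
    """Hypothetical stand-in for a policy with flat parameters."""

    def __init__(self, params: List[float]) -> None:
        self.params = list(params)
        self.ref = list(params)  # parameters of the "old" policy

    def set_params(self, params: List[float]) -> None:
        self.params = list(params)

    def kl(self) -> float:
        # Toy divergence: half the squared distance to the old parameters.
        return 0.5 * sum((p - r) ** 2 for p, r in zip(self.params, self.ref))

    def loss(self) -> float:
        # Toy surrogate loss: squared norm of the parameters.
        return sum(p * p for p in self.params)


def backtracking_line_search(
    old_params: List[float],
    direction: List[float],
    set_params: Callable[[List[float]], None],
    kl_fn: Callable[[], float],
    loss_fn: Callable[[], float],
    max_kl: float = 0.01,
    backtrack_coeff: float = 0.8,
    max_backtracks: int = 10,
) -> bool:
    """Try shrinking steps along `direction`; restore old params on failure."""
    old_params = list(old_params)
    set_params(old_params)
    old_loss = loss_fn()
    step = 1.0
    for _ in range(max_backtracks):
        candidate = [p + step * d for p, d in zip(old_params, direction)]
        set_params(candidate)
        # Assumed acceptance rule: inside the trust region and improved loss.
        if kl_fn() < max_kl and loss_fn() < old_loss:
            return True
        step *= backtrack_coeff
    # Fallback path: no candidate was accepted, so undo the last trial step.
    set_params(old_params)
    return False


policy = ToyPolicy([1.0, -2.0])
accepted = backtracking_line_search(
    policy.params, [1.0, -1.0], policy.set_params, policy.kl, policy.loss
)
```

In this toy run the direction `[1.0, -1.0]` only increases the loss, so every candidate is rejected and the final `set_params(old_params)` call is what keeps the policy from retaining the last rejected candidate.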
