@DavidePaglieri DavidePaglieri commented Jan 21, 2025

Change the feedback message shown for invalid actions. The previous message caused problems for CoT models, which got confused when they saw their own thinking replicated in the observation.

Also adds an option to disable the feedback entirely by passing the argument eval.feedback_on_invalid_action=False.
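To illustrate the intended behaviour, here is a minimal sketch and not the code from this PR. The function name `invalid_action_feedback`, its parameters, and the message text are all hypothetical; only the flag name `feedback_on_invalid_action` comes from the description above. The point is that the feedback never echoes the raw model output, and that it can be switched off.

```python
from typing import Optional, Sequence


def invalid_action_feedback(
    action: str,
    valid_actions: Sequence[str],
    feedback_on_invalid_action: bool = True,
) -> Optional[str]:
    """Return feedback text for an invalid action, or None if none should be shown.

    Hypothetical helper: the name, signature, and message are assumptions
    made for illustration, not the repository's actual implementation.
    """
    if action in valid_actions:
        # Valid actions need no feedback.
        return None
    if not feedback_on_invalid_action:
        # Feedback disabled, mirroring eval.feedback_on_invalid_action=False.
        return None
    # Deliberately avoid including `action` in the message: for a CoT model
    # it may contain the whole reasoning trace, which would then reappear
    # in the next observation.
    return "Your previous output did not contain a valid action."


if __name__ == "__main__":
    raw_output = "I think I should fly over the wall"
    print(invalid_action_feedback(raw_output, ["north", "south"]))
    print(invalid_action_feedback(raw_output, ["north", "south"], feedback_on_invalid_action=False))
```

With the flag left at its default, the agent sees a fixed message on the next step; with the flag set to False, nothing is added to the observation.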

@DavidePaglieri DavidePaglieri merged commit c347aa8 into main Jan 28, 2025
4 checks passed
@DavidePaglieri DavidePaglieri deleted the feat/feedback branch February 27, 2025 11:09