Bolt bug: openmaps Part 1 (fixes small bugs, improves prompts, culls noise) #17
…ncy-enforcing state machine
openmaps Part 1
text = """
You have concluded the analysis.

IMPORTANT: NOW review, then implement the hypothesized changes using tools. The code is available in the workspace. Start by answering these questions:
NOTE: This self-consistency check helps tremendously. I have another TODO that'll fortify this aspect of our state machine.
@@ -173,6 +173,10 @@
         parameters={
             'type': 'object',
             'properties': {
+                'problem': {
NOTE: The added `problem` property is another self-consistency measure that improves accuracy.
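For illustration, a parameter schema that makes the model restate the problem could look like the sketch below. Only the `'type': 'object'`, `'properties'`, and `'problem'` keys come from the diff; the `hypothesis` property, the descriptions, the `required` list, and the `missing_fields` helper are assumptions made for this example.

```python
# Hypothetical parameter schema for a hypothesis-submission tool.
# Only 'type', 'properties' and 'problem' appear in the diff; the rest is assumed.
SUBMIT_HYPOTHESIS_PARAMETERS = {
    'type': 'object',
    'properties': {
        'problem': {
            'type': 'string',
            'description': 'Short statement of the bug being fixed.',
        },
        'hypothesis': {
            'type': 'string',
            'description': 'Proposed explanation and plan for the fix.',
        },
    },
    'required': ['problem', 'hypothesis'],
}


def missing_fields(arguments: dict, schema: dict) -> list:
    """Return the required fields that are absent or empty in a tool call."""
    return [name for name in schema.get('required', []) if not arguments.get(name)]
```

A call that omits `problem` can then be rejected before the hypothesis is accepted, which is one way to enforce the restatement.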
@@ -312,11 +312,13 @@ async def _handle_observation(self, observation: Observation) -> None:
         elif isinstance(observation, ReplayPhaseUpdateObservation):
             new_phase = observation.new_phase
             if self.state.replay_phase == new_phase:
-                raise ValueError(
-                    f'Unexpected ReplayPhaseUpdateAction. Already in phase: {new_phase}'
+                self.log(
NOTE: Since we ask it to verify its own hypothesis, it sometimes tries to re-submit it, which takes us here. No reason to throw in that case. (Although it is weird, because it technically no longer has access to the tool. Might want to debug why it can even do this sometime.)
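The log-instead-of-raise behavior can be sketched in isolation as below. This is a minimal stand-in, not the controller code: the `State` dataclass, the `apply_phase_update` helper, the phase names, and the logger name are all hypothetical.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger('replay_phase_sketch')  # hypothetical logger name


@dataclass
class State:
    """Hypothetical stand-in for the controller state."""
    replay_phase: str = 'normal'


def apply_phase_update(state: State, new_phase: str) -> bool:
    """Switch to new_phase; log and return False if already in that phase."""
    if state.replay_phase == new_phase:
        # A repeated submission is tolerated: warn instead of raising.
        logger.warning(
            'Unexpected phase update. Already in phase: %s', new_phase
        )
        return False
    state.replay_phase = new_phase
    return True
```

Returning a boolean lets the caller skip any follow-up work for a duplicate update without treating it as an error.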
asyncio.run(
    run_controller(
        config=config,
        initial_user_action=initial_user_action,
        sid=sid,
        exit_on_message=True,
        fake_user_response_fn=fake_user_response,
NOTE: Sometimes it enters the `AWAITING_USER_INPUT` state. This fixes it. `resolve_issue.py` does the same.
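As a rough sketch of the idea: a canned reply is returned whenever the agent stops to wait for input, so an unattended run keeps going. The reply text, the `AgentState` enum, and the `next_message` helper are assumptions; the actual `fake_user_response` body is not shown in this diff.

```python
from enum import Enum
from typing import Callable, Optional


class AgentState(Enum):
    """Hypothetical subset of agent states, for illustration only."""
    RUNNING = 'running'
    AWAITING_USER_INPUT = 'awaiting_user_input'


def fake_user_response(state: Optional[object] = None) -> str:
    """Canned reply used in place of a human (wording is an assumption)."""
    return (
        'Please continue working on the task using the approach you think '
        'is best. Do not ask for human input.'
    )


def next_message(
    agent_state: AgentState,
    respond: Callable[[Optional[object]], str],
) -> Optional[str]:
    """Return an automatic reply if the agent is waiting on the user."""
    if agent_state is AgentState.AWAITING_USER_INPUT:
        return respond(None)
    return None
```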
-suffix = f'* This bug had a timetravel debugger recording.\n* Use below `Initial Analysis` and the timetravel debugger `inspect-*` tools to find the bug.\n* Once found, `submit-hypothesis`, so your analysis can be used to implement the solution.\n\n## Initial Analysis\n{json.dumps(data, indent=2)}'
+suffix = """
+# Instructions
+0. Take a look at below `Initial Analysis`, based on a recorded trace of the bug. Pay special attention to `IMPORTANT_NOTES`.
NOTEs: This works!
- It indeed pays very careful attention to `IMPORTANT_NOTES`, which we can now add into analysis data.
- It does state the main problem and propose a plan (again: for self-consistency).
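One way to add `IMPORTANT_NOTES` into the analysis data and render the prompt suffix is sketched below. The helper names `with_important_notes` and `build_suffix` are hypothetical, the instruction text is abbreviated to the one step visible in the diff, and placing the notes first in the JSON is an assumption, not something the PR states.

```python
import json


def with_important_notes(data: dict, notes: list) -> dict:
    """Return a copy of the analysis data with IMPORTANT_NOTES as the first key."""
    return {'IMPORTANT_NOTES': list(notes), **data}


def build_suffix(data: dict) -> str:
    """Render the instruction header followed by the analysis as indented JSON."""
    header = (
        '# Instructions\n'
        '0. Take a look at below `Initial Analysis`, based on a recorded trace '
        'of the bug. Pay special attention to `IMPORTANT_NOTES`.\n'
    )
    return f'{header}\n## Initial Analysis\n{json.dumps(data, indent=2)}'
```

Usage would be along the lines of `build_suffix(with_important_notes(data, ['...']))`, where `data` is the analysis dict that the existing code already passes to `json.dumps`.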
@@ -145,7 +145,9 @@ def setup_initial_env(self) -> None:
             {
                 'REPLAY_API_KEY': self.config.replay.api_key,
                 'REPLAY_DEV_MODE': os.environ.get('REPLAY_DEV_MODE', ''),
-                'REPLAY_ENABLE_TOOL_CACHE': os.environ.get('REPLAY_DEV_MODE', ''),
+                'REPLAY_ENABLE_TOOL_CACHE': os.environ.get(
+                    'REPLAY_ENABLE_TOOL_CACHE', ''
(NOTE: Whoops. `REPLAY_ENABLE_TOOL_CACHE` was being read from `REPLAY_DEV_MODE`.)
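A copy-paste slip like this is easier to avoid when the variable name is written only once per entry. The sketch below is one possible shape, not what the PR does: `PASSTHROUGH_VARS`, `replay_env`, and the injectable `environ` argument are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical list of variables forwarded unchanged into the runtime.
PASSTHROUGH_VARS = ('REPLAY_DEV_MODE', 'REPLAY_ENABLE_TOOL_CACHE')


def replay_env(
    api_key: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Build the Replay-related environment, reading each flag from its own variable."""
    source = os.environ if environ is None else environ
    env = {'REPLAY_API_KEY': api_key}
    for name in PASSTHROUGH_VARS:
        # The same name is used as key and lookup, so the two cannot drift apart.
        env[name] = source.get(name, '')
    return env
```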
Fixes small bugs, improves prompts, culls noise.