Bolt bug: openmaps Part 1 (fixes small bugs, improves prompts, culls noise) #17

Merged: 2 commits, Jan 21, 2025

@Domiii Domiii commented Jan 20, 2025

@Domiii Domiii self-assigned this Jan 20, 2025
@Domiii Domiii changed the title from "Bolt bug: openmaps Part 1" to "Bolt bug: openmaps Part 1 (fixes small bugs, improves prompts, culls noise)" Jan 20, 2025
text = """
You have concluded the analysis.

IMPORTANT: NOW review, then implement the hypothesized changes using tools. The code is available in the workspace. Start by answering these questions:

NOTE: This self-consistency check helps tremendously. I have another TODO that'll fortify this aspect of our state machine.

@@ -173,6 +173,10 @@
parameters={
'type': 'object',
'properties': {
'problem': {

NOTE: Accuracy-improving self-consistency.

@@ -312,11 +312,13 @@ async def _handle_observation(self, observation: Observation) -> None:
         elif isinstance(observation, ReplayPhaseUpdateObservation):
             new_phase = observation.new_phase
             if self.state.replay_phase == new_phase:
-                raise ValueError(
-                    f'Unexpected ReplayPhaseUpdateAction. Already in phase: {new_phase}'
+                self.log(

NOTE: Since we ask it to verify its own hypothesis, it sometimes tries to re-submit it, which takes us here. No reason to throw in that case. (It is weird, though, because it technically no longer has access to the tool. Might want to debug why it can even do this sometime.)
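
To illustrate the change described in the note above, here is a minimal standalone sketch of a phase update that logs a repeated update instead of raising. `State`, `apply_phase_update`, and the phase names are hypothetical stand-ins for illustration, not the project's actual classes.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class State:
    # Hypothetical stand-in for the controller state; only the phase is modeled.
    replay_phase: str = 'normal'


def apply_phase_update(state: State, new_phase: str) -> bool:
    """Apply a phase update; return True only if the phase actually changed."""
    if state.replay_phase == new_phase:
        # A repeated update (e.g. a re-submitted hypothesis) is logged, not raised.
        logger.warning(
            'Ignoring repeated phase update. Already in phase: %s', new_phase
        )
        return False
    state.replay_phase = new_phase
    return True
```

Returning a boolean is one possible design choice: it lets the caller skip any follow-up work when the update was a no-op.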

asyncio.run(
run_controller(
config=config,
initial_user_action=initial_user_action,
sid=sid,
exit_on_message=True,
fake_user_response_fn=fake_user_response,

NOTE: Sometimes the agent enters the AWAITING_USER_INPUT state. Passing fake_user_response_fn fixes it; resolve_issue.py does the same.
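
As a rough sketch of the idea in this note, the snippet below shows how a canned reply can stand in for a human whenever the agent waits for input. The `AgentState` enum, the `drive` loop, and the reply text are all assumptions made for illustration; the real controller loop and the real `fake_user_response` are not shown in this PR excerpt.

```python
from enum import Enum
from typing import Callable, Iterable, List


class AgentState(Enum):
    # Hypothetical, simplified set of states for this sketch.
    RUNNING = 'running'
    AWAITING_USER_INPUT = 'awaiting_user_input'
    FINISHED = 'finished'


def fake_user_response(state: AgentState) -> str:
    """Return a canned reply so a headless run never waits on a human."""
    return 'Please continue working on the task. Do not ask for human help.'


def drive(
    states: Iterable[AgentState],
    respond: Callable[[AgentState], str],
) -> List[str]:
    """Walk a scripted sequence of states, auto-answering every input request."""
    replies = []
    for state in states:
        if state is AgentState.FINISHED:
            break
        if state is AgentState.AWAITING_USER_INPUT:
            replies.append(respond(state))
    return replies
```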

-suffix = f'* This bug had a timetravel debugger recording.\n* Use below `Initial Analysis` and the timetravel debugger `inspect-*` tools to find the bug.\n* Once found, `submit-hypothesis`, so your analysis can be used to implement the solution.\n\n## Initial Analysis\n{json.dumps(data, indent=2)}'
+suffix = """
+# Instructions
+0. Take a look at below `Initial Analysis`, based on a recorded trace of the bug. Pay special attention to `IMPORTANT_NOTES`.

NOTEs: This works!

  • It indeed pays very careful attention to IMPORTANT_NOTES, which we can now add into analysis data.
  • It does state the main problem and propose a plan (again: for self-consistency).
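
The notes above can be sketched as a small prompt builder. This is a sketch under stated assumptions: `build_suffix` is a hypothetical helper, only instruction step 0 is visible in the diff (later steps are omitted here rather than guessed), and the placement of `IMPORTANT_NOTES` as a key inside the analysis data is an assumption.

```python
import json


def build_suffix(data: dict) -> str:
    """Build the prompt suffix: instructions first, then the analysis as JSON."""
    # Only step 0 appears in the diff; any further steps are intentionally left out.
    instructions = (
        '# Instructions\n'
        '0. Take a look at below `Initial Analysis`, based on a recorded trace '
        'of the bug. Pay special attention to `IMPORTANT_NOTES`.\n'
    )
    return f'{instructions}\n## Initial Analysis\n{json.dumps(data, indent=2)}'


# Hypothetical analysis payload; the keys are illustrative only.
example = build_suffix({'IMPORTANT_NOTES': ['Example note.'], 'summary': 'Example.'})
```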

@@ -145,7 +145,9 @@ def setup_initial_env(self) -> None:
             {
                 'REPLAY_API_KEY': self.config.replay.api_key,
                 'REPLAY_DEV_MODE': os.environ.get('REPLAY_DEV_MODE', ''),
-                'REPLAY_ENABLE_TOOL_CACHE': os.environ.get('REPLAY_DEV_MODE', ''),
+                'REPLAY_ENABLE_TOOL_CACHE': os.environ.get(
+                    'REPLAY_ENABLE_TOOL_CACHE', ''

(NOTE: Whoops. REPLAY_ENABLE_TOOL_CACHE was being read from REPLAY_DEV_MODE.)
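
One way to make this kind of copy-paste slip harder, sketched here rather than taken from the PR, is to derive each forwarded value from its own name. `passthrough_env` and `PASSTHROUGH_VARS` are hypothetical names introduced for illustration.

```python
import os
from typing import Dict, Iterable, Mapping, Optional

# Variables forwarded unchanged into the runtime environment (illustrative list).
PASSTHROUGH_VARS = ('REPLAY_DEV_MODE', 'REPLAY_ENABLE_TOOL_CACHE')


def passthrough_env(
    names: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Read each variable from its own name, defaulting to an empty string."""
    source = os.environ if environ is None else environ
    return {name: source.get(name, '') for name in names}
```

Because the key and the lookup name come from the same loop variable, one variable cannot silently pick up another variable's value.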

@Domiii Domiii merged commit bb77b47 into main Jan 21, 2025
7 of 9 checks passed
@Domiii Domiii deleted the dominik/pro-951-bolt-bug-openmaps-1 branch January 21, 2025 11:50