Add my custom ci-bot variant #84
Merged
+183 −9
6 commits
8872e0d Add my custom ci-bot variant (Aaron1011)
636b527 experimentation? (virajmehta)
b765b83 Merge branch 'main' of github.com:tensorzero/experimental-ci-bot into… (virajmehta)
e43ca85 added all experimentation (virajmehta)
9c3cbc3 config file passes validation (virajmehta)
01e137e removed changes to original config (virajmehta)
24 changes: 24 additions & 0 deletions
tensorzero/swe_agent_config/aaron_templates/action_observation.minijinja
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| <returncode>{{output.returncode}}</returncode> | ||
| {% if output.output | length < 10000 -%} | ||
| <output> | ||
| {{ output.output -}} | ||
| </output> | ||
| {%- else -%} | ||
| <warning> | ||
| The output of your last command was too long. | ||
| Please try a different command that produces less output. | ||
| If you're looking at a file, use head, tail, or sed to view a smaller number of lines selectively. | ||
| If you're using grep or find and it produced too much output, use a more selective search pattern. | ||
| If you really need to see something from the full output, redirect it to a file and then search in that file. | ||
| </warning> | ||
| {%- set elided_chars = output.output | length - 10000 -%} | ||
| <output_head> | ||
| {{ output.output[:5000] }} | ||
| </output_head> | ||
| <elided_chars> | ||
| {{ elided_chars }} characters elided | ||
| </elided_chars> | ||
| <output_tail> | ||
| {{ output.output[-5000:] }} | ||
| </output_tail> | ||
| {%- endif -%} |
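For reviewers who want to sanity-check the truncation arithmetic in this template, here is a minimal Python sketch of the same branching. It is an approximation only: `render_observation`, `LIMIT`, and `KEEP` are hypothetical names, the warning text is shortened, and the exact whitespace produced by the template's `-` trim markers is not reproduced.

```python
LIMIT = 10_000  # threshold from the template's `if` condition
KEEP = 5_000    # size of the head and tail slices the template keeps


def render_observation(returncode: int, output: str) -> str:
    """Approximate the template: full output when short, head and tail when long."""
    if len(output) < LIMIT:
        return f"<returncode>{returncode}</returncode>\n<output>\n{output}</output>"
    # Same arithmetic as `output.output | length - 10000` in the template.
    elided_chars = len(output) - LIMIT
    return "\n".join(
        [
            f"<returncode>{returncode}</returncode>",
            "<warning>output too long; see head and tail</warning>",
            f"<output_head>\n{output[:KEEP]}\n</output_head>",
            f"<elided_chars>\n{elided_chars} characters elided\n</elided_chars>",
            f"<output_tail>\n{output[-KEEP:]}\n</output_tail>",
        ]
    )
```

Because the head and tail slices are 5000 characters each, the elided count equals the total length minus 10000 whenever the long branch is taken, so no character is shown twice.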
23 changes: 23 additions & 0 deletions
tensorzero/swe_agent_config/aaron_templates/format_error.minijinja
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| Please always provide EXACTLY ONE action in triple backticks, found {{actions|length}} actions. | ||
|
|
||
| If you want to end the task, use the completion command: | ||
|
|
||
| ```bash | ||
| echo "COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT | ||
| REASONING: [Your reasoning here]" | ||
| ``` | ||
|
|
||
| Do not combine the completion command with any other command. | ||
|
|
||
| Otherwise, format your response exactly as follows: | ||
|
|
||
| <response_example> | ||
| THOUGHT: Your reasoning about why you want to perform this action. | ||
|
|
||
| ```bash | ||
| <your_command_here> | ||
| ``` | ||
| </response_example> | ||
|
|
||
| Note: In rare cases, if you need to reference triple backticks in your command, proceed in two steps: | ||
| first write TRIPLEBACKTICKSBASH, then replace it with ```bash in a subsequent command. |
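The `actions` variable rendered by this template is not defined anywhere in the diff. A plausible reading is that the harness collects every bash-fenced block in the model's response and sends this message when the count is not exactly one. The sketch below illustrates that idea under that assumption; `extract_actions` and `check_format` are hypothetical helpers, not functions from this repository.

```python
import re
from typing import List, Optional

FENCE = "`" * 3  # built programmatically to keep this example readable

# Assumed pattern: a block opened with the fence plus "bash" and closed by a bare fence.
ACTION_RE = re.compile(FENCE + r"bash\s*\n(.*?)\n" + FENCE, re.DOTALL)


def extract_actions(response: str) -> List[str]:
    """Return the body of every bash-fenced block found in the response."""
    return [body.strip() for body in ACTION_RE.findall(response)]


def check_format(response: str) -> Optional[str]:
    """Return None for exactly one action, otherwise the first line of the error message."""
    actions = extract_actions(response)
    if len(actions) == 1:
        return None
    return (
        "Please always provide EXACTLY ONE action in triple backticks, "
        f"found {len(actions)} actions."
    )
```

Under this reading, a response with no fenced command and a response with two commands both trigger the same template, and only the reported count differs.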
89 changes: 89 additions & 0 deletions
tensorzero/swe_agent_config/aaron_templates/instance.minijinja
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| {{task}} | ||
|
|
||
| ## Your Mission | ||
|
|
||
| Your goal is to: | ||
| 1. Read and understand the CI failure information provided in `ci_failure_context.md` | ||
| 2. Make targeted fixes to resolve the failing tests/checks | ||
| 3. Validate your fixes by running the appropriate tests locally | ||
|
|
||
| ## Validation Requirements | ||
|
|
||
| After making changes, you MUST validate them by running: | ||
| - The specific failing tests (to ensure they now pass) | ||
| - Linters and formatters (eslint, prettier, black, ruff, cargo fmt, etc.) | ||
| - The build process (npm run build, cargo build, etc.) | ||
| - Language-specific checks (cargo check, cargo clippy, tsc --noEmit, etc.) | ||
|
|
||
| Your response must contain exactly ONE bash code block with ONE command (or commands connected with && or ||). | ||
| Include a THOUGHT section before your command where you explain your reasoning process. | ||
| Format your response as shown in <format_example>. | ||
|
|
||
| <format_example> | ||
| Your reasoning and analysis here. Explain why you want to perform the action. | ||
|
|
||
| ```bash | ||
| your_command_here | ||
| ``` | ||
| </format_example> | ||
|
|
||
| Failure to follow these rules will cause your response to be rejected. | ||
|
|
||
| ## Completion Signal | ||
|
|
||
| When you are done and have validated your fix, signal completion: | ||
|
|
||
| ```bash | ||
| echo "COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT | ||
| REASONING: Brief explanation of the changes you made and what you fixed" | ||
| ``` | ||
|
|
||
| Do not combine the completion command with any other command. | ||
|
|
||
| ## CI Failure Information | ||
|
|
||
| The CI failure details are available in the file `ci_failure_context.md` in the current directory. | ||
| Read this file first to understand what failed and why. | ||
|
|
||
| ## Recommended Workflow | ||
|
|
||
| Work step-by-step to ensure you can iterate on your changes and catch any problems: | ||
|
|
||
| 1. **Read the CI failure context** - `cat ci_failure_context.md` | ||
| 2. **Analyze the codebase** - Find and read relevant files mentioned in the failure | ||
| 3. **Understand the root cause** - Identify why the tests/checks are failing | ||
| 4. **Create a reproduction script** (if applicable) - Verify you can reproduce the failure locally | ||
| 5. **Make targeted fixes** - Edit the source code to resolve the issue | ||
| 6. **Run validation** - Execute the failing tests, linters, and build to verify your fix | ||
| 7. **Iterate if needed** - If validation fails, debug and fix until all checks pass | ||
| 8. **Submit your work** - Signal completion using the completion signal | ||
|
|
||
| ## Important Rules | ||
|
|
||
| 1. Every response must contain exactly one action in triple backticks | ||
| 2. Directory or environment variable changes are not persistent - every action runs in a new subshell | ||
| 3. You can prefix commands with environment variables or directory changes: `cd /path && command` | ||
| 4. If a command needs more time, add '# timeout: <seconds>' on the first line (max {{max_timeout}} seconds). | ||
|
|
||
| <system_information> | ||
| {{system}} {{release}} {{version}} {{machine}} | ||
| </system_information> | ||
|
|
||
| ## Example Session | ||
|
|
||
| <example_response> | ||
| THOUGHT: I need to first read the CI failure context to understand what went wrong in the pull request. | ||
|
|
||
| ```bash | ||
| cat ci_failure_context.md | ||
| ``` | ||
| </example_response> | ||
|
|
||
| ## With max_timeout | ||
| ```bash | ||
| # timeout: 300 | ||
| uv run expensive_script.py | ||
| ``` | ||
|
|
||
| Now begin your work! | ||
| Do not commit to git, just signal completion when you are happy with the state of the project. | ||
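The prompt above relies on two conventions that the harness has to interpret: a `# timeout: <seconds>` comment on the first line of a command, capped at `max_timeout`, and a completion marker on the first line of the command's output. The PR does not show how either is parsed, so the following is only a sketch of one way it could work; `parse_timeout`, `parse_completion`, and the `default` argument are assumptions made for illustration.

```python
import re
from typing import Optional

COMPLETION_MARKER = "COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT"
TIMEOUT_RE = re.compile(r"#\s*timeout:\s*(\d+)")


def parse_timeout(command: str, default: int, max_timeout: int) -> int:
    """Read '# timeout: <seconds>' from the first line, capped at max_timeout."""
    lines = command.lstrip().splitlines()
    first_line = lines[0].strip() if lines else ""
    match = TIMEOUT_RE.match(first_line)
    if match is None:
        return default  # no directive: fall back to the (hypothetical) default
    return min(int(match.group(1)), max_timeout)


def parse_completion(output: str) -> Optional[str]:
    """Return the text after the marker line, or None if the output is not a completion."""
    lines = output.lstrip().splitlines()
    if not lines or lines[0].strip() != COMPLETION_MARKER:
        return None
    return "\n".join(lines[1:]).strip()
```

For the `# timeout: 300` example in the template, `parse_timeout` would return 300 as long as `max_timeout` is at least 300. Checking only the first output line is one reason the prompt forbids combining the completion command with any other command: any earlier output would push the marker off the first line.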
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| You are an expert software engineer helping to fix CI failures in a GitHub pull request. | ||
virajmehta marked this conversation as resolved.