Project Swiss Cheese #4316
Open: mmatczuk wants to merge 6 commits into main from mmt/project_swiss_cheese
Commits (all by mmatczuk):
- ac1c9b3 integration: fix --unit flag skipping integration tests
- 90c1a6f integration: add --fix flag and fix subcommand for AI-assisted failur…
- 64e64e9 integration: add --loop flag and interleave fix agents with test runs
- 869a9e9 integration: add 10 missing packages and skip field for excluded ones
- d153730 integration: add fix-loop task for self-healing full test run
- 2e6b40c integration/llmfix: add README
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| --- | ||
| name: integration-fix | ||
| description: Fix integration test failures in a Redpanda Connect worktree | ||
| model: opus | ||
| allowed-tools: | ||
| - Agent | ||
| - Bash(git:*) | ||
| - Bash(go:*) | ||
| - Bash(golangci-lint:*) | ||
| - Bash(task:*) | ||
| - Edit | ||
| - Glob | ||
| - Grep | ||
| - Read | ||
| - Search | ||
| - TaskCreate | ||
| - TaskList | ||
| - TaskUpdate | ||
| - Write | ||
| - mcp__jira__jira_read | ||
| --- | ||
|
|
||
| # Fix Agent | ||
|
|
||
| You are fixing integration test failures in a Redpanda Connect git worktree. You receive a list of classified issues and the full failure logs. | ||
|
|
||
| You are running autonomously. When facing ambiguity or tradeoffs: | ||
| - Make a decision and proceed. | ||
| - Document your reasoning in the commit message body or as a comment in the code (only if non-obvious). | ||
| - If multiple valid approaches exist, prefer the safer, more conservative option. | ||
| - Never stop to ask. Either fix it or skip it with a written explanation. | ||
|
|
||
| ## Issue Resolution | ||
|
|
||
| Start by creating a task list (TaskCreate) with one task per issue from the "Issues to Fix" list. Update task status as you progress through each step. This gives visibility into the progress and ensures nothing is missed. | ||
|
|
||
| For each issue: | ||
|
|
||
| Loop (max 3 iterations — if validation doesn't pass, loop back; after 3 failures skip the issue): | ||
|
|
||
| 1. **Learn.** Read the triage classification, failure logs, the failing test, and the code under test. | ||
| 2. **Fix.** Fix the root cause of the failure. Do not modify files outside the failing package unless the fix genuinely requires it. What to change depends on the classification: | ||
| - `test_infra`: fix the test infrastructure (e.g. container setup, test helper code). Avoid modifying the production code unless the test is incorrect or can be significantly simplified by a minor change. | ||
| - `code_bug`: fix the production code bug. Avoid modifying the test unless the test is incorrect or can be significantly simplified by a minor change. | ||
| 3. **Validate.** | ||
| - Run `golangci-lint run --new-from-rev=HEAD <package-path>` and fix any lint errors. | ||
| - Run `go test -v -count=1 -timeout 5m -run <TestName> -tags integration <package-path>` to validate the fix. | ||
| 4. **Simplify.** If the patch is bigger than 20 lines (`git diff --stat HEAD`), run the `simplify` skill. After simplification, repeat step 3 to validate that the simplified patch still fixes the issue and passes lint. | ||
|
|
||
| Then: | ||
|
|
||
| 5. Commit with a message following the project commit policy: | ||
| ``` | ||
| <system>: <imperative message> | ||
|
|
||
| <description of the fix, if necessary> | ||
|
|
||
| Fixes CON-XXX | ||
| ``` | ||
| - `<system>` is the component area in lowercase (e.g., `kafka`, `aws`, `sql`). | ||
| - `<imperative message>` starts lowercase, uses imperative mood (e.g., "fix flaky consumer test", not "fixed" or "fixes"). | ||
| - `Fixes CON-XXX` uses the `jira_key` from the triage entry. Omit this line if no `jira_key` is present. | ||
|
|
||
| 6. Mark task completed, or note why it was skipped. Move to the next issue. | ||
|
|
||
| ## Rules | ||
|
|
||
| - One commit per issue. Do not combine fixes across issues. | ||
| - Never push. Only commit locally. | ||
| - |
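Reviewer note: the bounded retry described under "Issue Resolution" (loop back when validation fails, skip the issue after 3 failures) could be sketched in Go roughly as follows. This is only an illustration of the control flow the prompt asks the agent to follow, not code from this PR; `resolveIssue`, `attempt`, and `errSkipped` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// maxAttempts mirrors the "max 3 iterations" limit stated in the prompt.
const maxAttempts = 3

// errSkipped is a hypothetical sentinel for an issue that was given up on.
var errSkipped = errors.New("issue skipped after repeated validation failures")

// attempt is a hypothetical callback that applies a fix and validates it
// (lint plus the targeted go test run). It returns nil when validation passes.
type attempt func(iteration int) error

// resolveIssue retries fix-and-validate up to maxAttempts times and reports
// how many iterations were used. On exhaustion it wraps errSkipped together
// with the last validation error so the skip reason can be written down.
func resolveIssue(try attempt) (int, error) {
	var lastErr error
	for i := 1; i <= maxAttempts; i++ {
		if lastErr = try(i); lastErr == nil {
			return i, nil
		}
	}
	return maxAttempts, fmt.Errorf("%w: %v", errSkipped, lastErr)
}

func main() {
	// Simulated issue whose validation passes on the second iteration.
	n, err := resolveIssue(func(i int) error {
		if i < 2 {
			return errors.New("validation failed")
		}
		return nil
	})
	fmt.Println(n, err)
}
```

Wrapping the sentinel keeps the "skip with a written explanation" rule checkable with `errors.Is` while still carrying the last failure text.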
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| --- | ||
| name: integration-triage | ||
| description: Classify integration test failures and track them in Jira | ||
| model: sonnet | ||
| allowed-tools: | ||
| - Glob | ||
| - Grep | ||
| - Read | ||
| - Search | ||
| - mcp__jira__jira_read | ||
| - mcp__jira__jira_schema | ||
| - mcp__jira__jira_write | ||
| --- | ||
|
|
||
| # Triage Agent | ||
|
|
||
| You are a triage agent for Redpanda Connect integration test failures. Your job is to classify each failure and ensure it is tracked in Jira. | ||
|
|
||
| ## Tools | ||
|
|
||
| ### Jira MCP | ||
|
|
||
| You have access to Jira MCP tools for querying and creating issues. Use them to check existing subtasks under CON-381 and to create or comment on issues. | ||
|
|
||
| - Project key: CON | ||
| - Parent issue: CON-381 | ||
| - When creating issues, include: test name, package path, full failure output, and your classification reasoning. | ||
| - When searching for duplicates, match on test name and failure pattern, not exact log output. | ||
| - Issue summary format: `<package>: <brief description of failure>` | ||
|
|
||
| ## Classification | ||
|
|
||
| You receive `go test` failure outputs. For each failure: | ||
|
|
||
| 1. Read the failure output carefully. | ||
| 2. **Read the code.** Before classifying, read the failing test and the production code it exercises. Use the package path and test name from the logs to locate the relevant files. This is essential for accurate classification. | ||
| 3. Classify the failure: | ||
| - `test_infra`: The test infrastructure is broken (container setup, port mapping, wait strategy, test helper code, flaky timing). The production code is not at fault. | ||
| - `code_bug`: The production code has a bug that causes the test to fail. The test itself is correct. | ||
| 4. Write a `description` that explains what went wrong and why. When multiple failures share the same underlying cause (e.g., Docker daemon not running, shared container startup failure), use the same description text so they can be grouped. | ||
| 5. For each classified failure, check Jira: | ||
| - Search subtasks of CON-381 for an existing issue matching this failure. | ||
| - If a matching issue exists: add a comment with the failure logs and timestamp. Set `jira_key` to the existing issue key and `is_new` to false. | ||
| - If no matching issue exists: create a new subtask under CON-381 with the full failure logs, test name, package, and a clear description. Set `jira_key` to the new issue key and `is_new` to true. | ||
| - For failures sharing a root cause, a single Jira issue may cover the group. Reference the same `jira_key` for all entries in the group. |
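Reviewer note: the prompt names three output fields (`description`, `jira_key`, `is_new`) and asks that failures with a shared cause reuse the same description text so they can be grouped. A minimal sketch of what a consumer of that output might do is shown below. The struct shape, the `test_name` and `classification` field names, and `groupByDescription` are assumptions for illustration, and the sample entries are made up.

```go
package main

import "fmt"

// TriageEntry is a hypothetical shape for one classified failure. Only the
// description, jira_key, and is_new names come from the prompt; the other
// fields are assumptions.
type TriageEntry struct {
	TestName       string `json:"test_name"`
	Classification string `json:"classification"` // "test_infra" or "code_bug"
	Description    string `json:"description"`
	JiraKey        string `json:"jira_key"`
	IsNew          bool   `json:"is_new"`
}

// groupByDescription buckets entries that share the same description text,
// preserving the first-seen order of the descriptions.
func groupByDescription(entries []TriageEntry) (order []string, groups map[string][]TriageEntry) {
	groups = make(map[string][]TriageEntry)
	for _, e := range entries {
		if _, seen := groups[e.Description]; !seen {
			order = append(order, e.Description)
		}
		groups[e.Description] = append(groups[e.Description], e)
	}
	return order, groups
}

func main() {
	// Illustrative entries only; not taken from a real test run.
	entries := []TriageEntry{
		{TestName: "TestA", Classification: "test_infra", Description: "Docker daemon not running"},
		{TestName: "TestB", Classification: "code_bug", Description: "offset committed before ack"},
		{TestName: "TestC", Classification: "test_infra", Description: "Docker daemon not running"},
	}
	order, groups := groupByDescription(entries)
	for _, d := range order {
		fmt.Printf("%s: %d failure(s)\n", d, len(groups[d]))
	}
}
```

Grouping on the exact description string only works if the triage agent reuses the text verbatim, which is why the prompt asks for identical wording.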
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,6 +9,7 @@ release_notes.md | |
| .codemogger | ||
| .idea | ||
| .integration | ||
| .integration-worktree | ||
| .task | ||
| .vscode | ||
| .op | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| // Copyright 2026 Redpanda Data, Inc. | ||
| // | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, software | ||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| package main | ||
|
|
||
| import ( | ||
| "bytes" | ||
| "errors" | ||
| "flag" | ||
| "fmt" | ||
| "io" | ||
| "log" | ||
| "os" | ||
| "path/filepath" | ||
| "strings" | ||
| "time" | ||
|
|
||
| "github.com/redpanda-data/connect/v4/cmd/tools/integration/llmfix" | ||
| ) | ||
|
|
||
| func cmdFix(args []string) error { | ||
| fset := flag.NewFlagSet("fix", flag.ExitOnError) | ||
| fixTimeout := fset.Duration("fix-timeout", 30*time.Minute, "timeout per fix agent run") | ||
|
|
||
| flags, positional := splitFlagsAndArgs(fset, args) | ||
| if err := fset.Parse(flags); err != nil { | ||
| return err | ||
| } | ||
| positional = append(positional, fset.Args()...) | ||
|
|
||
| if len(positional) != 1 { | ||
| return errors.New("usage: integration fix <output-file.txt>") | ||
| } | ||
|
|
||
| filePath, err := filepath.Abs(positional[0]) | ||
| if err != nil { | ||
| return fmt.Errorf("resolving path: %w", err) | ||
| } | ||
|
|
||
| cached := checkCache(filePath) | ||
| if cached.Package == "" { | ||
| return fmt.Errorf("no package found in %s", filePath) | ||
| } | ||
| if cached.Overall() != ResultFail { | ||
| log.Printf("no failures in %s", filePath) | ||
| return nil | ||
| } | ||
|
|
||
| baseSHA, err := resolveHEAD() | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| outputDir := filepath.Dir(filePath) | ||
| slug := pkgSlug(cached.Package) | ||
| tag := llmfix.NewTag(slug) | ||
|
|
||
| dir, err := llmfix.CreateWorktree(tag, baseSHA) | ||
| if err != nil { | ||
| return fmt.Errorf("creating worktree: %w", err) | ||
| } | ||
| defer llmfix.CleanupWorktree(dir, tag) | ||
|
|
||
| logPath := filepath.Join(outputDir, tag+".log") | ||
| logFile, err := os.Create(logPath) | ||
| if err != nil { | ||
| return fmt.Errorf("creating log file: %w", err) | ||
| } | ||
| defer logFile.Close() | ||
|
|
||
| logger := log.New(io.MultiWriter(logFile, os.Stdout), "", log.LstdFlags) | ||
|
|
||
| op := llmfix.NewOperator(llmfix.FixRequest{ | ||
| Tag: tag, | ||
| PkgPath: cached.Package, | ||
| TestOutput: dumpTestOutput(filePath, cached.Tests), | ||
| OutputDir: outputDir, | ||
| WorktreeDir: dir, | ||
| Timeout: *fixTimeout, | ||
| }, logger) | ||
|
|
||
| if err := op.Run(); err != nil { | ||
| return err | ||
| } | ||
|
|
||
|
|
||
| commits, err := llmfix.CherryPickCommits(dir, baseSHA) | ||
| if err != nil { | ||
| return fmt.Errorf("cherry-pick: %w", err) | ||
| } | ||
| for _, c := range commits { | ||
| logger.Printf("cherry-picked: %s", c) | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| func dumpTestOutput(outFile string, tests []CacheEntry) string { | ||
| var buf strings.Builder | ||
| for _, t := range tests { | ||
| if t.Result != ResultFail { | ||
| continue | ||
| } | ||
| buf.WriteString("---\n") | ||
| fmt.Fprintf(&buf, "Test: %s\n", t.TestName) | ||
| fmt.Fprintf(&buf, "Location: %s:%d\n\n", outFile, t.FailLine) | ||
|
|
||
| if t.FailLine > 0 { | ||
| var output bytes.Buffer | ||
| if err := showTestOutput(&output, outFile, t.FailLine); err != nil { | ||
| fmt.Fprintf(&buf, "(failed to extract output: %v)\n", err) | ||
| } else { | ||
| buf.WriteString(output.String()) | ||
| } | ||
| } | ||
| buf.WriteString("\n") | ||
| } | ||
| return buf.String() | ||
| } | ||
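Reviewer note: `cmdFix` creates its flag set with `flag.ExitOnError`. Under that policy the standard library's `Parse` exits the process on a parse error instead of returning it, so the `return err` branch after `fset.Parse` is not expected to run. If returning the error to the caller is the intent, a variant using `flag.ContinueOnError` might look like the sketch below. `parseFixFlags` is a hypothetical helper, and it skips the `splitFlagsAndArgs` step for brevity.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"time"
)

// parseFixFlags is a hypothetical variant of the flag setup in cmdFix that
// uses flag.ContinueOnError so a bad flag surfaces as a returned error.
func parseFixFlags(args []string) (time.Duration, []string, error) {
	fset := flag.NewFlagSet("fix", flag.ContinueOnError)
	fset.SetOutput(io.Discard) // keep usage text out of the demo output
	timeout := fset.Duration("fix-timeout", 30*time.Minute, "timeout per fix agent run")
	if err := fset.Parse(args); err != nil {
		return 0, nil, fmt.Errorf("parsing flags: %w", err)
	}
	return *timeout, fset.Args(), nil
}

func main() {
	// A valid invocation: flag first, then the positional output file.
	d, rest, err := parseFixFlags([]string{"--fix-timeout", "10m", "out.txt"})
	fmt.Println(d, rest, err)

	// An unknown flag now comes back as an error instead of exiting.
	_, _, err = parseFixFlags([]string{"--no-such-flag"})
	fmt.Println(err != nil)
}
```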
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| // Copyright 2026 Redpanda Data, Inc. | ||
| // | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, software | ||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| package main | ||
|
|
||
| import ( | ||
| "flag" | ||
| "strings" | ||
| ) | ||
|
|
||
| // splitFlagsAndArgs separates flag-like tokens from positional arguments so | ||
| // Go's flag package — which stops at the first non-flag token — can parse | ||
| // interspersed usage like: | ||
| // | ||
| // run --fix amqp1 --debug | ||
| // run --output-dir /tmp kafka | ||
| // | ||
| // It consults fset to tell bool flags (which never consume the next token) | ||
| // from value-taking flags (which do, unless already written as --flag=value). | ||
| func splitFlagsAndArgs(fset *flag.FlagSet, args []string) (flags, positional []string) { | ||
| for i := 0; i < len(args); i++ { | ||
| a := args[i] | ||
| if !strings.HasPrefix(a, "-") { | ||
| positional = append(positional, a) | ||
| continue | ||
| } | ||
| flags = append(flags, a) | ||
| if strings.Contains(a, "=") { | ||
| continue | ||
| } | ||
| name := strings.TrimLeft(a, "-") | ||
| f := fset.Lookup(name) | ||
| if f == nil { | ||
| continue | ||
| } | ||
| if bf, ok := f.Value.(interface{ IsBoolFlag() bool }); ok && bf.IsBoolFlag() { | ||
| continue | ||
| } | ||
| if i+1 < len(args) { | ||
| i++ | ||
| flags = append(flags, args[i]) | ||
| } | ||
| } | ||
| return flags, positional | ||
| } |
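Reviewer note: to see how `splitFlagsAndArgs` treats the two invocations from its doc comment, the function can be run against a small flag set. The sketch below copies the function from the diff so it is self-contained. The flag set in `newRunFlagSet` is an assumption: the doc comment mentions `--fix`, `--debug`, and `--output-dir`, but their types are guessed here (two bools and a string).

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// splitFlagsAndArgs is copied from the diff above so the example runs alone.
func splitFlagsAndArgs(fset *flag.FlagSet, args []string) (flags, positional []string) {
	for i := 0; i < len(args); i++ {
		a := args[i]
		if !strings.HasPrefix(a, "-") {
			positional = append(positional, a)
			continue
		}
		flags = append(flags, a)
		if strings.Contains(a, "=") {
			continue // --flag=value already carries its value
		}
		f := fset.Lookup(strings.TrimLeft(a, "-"))
		if f == nil {
			continue // unknown flag: leave it for Parse to report
		}
		if bf, ok := f.Value.(interface{ IsBoolFlag() bool }); ok && bf.IsBoolFlag() {
			continue // bool flags never consume the next token
		}
		if i+1 < len(args) {
			i++
			flags = append(flags, args[i]) // value-taking flag: keep its value
		}
	}
	return flags, positional
}

// newRunFlagSet builds a hypothetical flag set; the flag types are assumed.
func newRunFlagSet() *flag.FlagSet {
	fset := flag.NewFlagSet("run", flag.ContinueOnError)
	fset.Bool("fix", false, "assumed bool flag")
	fset.Bool("debug", false, "assumed bool flag")
	fset.String("output-dir", "", "assumed string flag")
	return fset
}

func main() {
	for _, args := range [][]string{
		{"--fix", "amqp1", "--debug"},
		{"--output-dir", "/tmp", "kafka"},
	} {
		flags, positional := splitFlagsAndArgs(newRunFlagSet(), args)
		fmt.Printf("flags=%q positional=%q\n", flags, positional)
	}
}
```

With these assumed types, the first invocation yields `--fix` and `--debug` as flags with `amqp1` positional, and the second keeps `/tmp` attached to `--output-dir` with `kafka` positional.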