Experimental workflow to auto-fix tests#5014

Open
malithsen wants to merge 4 commits into develop from try/auto-fix-tests

@malithsen malithsen commented Feb 12, 2026

Changes proposed in this Pull Request:

This is an experiment to use Claude Code Action to automatically analyze test failures on PRs and open fix PRs when the failure is due to an issue in the test itself (e.g. #5005).

How it works:

  • Triggers automatically via workflow_run when PHP tests, JS tests, or E2E tests workflows fail on a PR
  • Step 1: Claude tries to classify the failure as test_drift, test_bug, application_bug, or environment using structured JSON output, then posts an analysis comment on the PR
  • Step 2: If classified as test_drift or test_bug with high/medium confidence, Claude creates a separate fix PR targeting the original PR's branch
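
The gating rule in Step 2 can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the field names `classification` and `confidence` are assumptions, since the description only says the output is structured JSON.

```javascript
// Hypothetical field names ("classification", "confidence"); the PR description
// only lists the category values and the confidence levels that qualify.
const FIXABLE_CATEGORIES = new Set(['test_drift', 'test_bug']);
const ACCEPTED_CONFIDENCE = new Set(['high', 'medium']);

// Returns true only when the failure is attributed to the test itself
// and the confidence is high or medium.
function shouldCreateFixPr(analysis) {
  if (!analysis || typeof analysis !== 'object') {
    return false;
  }
  return (
    FIXABLE_CATEGORIES.has(analysis.classification) &&
    ACCEPTED_CONFIDENCE.has(analysis.confidence)
  );
}

// Example input shaped like the structured output described above.
const example = JSON.parse(
  '{"classification":"test_drift","confidence":"medium","affected_files":["tests/foo.test.js"]}'
);
console.log(shouldCreateFixPr(example));
```

Anything classified as application_bug or environment, or anything with low confidence, would stop after the analysis comment.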

Some guardrails:

  • Skips claude/ branches to avoid potential loops.
  • Skips if a fix PR already exists for the same source PR
  • Post-fix validation to ensure only test files were modified
  • Bash tools restricted to only the required gh and git commands
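
As an illustration of the post-fix validation guardrail, a check along these lines could reject a fix that touches anything other than tests. The path patterns below are assumptions chosen for the example; the workflow's real definition of a "test file" is not shown in this description. The list of changed files would typically come from something like `git diff --name-only`.

```javascript
// Hypothetical patterns for what counts as a test file; adjust to the repo layout.
const TEST_PATH_PATTERNS = [
  /^tests\//,                 // anything under a top-level tests/ directory
  /\.(test|spec)\.[jt]sx?$/,  // JS/TS test files such as foo.test.js
  /Test\.php$/,               // PHPUnit-style test classes
];

function isTestFile(path) {
  return TEST_PATH_PATTERNS.some((pattern) => pattern.test(path));
}

// Returns the changed files that are NOT test files; an empty array means the fix is acceptable.
function findNonTestChanges(changedFiles) {
  return changedFiles.filter((file) => !isTestFile(file));
}

const changed = ['tests/unit/CartTest.php', 'src/Cart.php'];
const offending = findNonTestChanges(changed);
if (offending.length > 0) {
  console.log(`Rejecting fix: non-test files modified: ${offending.join(', ')}`);
}
```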

Testing instructions

Primarily looking for a code review and feedback on the approach. We'd need to merge it to develop to actually test it.

I've tested a slightly modified version of this workflow in a fork of the repo by opening a PR that introduces a failing test.

PR: malithsen#2
Automated fix: malithsen#3

This test doesn't perfectly capture the intended scenario, but it does validate the end-to-end flow: failure detection, analysis, comment posting, and fix PR creation.


  • Covered with tests (or have a good reason not to test in description ☝️)
  • Tested on mobile (or does not apply)

Changelog entry

  • This Pull Request does not require a changelog entry. (Comment required below)
Changelog Entry Comment

Comment

Post merge

@malithsen malithsen requested review from a team, daledupreez and wjrosa and removed request for a team February 12, 2026 22:45
@wjrosa wjrosa left a comment

Awesome idea! Looks good to me 👍

@malithsen malithsen added this to the 10.5.0 milestone Feb 13, 2026
@daledupreez daledupreez left a comment

This is looking good to me, and I think it is totally worth trying!

I have some fairly minor comments and suggestions, but none of them are blocking.

Once we get things working, it may be worth splitting up the code into separate composable actions with clear inputs and outputs, as there is a lot going on across all the steps and common state.

# Step 2: Check for existing fix PR & extract logs
# ──────────────────────────────────────────────
- name: Check for existing fix PR
if: steps.resolve_pr.outputs.found == 'true' && steps.resolve_pr.outputs.is_fork != 'true'

Nit: Why not check that is_fork == 'false'? I think it makes what we are trying to find a bit clearer.

Suggested change
- if: steps.resolve_pr.outputs.found == 'true' && steps.resolve_pr.outputs.is_fork != 'true'
+ if: steps.resolve_pr.outputs.found == 'true' && steps.resolve_pr.outputs.is_fork == 'false'
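
One thing worth keeping in mind with this change: the two conditions are not equivalent when the output was never set (for example, if the step that produces it was skipped), because an unset step output evaluates to an empty value in GitHub Actions. The sketch below models the two checks in plain JavaScript purely for illustration; real workflows evaluate them as Actions expressions, not as JavaScript.

```javascript
// Plain-JavaScript model of the two workflow conditions (illustrative only).
const passesNotTrue = (isFork) => isFork !== 'true';  // models: is_fork != 'true'
const passesIsFalse = (isFork) => isFork === 'false'; // models: is_fork == 'false'

// '' stands in for an output that was never set.
for (const value of ['true', 'false', '']) {
  console.log(JSON.stringify(value), passesNotTrue(value), passesIsFalse(value));
}
```

With `== 'false'`, a missing output causes the step to be skipped; with `!= 'true'`, a missing output lets the step run. Which behavior is preferable depends on whether running with an unknown fork status is considered safe.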

Comment on lines +134 to +135
const failedJobs = jobs.data.jobs.filter(j => j.conclusion === 'failure');
const failedJobNames = failedJobs.map(j => j.name);

Nit: can we use job in the filters?

Suggested change
- const failedJobs = jobs.data.jobs.filter(j => j.conclusion === 'failure');
- const failedJobNames = failedJobs.map(j => j.name);
+ const failedJobs = jobs.data.jobs.filter(job => job.conclusion === 'failure');
+ const failedJobNames = failedJobs.map(job => job.name);

Comment on lines +137 to +139
// Download logs for each failed job (truncated to last 200 lines each, max 3 jobs)
let allLogs = '';
for (const job of failedJobs.slice(0, 3)) {

Out of interest, why only 3 jobs and why 200 lines?

});
const logLines = log.data.split('\n');
const truncated = logLines.slice(-200).join('\n');
allLogs += `\n--- Job: ${job.name} ---\n${truncated}\n`;

Might it be worth adding more explicit separators between jobs?

Suggested change
- allLogs += `\n--- Job: ${job.name} ---\n${truncated}\n`;
+ allLogs += `\n--- Job: ${job.name} ---\n${truncated}\n--- /end job ${job.name} ---\n`;
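
Pulling the two threads above together, one possible shape for the log-collection logic is a small helper with named limits and explicit start/end separators. This is only a sketch: the `{ name, log }` input shape and the helper names are hypothetical, and the limits of 3 jobs and 200 lines are simply the values from the code comment under review.

```javascript
// Limits taken from the comment in the reviewed code; named so they are easy to tune.
const MAX_JOBS = 3;
const TAIL_LINES = 200;

// Keep only the last `count` lines of a log.
function tailLines(text, count) {
  return text.split('\n').slice(-count).join('\n');
}

// `jobLogs` is a hypothetical array of { name, log } objects, one per failed job.
function formatJobLogs(jobLogs, maxJobs = MAX_JOBS, lines = TAIL_LINES) {
  return jobLogs
    .slice(0, maxJobs)
    .map(({ name, log }) =>
      `\n--- Job: ${name} ---\n${tailLines(log, lines)}\n--- /end job ${name} ---\n`
    )
    .join('');
}

console.log(formatJobLogs([{ name: 'PHP tests', log: 'a\nb\nc' }], 3, 2));
```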

Comment on lines +165 to +176
steps.resolve_pr.outputs.is_fork != 'true' &&
steps.check_existing.outputs.exists != 'true'
uses: actions/checkout@v4
with:
ref: ${{ steps.resolve_pr.outputs.pr_branch }}
fetch-depth: 0

- name: Analyze test failure (Phase 1)
if: >
steps.resolve_pr.outputs.found == 'true' &&
steps.resolve_pr.outputs.is_fork != 'true' &&
steps.check_existing.outputs.exists != 'true'

As noted earlier, might it make sense to check for == 'false' rather than != 'true'?

Suggested change
- steps.resolve_pr.outputs.is_fork != 'true' &&
- steps.check_existing.outputs.exists != 'true'
- uses: actions/checkout@v4
- with:
- ref: ${{ steps.resolve_pr.outputs.pr_branch }}
- fetch-depth: 0
- - name: Analyze test failure (Phase 1)
- if: >
- steps.resolve_pr.outputs.found == 'true' &&
- steps.resolve_pr.outputs.is_fork != 'true' &&
- steps.check_existing.outputs.exists != 'true'
+ steps.resolve_pr.outputs.is_fork == 'false' &&
+ steps.check_existing.outputs.exists == 'false'
+ uses: actions/checkout@v4
+ with:
+ ref: ${{ steps.resolve_pr.outputs.pr_branch }}
+ fetch-depth: 0
+ - name: Analyze test failure (Phase 1)
+ if: >
+ steps.resolve_pr.outputs.found == 'true' &&
+ steps.resolve_pr.outputs.is_fork == 'false' &&
+ steps.check_existing.outputs.exists == 'false'

## Run URL: ${{ steps.logs.outputs.run_url }}

## Error Logs:
${{ steps.logs.outputs.logs }}

Might it be worth adding a ``` boundary here to explicitly wrap the log content in a markdown-native way?

Suggested change
- ${{ steps.logs.outputs.logs }}
+ ```
+ ${{ steps.logs.outputs.logs }}
+ ```
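
A caveat with a fixed three-backtick fence is that log output may itself contain backtick runs, which could close the fence early. If the fencing were done in the script step that builds the logs output (an assumption; the suggestion above does it in the prompt template), a hypothetical helper like the one below could pick a fence longer than any backtick run in the content, which is how Markdown lets fenced blocks carry backticks safely.

```javascript
// Hypothetical helper: wrap logs in a Markdown fence that the content cannot close.
function fenceLogs(logs) {
  // Find every run of consecutive backticks in the content.
  const runs = logs.match(/`+/g) || [];
  const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
  // Use at least three backticks, and always one more than the longest run found.
  const fence = '`'.repeat(Math.max(3, longest + 1));
  return `${fence}\n${logs}\n${fence}`;
}

console.log(fenceLogs('Error: expected `foo` to equal `bar`'));
```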


if (analysis.affected_files && analysis.affected_files.length > 0) {
body += `**Affected files:**\n`;
analysis.affected_files.forEach(f => { body += `- \`${f}\`\n`; });

To match the other code that is body-oriented, we could switch this around to use map() and join(). (Not at all blocking.)

Suggested change
- analysis.affected_files.forEach(f => { body += `- \`${f}\`\n`; });
+ body += analysis.affected_files.map(filename => `- \`${filename}\``).join( '\n' );


## Failed Workflow: ${{ steps.logs.outputs.workflow_name }}
## Error Logs:
${{ steps.logs.outputs.logs }}

As above RE wrapping the log output in ```.

Suggested change
- ${{ steps.logs.outputs.logs }}
+ ```
+ ${{ steps.logs.outputs.logs }}
+ ```

Fixes failing tests from [${{ steps.logs.outputs.workflow_name }}](${{ steps.logs.outputs.run_url }}) on PR #${{ steps.resolve_pr.outputs.pr_number }}.

---
*Auto-generated by Claude Code*"

Maybe mention the workflow here?

Suggested change
- *Auto-generated by Claude Code*"
+ *Auto-generated by Claude Code via the claude-fix-tests workflow*"
