Merged
32 commits
7ed0393
docs: add slack-triggered e2e triage design
alishakawaguchi Mar 17, 2026
7acb6cd
ci: add structured metadata to e2e slack alerts
alishakawaguchi Mar 17, 2026
c06183f
ci: add e2e triage runner script
alishakawaguchi Mar 17, 2026
0f8a835
ci: add e2e triage workflow
alishakawaguchi Mar 17, 2026
16bd351
ci: harden e2e triage dispatch validation
alishakawaguchi Mar 17, 2026
9666ead
ci: add slack replies to e2e triage workflow
alishakawaguchi Mar 17, 2026
f8a13d9
ci: fix slack triage start notification env
alishakawaguchi Mar 17, 2026
b919472
ci: dedupe e2e triage slack notifications
alishakawaguchi Mar 17, 2026
04541ff
feat: add slack triage parsing helpers
alishakawaguchi Mar 17, 2026
5f0adc9
feat: add slack app for e2e triage dispatch
alishakawaguchi Mar 17, 2026
2d278f6
docs: add slack-triggered e2e triage runbook
alishakawaguchi Mar 17, 2026
575b6b7
chore: fix slack triage lint findings
alishakawaguchi Mar 17, 2026
21edabe
docs: fix slack triage documentation links
alishakawaguchi Mar 17, 2026
6080e9b
ci: auto-detect sha and failed_agents from run URL in e2e triage
alishakawaguchi Mar 20, 2026
8452c5b
chore: simplify e2e triage workflow and fix lint issues
alishakawaguchi Mar 20, 2026
f8a82d6
ci: add push trigger for testing e2e triage on feature branch
alishakawaguchi Mar 20, 2026
5a20065
Revert "ci: add push trigger for testing e2e triage on feature branch"
alishakawaguchi Mar 20, 2026
8c4586c
fix: checkout workflow branch instead of target SHA in e2e triage
alishakawaguchi Mar 20, 2026
379aa6e
fix: add strict tool permissions for claude in CI triage
alishakawaguchi Mar 20, 2026
9e64c52
fix: pre-download artifacts and restrict claude to read-only tools
alishakawaguchi Mar 20, 2026
ba6611a
fix: strip ANSI escape codes from triage output for clean GitHub summ…
alishakawaguchi Mar 20, 2026
19cbefe
feat: replace dispatch service with Cloudflare Worker for one-click S…
alishakawaguchi Mar 20, 2026
fa9e760
refactor: move Cloudflare Worker to infra repo
alishakawaguchi Mar 20, 2026
22df676
fix: harden jq agent extraction and add concurrency guard to triage w…
alishakawaguchi Mar 20, 2026
e2288ad
fix: stop dumping raw markdown into step log, direct to job summary
alishakawaguchi Mar 20, 2026
63396e2
fix: rename triage artifact from .log to .md for proper rendering
alishakawaguchi Mar 20, 2026
1544f18
feat: add plan generation and fix pipeline to e2e triage CI
alishakawaguchi Mar 21, 2026
eb354bd
fix: create plan output directory before writing plan.md
alishakawaguchi Mar 21, 2026
e093f7e
fix: guard plan extraction against empty execution_file output
alishakawaguchi Mar 21, 2026
3eca00a
fix: restore Slack webhook notification and remove unused docs
alishakawaguchi Mar 21, 2026
8551955
fix: address code review — revert e2e.yml, add rerun toggle, update t…
alishakawaguchi Mar 21, 2026
9ddf6cc
fix: extract shared Slack helper script and fix shell injection
alishakawaguchi Mar 23, 2026
4 changes: 4 additions & 0 deletions .claude/skills/e2e/implement.md
@@ -2,6 +2,10 @@

Apply fixes for E2E test failures, verify with scoped E2E tests.

> **Before implementing any fixes, enter plan mode by invoking /plan.**
> Analyze the findings (Steps 1-2 below), produce a complete fix plan with
> specific file paths and code changes, and get user approval before executing.
>
> **IMPORTANT: Running real E2E tests is a HARD REQUIREMENT of this procedure.**
> Every fix MUST be verified with real E2E tests before the summary step.
> Canary tests use the Vogon fake agent and cannot catch agent-specific issues.
140 changes: 140 additions & 0 deletions .github/workflows/e2e-fix.yml
@@ -0,0 +1,140 @@
name: E2E Fix

on:
  workflow_dispatch:
    inputs:
      triage_run_id:
        description: Run ID of the triage workflow (for downloading plan artifacts)
        required: true
        type: string
      run_url:
        description: Original failed E2E run URL
        required: true
        type: string
      failed_agents:
        description: Comma-separated list of agents to fix
        required: true
        type: string
      slack_channel:
        description: Slack channel ID for thread replies
        required: false
        type: string
      slack_thread_ts:
        description: Slack thread timestamp for replies
        required: false
        type: string

permissions:
  actions: read
  contents: write
  pull-requests: write
  id-token: write

concurrency:
  group: e2e-fix-${{ inputs.run_url || github.run_id }}
  cancel-in-progress: true

jobs:
  fix:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    env:
      SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
      SLACK_CHANNEL: ${{ inputs.slack_channel }}
      SLACK_THREAD_TS: ${{ inputs.slack_thread_ts }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Post fix started
        if: ${{ env.SLACK_BOT_TOKEN != '' && env.SLACK_CHANNEL != '' && env.SLACK_THREAD_TS != '' }}
        shell: bash
        env:
          FAILED_AGENTS: ${{ inputs.failed_agents }}
        run: |
          set -euo pipefail

          scripts/post-slack-message.sh "Starting E2E fix for \`${FAILED_AGENTS}\`."

      - name: Setup mise
        uses: jdx/mise-action@v4

      - name: Download plan artifacts
        env:
          GH_TOKEN: ${{ github.token }}
          TRIAGE_RUN_ID: ${{ inputs.triage_run_id }}
          FAILED_AGENTS: ${{ inputs.failed_agents }}
        shell: bash
        run: |
          set -euo pipefail

          mkdir -p triage-plans

          IFS=',' read -ra agents <<< "$FAILED_AGENTS"
          for agent in "${agents[@]}"; do
            agent="$(echo "$agent" | xargs)" # trim whitespace
            echo "Downloading plan for $agent..."
            gh run download "$TRIAGE_RUN_ID" \
              --name "e2e-plan-${agent}" \
              --dir "triage-plans/${agent}" || {
                echo "warning: no plan artifact found for $agent" >&2
                continue
              }
          done

          echo "Downloaded plans:"
          find triage-plans -name '*.md' -type f

      - name: Apply fixes
        id: fix
        uses: anthropics/claude-code-action@v1
        with:
          prompt: |
            Read the fix plans in the triage-plans/ directory. Each subdirectory contains a plan.md for one agent.

            Execute all fixes exactly as specified in the plans. After applying fixes, run:
            1. mise run fmt
            2. mise run lint
            3. mise run test:e2e:canary

            If verification passes, create a git branch fix/e2e-${{ github.run_id }}, commit all changes,
            push, and create a draft PR with a summary of what was fixed.

            If verification fails, fix the issues and retry. Do not give up without attempting to fix lint/format errors.
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          claude_args: "--allowedTools 'Edit,Write,Read,Glob,Grep,Bash(git:*),Bash(mise:*),Bash(gh:*)'"

      - name: Post success to Slack
        if: success() && env.SLACK_BOT_TOKEN != '' && env.SLACK_CHANNEL != '' && env.SLACK_THREAD_TS != ''
        shell: bash
        env:
          GH_TOKEN: ${{ github.token }}
          FIX_BRANCH: fix/e2e-${{ github.run_id }}
          RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
        run: |
          set -euo pipefail

          # Find the draft PR URL from the fix step output
          pr_url="$(gh pr list --head "$FIX_BRANCH" --json url -q '.[0].url' 2>/dev/null || true)"

          if [ -n "$pr_url" ]; then
            message="E2E fix complete — draft PR ready: <${pr_url}|Review PR>"
          else
            message="E2E fix complete — changes applied but no PR was created. Check the <${RUN_URL}|workflow run> for details."
          fi

          scripts/post-slack-message.sh "$message"

      - name: Post failure to Slack
        if: failure() && env.SLACK_BOT_TOKEN != '' && env.SLACK_CHANNEL != '' && env.SLACK_THREAD_TS != ''
        shell: bash
        env:
          RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
        run: |
          set -euo pipefail

          message="E2E fix failed. Check the <${RUN_URL}|workflow run> for details."

          scripts/post-slack-message.sh "$message"
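
Note on the "Download plan artifacts" step: it splits `failed_agents` on commas and trims each entry by piping it through `xargs`. That works for plain names, but `xargs` also interprets quotes and backslashes, and an empty entry (for example from a doubled or trailing comma) would still reach `gh run download`. The sketch below is one possible alternative, not code from this PR: `split_agents` is a hypothetical helper name, and the agent names in the usage example are placeholders. It trims with bash parameter expansion only and skips empty entries.

```bash
# Hypothetical helper (not part of this PR): split a comma-separated list
# and print one trimmed, non-empty entry per line.
split_agents() {
  local raw="$1" item
  local -a parts
  # With IFS set to a comma only, surrounding whitespace stays in each field.
  IFS=',' read -ra parts <<< "$raw"
  for item in "${parts[@]}"; do
    # Remove leading whitespace, then trailing whitespace.
    item="${item#"${item%%[![:space:]]*}"}"
    item="${item%"${item##*[![:space:]]}"}"
    # Skip entries that are empty after trimming.
    [ -n "$item" ] && printf '%s\n' "$item"
  done
  return 0
}
```

Under these assumptions the step's loop could read the helper's output line by line, for example `split_agents "$FAILED_AGENTS" | while IFS= read -r agent; do ...; done`, leaving the `gh run download` call itself unchanged.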