test: add AI generated RC manual testing plan (#27492)
## **Description**
This PR adds an AI-powered release testing plan generator that:
- Analyzes release PRs – Fetches PR metadata, changed files, and team
sign-offs from GitHub
- Categorizes changes – Identifies high-impact files (app/, patches/,
manifests)
- Uses LLMs – Calls Claude by default, falling back automatically to GPT-5
or Gemini
- Auto-detects feature flags – Excludes disabled features from test
scenarios
- Produces a structured plan – Outputs JSON test plan with scenarios,
steps, and risk levels
- Generates HTML viewer – Styled, readable test plan deployed to GitHub
Pages
The test plan is generated automatically when Bitrise posts "RC Builds
Ready for Testing" on release PRs. Each new build with cherry-picks can
trigger an updated plan.
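The trigger condition described above can be sketched as a small predicate. This is a minimal illustration, not the workflow's actual logic: the function name `shouldGenerateTestPlan` and the simple substring match are assumptions, while the marker text and the `release/*` branch convention come from this PR description.

```typescript
// Hypothetical helper: decides whether a PR comment should trigger plan generation.
// The marker text is quoted from the PR description; the matching rules are assumed.
const RC_READY_MARKER = 'RC Builds Ready for Testing';

function shouldGenerateTestPlan(commentBody: string, headBranch: string): boolean {
  // Only release PRs (branches named release/*) are considered.
  const isReleaseBranch = headBranch.startsWith('release/');
  // The Bitrise comment is recognised by its marker text.
  const isRcReadyComment = commentBody.includes(RC_READY_MARKER);
  return isReleaseBranch && isRcReadyComment;
}
```

Because each new build posts or updates such a comment, a predicate of this shape would also cover the regenerated plans for cherry-pick builds.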
Test Plan Output
Summary includes:
- releaseRiskScore (0–100, formula: min(100, round(10 * sqrt(highRisk *
4 + mediumRisk))))
- totalFilesChanged, highImpactFiles
- highRiskScenarios, mediumRiskScenarios counts
- teamsNeedingSignOff
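The releaseRiskScore formula can be written out directly. The sketch below assumes that `highRisk` and `mediumRisk` in the formula are the highRiskScenarios and mediumRiskScenarios counts; the function name `computeReleaseRiskScore` is hypothetical.

```typescript
// Sketch of the formula: min(100, round(10 * sqrt(highRisk * 4 + mediumRisk))).
// Assumption: the inputs are the high- and medium-risk scenario counts.
function computeReleaseRiskScore(highRisk: number, mediumRisk: number): number {
  // A high-risk scenario is weighted four times as heavily as a medium-risk one.
  const weighted = highRisk * 4 + mediumRisk;
  // The square root dampens growth; the result is capped at 100.
  return Math.min(100, Math.round(10 * Math.sqrt(weighted)));
}
```

For example, two high-risk scenarios and one medium-risk scenario give 10 * sqrt(9) = 30, and the cap is reached once the weighted sum is 100 or more (for instance, 25 high-risk scenarios).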
Executive Summary includes:
- releaseFocus – One-line release description
- keyChanges – 3-5 bullet points
- overallRisk – low/medium/high
- recommendation – Go/no-go guidance
Scenario groups:
- initialScenarios – Risky areas from initial release commits
- cherryPickScenarios – Risky areas from cherry-pick commits
Each scenario includes:
- area – Feature area (e.g., "Card", "Swaps", "Send Flow")
- riskLevel – high/medium
- preconditions – Setup required before testing
- testSteps – 5-8 detailed, automation-ready steps
- expectedOutcomes – What success looks like
- whyThisMatters – References specific code changes
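As a rough picture of the scenario shape listed above, the fields could be typed as follows. The field names come from this description; the type names, the choice of `string[]` versus `string` for each field, and the sample values are assumptions made for illustration only.

```typescript
// Hypothetical types mirroring the fields listed in the PR description.
type RiskLevel = 'high' | 'medium';

interface TestScenario {
  area: string;               // feature area, e.g. "Swaps"
  riskLevel: RiskLevel;
  preconditions: string[];    // assumed to be a list; could also be a single string
  testSteps: string[];        // described as 5-8 automation-ready steps
  expectedOutcomes: string[];
  whyThisMatters: string;     // references the code changes behind the scenario
}

interface ScenarioGroups {
  initialScenarios: TestScenario[];    // from the initial release commits
  cherryPickScenarios: TestScenario[]; // from cherry-pick commits
}

// Counts scenarios of a given risk level across both groups.
function countByRisk(groups: ScenarioGroups, level: RiskLevel): number {
  return [...groups.initialScenarios, ...groups.cherryPickScenarios]
    .filter((scenario) => scenario.riskLevel === level).length;
}

// Illustrative placeholder data, not taken from a real plan.
const exampleGroups: ScenarioGroups = {
  initialScenarios: [
    {
      area: 'Swaps',
      riskLevel: 'high',
      preconditions: ['Wallet is unlocked'],
      testSteps: ['Open Swaps', 'Select a token pair', 'Request a quote'],
      expectedOutcomes: ['A quote is displayed'],
      whyThisMatters: 'Placeholder: would cite the changed files',
    },
  ],
  cherryPickScenarios: [],
};
```

A helper like `countByRisk` is one way the highRiskScenarios and mediumRiskScenarios summary counts could be derived from the two groups.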
CI Workflow:
- .github/workflows/generate-rc-test-plan.yml – Triggered by Bitrise
comment, generates test plan, deploys to GitHub Pages
Test Plan Generation:
- modes/generate-test-plan/fast-analyzer.ts – Single-call LLM test plan
generation with delta/combined modes
- modes/generate-test-plan/handlers.ts – Agentic mode handlers (legacy)
- modes/generate-test-plan/prompt.ts – System prompts for test plan
generation
Utilities:
- utils/feature-flags.ts – Auto-detect disabled feature flags from
remote API
- utils/github-client.ts – GitHub API for PR info, team sign-offs, build
numbers
- utils/git-utils.ts – Cherry-pick detection between commits, commit
validation
Provider:
- Provider priority: Claude → OpenAI → Gemini
- Added usage tracking to Opus streaming responses
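The priority order can be modelled as an ordered list that is filtered by which API keys are configured. This sketch covers only the ordering, not the retry-on-error behaviour, and it is an assumption that availability is decided by the presence of a key; `PROVIDER_PRIORITY` and `availableProviders` are hypothetical names, while the environment variable names are the secrets this PR reuses.

```typescript
// Hypothetical model of the Claude → OpenAI → Gemini priority order.
type ProviderName = 'claude' | 'openai' | 'gemini';

const PROVIDER_PRIORITY: ReadonlyArray<{ name: ProviderName; envKey: string }> = [
  { name: 'claude', envKey: 'E2E_CLAUDE_API_KEY' },
  { name: 'openai', envKey: 'E2E_OPENAI_API_KEY' },
  { name: 'gemini', envKey: 'E2E_GEMINI_API_KEY' },
];

// Returns the providers that have a non-empty key, in priority order.
function availableProviders(env: Record<string, string | undefined>): ProviderName[] {
  return PROVIDER_PRIORITY
    .filter(({ envKey }) => Boolean(env[envKey]))
    .map(({ name }) => name);
}
```

A caller would try the first entry and move to the next one on failure; with only the OpenAI and Gemini keys set, the list would be `['openai', 'gemini']`.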
CI Changes
- New workflow triggers on issue_comment for release PRs
- Generates test-plan-{version}.json and test-plan-{version}.html
- Deploys to GitHub Pages:
metamask.github.io/metamask-mobile/test-plans/
- Updates Bitrise comment with test plan links
- Uses existing secrets: E2E_CLAUDE_API_KEY, E2E_OPENAI_API_KEY,
E2E_GEMINI_API_KEY
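The artifact naming and the published links can be sketched as below. The file name pattern and the Pages path come from the list above; the `https://` scheme, the assumption that files sit directly under `test-plans/`, and the helper name `testPlanArtifacts` are illustrative guesses.

```typescript
// Assumed base URL: the Pages path from the description with an https scheme.
const PAGES_BASE_URL = 'https://metamask.github.io/metamask-mobile/test-plans/';

// Hypothetical helper that derives artifact file names and links for a version.
function testPlanArtifacts(version: string) {
  const jsonFile = `test-plan-${version}.json`;
  const htmlFile = `test-plan-${version}.html`;
  return {
    jsonFile,
    htmlFile,
    // Assumption: files are published directly under test-plans/.
    jsonUrl: PAGES_BASE_URL + jsonFile,
    htmlUrl: PAGES_BASE_URL + htmlFile,
  };
}
```

Links of this form are what the workflow would append to the Bitrise comment.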
## **Changelog**
CHANGELOG entry: null
## **Related issues**
Fixes:
https://consensyssoftware.atlassian.net/browse/INFRA-3426?actionerId=6126045c1827d1006848bec4&sourceType=assign&atlOrigin=eyJpIjoiNzQxOTQ5ZDQ4NjExNGI1ZjgzYWFjYTZhYzhhN2JmMzYiLCJwIjoiaiJ9
## **Manual testing steps**
    # Export API key
    export E2E_CLAUDE_API_KEY=sk-...
    # Run locally against a release PR
    node -r esbuild-register tests/tools/e2e-ai-analyzer \
      --mode generate-test-plan \
      --pr 25900 \
      --auto-ff
    # Check output
    cat release-test-plan.json
Verify:
- release-test-plan.json includes scenarios with riskLevel, testSteps,
whyThisMatters
- Executive summary has releaseFocus and recommendation
- Disabled feature flags are listed in excludedFeatures
## **Screenshots/Recordings**
https://github.com/user-attachments/assets/cbcdcf77-79be-469c-8056-9e1d85be2c36
## **Pre-merge author checklist**
- [ ] I've followed [MetaMask Contributor
Docs](https://github.com/MetaMask/contributor-docs) and [MetaMask Mobile
Coding
Standards](https://github.com/MetaMask/metamask-mobile/blob/main/.github/guidelines/CODING_GUIDELINES.md).
- [ ] I've completed the PR template to the best of my ability
- [ ] I've included tests if applicable
- [ ] I've documented my code using [JSDoc](https://jsdoc.app/) format
if applicable
- [ ] I've applied the right labels on the PR (see [labeling
guidelines](https://github.com/MetaMask/metamask-mobile/blob/main/.github/guidelines/LABELING_GUIDELINES.md)).
Not required for external contributors.
## **Pre-merge reviewer checklist**
- [ ] I've manually tested the PR (e.g. pull and build branch, run the
app, test code being changed).
- [ ] I confirm that this PR addresses all acceptance criteria described
in the ticket it closes and includes the necessary testing evidence such
as recordings and/or screenshots.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adds a new GitHub Actions workflow that runs on PR comments, writes to
`gh-pages`, and posts back to PR comments; plus expands the
`e2e-ai-analyzer` to call external APIs/LLMs and `gh`/`git` commands,
which increases CI and release-pipeline surface area despite some input
sanitization.
>
> **Overview**
> Automates RC manual testing documentation by triggering a new workflow
(`generate-rc-test-plan.yml`) when Bitrise posts the "RC Builds Ready
for Testing" PR comment on `release/*` branches, running the
`tests/tools/e2e-ai-analyzer` to generate `release-test-plan.json`,
rendering an HTML viewer, publishing both to `gh-pages`, and appending
links back onto the originating comment.
>
> Extends `tests/tools/e2e-ai-analyzer` with a new `generate-test-plan`
mode (including new result types and finalize tool), a fast single-call
LLM path that pulls PR metadata/files/sign-offs via `gh`, optionally
computes cherry-pick deltas between commits/builds, and auto-excludes
disabled remote feature flags via a remote-config API call. It also
updates provider behavior (Claude-first failover and Anthropic Opus
streaming) and ignores generated release artifacts via `.gitignore`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
36764f6. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->