Conversation
Rsdoctor Bundle Diff Analysis
📊 Quick Summary

Found 2 projects in monorepo, 0 projects with changes.

Generated by Rsdoctor GitHub Action
Pull request overview
Adds AI-powered degradation analysis to the Rsdoctor GitHub Action by generating bundle-diff JSON and sending it to an LLM (Anthropic/OpenAI), then appending the results to the PR comment.
Changes:
- Added an `ai_model` action input and plumbed AI token/model through the processing pipeline.
- Implemented `src/ai-analysis.ts` to build prompts, detect the provider, and call Anthropic/OpenAI APIs.
- Enhanced PR comment output with an expandable "AI Degradation Analysis" section per project.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/index.ts | Runs JSON bundle-diff generation, calls AI analysis, and appends AI results into PR comments. |
| src/ai-analysis.ts | New module to detect provider, build prompts, call LLM APIs, and return markdown analysis. |
| action.yml | Adds the new ai_model input for selecting the model. |
| dist/index.js | Compiled output updated to include AI analysis behavior. |
```ts
const shellCmd = `npx @rsdoctor/cli bundle-diff --json --baseline="${baselineJsonPath}" --current="${fullPath}"`;
console.log(`🛠️ Running rsdoctor --json via npx: ${shellCmd}`);
await execFileAsync('sh', ['-c', shellCmd], { cwd: tempOutDir });
```
Using `sh -c` with an interpolated command string built from file paths can enable shell injection if `baselineJsonPath` or `fullPath` contains shell metacharacters (these paths can be PR-controlled in a GitHub Action). Prefer calling `execFileAsync` without a shell by passing `npx` (or `node`) and arguments as an array, which avoids quoting/escaping issues and is safer.
Suggested change:

```ts
const npxArgs = [
  '@rsdoctor/cli',
  'bundle-diff',
  '--json',
  '--baseline',
  baselineJsonPath,
  '--current',
  fullPath,
];
console.log(`🛠️ Running rsdoctor --json via npx with args: ${JSON.stringify(npxArgs)}`);
await execFileAsync('npx', npxArgs, { cwd: tempOutDir });
```
```ts
function buildPrompt(diffData: unknown): string {
  // Truncate large diff data to avoid token limits (~50k chars)
  const MAX_CHARS = 50000;
  let diffStr = JSON.stringify(diffData, null, 2);
  if (diffStr.length > MAX_CHARS) {
    diffStr = diffStr.substring(0, MAX_CHARS) + '\n... (truncated due to size)';
  }
```
The truncation happens after `JSON.stringify`, which still requires serializing the entire diff into a string first. For very large diff JSONs this can significantly increase CPU/memory use and may lead to OOM in CI. Consider truncating earlier (e.g., read the file as text and cap by bytes/chars before parsing/pretty-printing, or extract/summarize only the relevant top-level keys before stringifying).
Suggested change:

```ts
function summarizeDiffData(diffData: unknown, maxChars: number): string {
  // If it's already a string, truncate directly without additional serialization
  if (typeof diffData === 'string') {
    return diffData.length > maxChars
      ? diffData.substring(0, maxChars) + '\n... (truncated due to size)'
      : diffData;
  }
  // For arrays, only serialize a subset of items
  if (Array.isArray(diffData)) {
    const MAX_ITEMS = 50;
    const sliced = diffData.slice(0, MAX_ITEMS);
    let result = JSON.stringify(sliced, null, 2);
    if (diffData.length > MAX_ITEMS) {
      result += `\n... (${diffData.length - MAX_ITEMS} more items truncated)`;
    }
    if (result.length > maxChars) {
      result = result.substring(0, maxChars) + '\n... (truncated due to size)';
    }
    return result;
  }
  // For plain objects, only include a subset of top-level keys
  if (diffData && typeof diffData === 'object') {
    const obj = diffData as Record<string, unknown>;
    const keys = Object.keys(obj);
    const MAX_KEYS = 50;
    const limited: Record<string, unknown> = {};
    for (const key of keys.slice(0, MAX_KEYS)) {
      limited[key] = obj[key];
    }
    let result = JSON.stringify(limited, null, 2);
    if (keys.length > MAX_KEYS) {
      result += `\n... (${keys.length - MAX_KEYS} more keys truncated)`;
    }
    if (result.length > maxChars) {
      result = result.substring(0, maxChars) + '\n... (truncated due to size)';
    }
    return result;
  }
  // Fallback: coerce to string and truncate if needed
  const coerced = String(diffData);
  return coerced.length > maxChars
    ? coerced.substring(0, maxChars) + '\n... (truncated due to size)'
    : coerced;
}

function buildPrompt(diffData: unknown): string {
  // Truncate large diff data to avoid token limits (~50k chars)
  const MAX_CHARS = 50000;
  const diffStr = summarizeDiffData(diffData, MAX_CHARS);
```
```ts
function detectProvider(model: string): 'anthropic' | 'openai' {
  return model.toLowerCase().startsWith('claude') ? 'anthropic' : 'openai';
```
Provider detection is too narrow: model identifiers like `anthropic/claude-...` or other Anthropic-prefixed forms won't match `startsWith('claude')` and will be incorrectly routed to OpenAI. Recommend expanding detection (e.g., also accept `anthropic/` prefixes and `claude` appearing after a namespace) and/or validating the model string and returning a clear error for unknown formats.
Suggested change:

```ts
const lower = model.toLowerCase();
// Treat bare "claude-..." and namespaced "anthropic/claude-..." (or similar) as Anthropic
if (lower.startsWith('claude') || lower.includes('/claude')) {
  return 'anthropic';
}
// Optionally recognize common OpenAI patterns explicitly; default remains OpenAI
return 'openai';
```
```ts
try {
  const diffData: unknown = JSON.parse(fs.readFileSync(diffJsonPath, 'utf8'));
```
`analyzeWithAI` is async but uses `readFileSync`, which blocks the Node.js event loop. Switching to `await fs.promises.readFile(...)` keeps the action responsive and avoids blocking I/O (especially if multiple projects are processed).
Suggested change:

```ts
const diffData: unknown = JSON.parse(await fs.promises.readFile(diffJsonPath, 'utf8'));
```
```ts
commentBody += report.aiAnalysis.analysis + '\n\n';
commentBody += `<sub>Analysis by ${report.aiAnalysis.model}</sub>\n\n`;
```
AI output is injected directly into the PR comment body as GitHub-flavored Markdown. This can unintentionally trigger @mentions, issue/PR links, or other noisy formatting. Consider neutralizing mentions (e.g., replacing `@` with `@` followed by `\u200b`), or constraining the rendering (e.g., wrapping the AI response in a blockquote or details section that discourages mention expansion) before posting.
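As a sketch of what such sanitization could look like (the helper names `neutralizeMentions` and `asQuotedAnalysis` are illustrative, not part of this PR):

```ts
// Sketch: neutralize @mentions in AI-generated markdown before posting.
// A zero-width space (U+200B) after "@" stops GitHub from expanding the
// mention while remaining visually identical for readers.
function neutralizeMentions(text: string): string {
  return text.replace(/@(\w)/g, '@\u200b$1');
}

// Optionally wrap the AI response in a blockquote to mark it as untrusted
// generated content and further discourage mention/link expansion.
function asQuotedAnalysis(analysis: string): string {
  return neutralizeMentions(analysis)
    .split('\n')
    .map((line) => `> ${line}`)
    .join('\n');
}
```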
```yaml
ai_model:
  description: 'AI model to use for degradation analysis (e.g. claude-3-5-haiku-latest, gpt-4o-mini). Provider is auto-detected from the model name prefix.'
  required: false
  default: 'claude-3-5-haiku-latest'
```
Other inputs in this file specify `type: string`, but `ai_model` does not. For consistency (and clearer metadata for tooling), add `type: string` to the new `ai_model` input.
Suggested change:

```yaml
default: 'claude-3-5-haiku-latest'
type: string
```
```ts
const aiToken = process.env.AI_TOKEN || '';
const aiModel = getInput('ai_model') || 'claude-3-5-haiku-latest';
```
The action introduces `ai_model` as an input, but the token is only configurable via `process.env.AI_TOKEN` (not an action input) and isn't documented in `action.yml` in this diff. Consider adding an `ai_token` input (and calling `core.setSecret` on it) or explicitly documenting the required env var in `action.yml` so users can discover/configure it reliably.
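A sketch of what the missing input could look like in `action.yml` (the input name and wording are suggestions, not part of this diff):

```yaml
ai_token:
  description: 'API token for the AI provider (Anthropic or OpenAI). The action should call core.setSecret on this value so it is masked in logs.'
  required: false
  type: string
```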
That's a very interesting feature 👀 Do you plan to also make it ❌ the PR? E.g., if some condition or rule isn't satisfied (such as a bundle size regression), could it make the GitHub PR status fail?
@gre Not yet — for now I'm concerned the analysis isn't stable enough, and misjudgments could add noise to PRs. We may add threshold/gating capability later.
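As a rough sketch of how such gating could eventually work (names like `exceedsThreshold` and the threshold input are hypothetical; `core.setFailed` from `@actions/core` is the standard way to fail an action run):

```ts
// Sketch: fail the PR check when bundle size regresses beyond a threshold.
// `maxIncreasePct` would come from a new action input; names are illustrative.
function exceedsThreshold(
  baselineBytes: number,
  currentBytes: number,
  maxIncreasePct: number,
): boolean {
  if (baselineBytes <= 0) return false; // no baseline to compare against
  const increasePct = ((currentBytes - baselineBytes) / baselineBytes) * 100;
  return increasePct > maxIncreasePct;
}

// In the action this check would drive the run status, e.g.:
//   if (exceedsThreshold(base, cur, maxPct)) core.setFailed('Bundle size regression');
```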
This pull request introduces an AI-powered degradation analysis feature to the Rsdoctor GitHub Action, enabling automated bundle size regression reports using Anthropic or OpenAI models. The main changes add a new input for selecting the AI model, implement logic to generate and analyze bundle diffs with AI, and update the PR comment to include the AI's findings.
AI analysis integration:
- Added an `ai_model` input to `action.yml` to allow users to specify which AI model to use for degradation analysis; the provider is auto-detected from the model name.
- Implemented `src/ai-analysis.ts`, which detects the provider, builds prompts, and calls Anthropic or OpenAI APIs to analyze bundle diff JSONs and return concise markdown reports.
- Updated `src/index.ts` to:
  - Extend the `ProjectReport` type to include AI analysis results. [1] [2]
  - Plumb `aiToken` and `aiModel` through the processing pipeline and generate bundle diff JSONs for AI analysis per project, handling both direct and npx-based CLI invocation. [1] [2] [3] [4]

PR comment/report enhancements: