
fix(eval): prevent RangeError on large eval table payloads#7716

Open
mldangelo wants to merge 8 commits into main from fix/eval-table-trim-payload

Conversation

@mldangelo
Member

@mldangelo mldangelo commented Feb 16, 2026

Summary

  • Proactively strip redundant/large fields from eval table cells to prevent RangeError when JSON.stringify exceeds V8's ~512MB limit (e.g. base64 images duplicated across every cell)
  • Add trimTableCellForApi() utility that strips the rendered prompt, ...result spread fields (evalId, promptIdx, testIdx, etc.), and trims response to only cached/tokenUsage/prompt
  • Add GET /api/eval/:evalId/results/:resultId/detail endpoint so the frontend can fetch full cell data on demand
  • Frontend lazy-loads prompt content when user opens the detail dialog instead of receiving it upfront
  • Strip config.tests from table response (unused by frontend, potentially large)
  • Avoid loading all results in database.ts just to check config (toResultsFile(){ config })
  • Catch RangeError in export command with helpful message suggesting -o flag

Supersedes #7599 — addresses the root cause (payload bloat) rather than only catching the error.

Test plan

  • Unit tests for trimTableCellForApi (strips prompt, spread fields, trims response, preserves essentials, handles edge cases)
  • Server integration tests for trimmed table response and stripped config.tests
  • Server tests for detail endpoint (success, 404 for missing result, 404 for wrong eval)
  • Verified JSON/CSV export endpoints still return full unstripped data
  • Browser QA: table renders correctly, detail dialog lazy-loads prompt, pass/fail/scores display
  • Test with large eval containing base64 images (the original crashing scenario)

🤖 Generated with Claude Code

mldangelo and others added 4 commits February 15, 2026 10:23
Strip large/redundant fields from table cell payloads to prevent
RangeError crashes from JSON.stringify on evals with base64 images.

- Add trimTableCellForApi() that strips: rendered prompt (set to ''),
  response (keep only cached/tokenUsage/prompt), testCase (keep only
  provider), and unnecessary spread fields from EvalResult
- Apply trimming in GET /:id/table before building the response
- Strip config.tests from table response (unused by frontend)
- Add GET /:evalId/results/:resultId/detail endpoint for on-demand
  fetching of full prompt, response, and testCase data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
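
As a rough illustration of the trimming this commit message describes (not the code from the PR), the sketch below works on plain objects. The name `trimTableCellSketch`, the `pick` helper, and the exact list of spread fields are assumptions; later commits in this PR adjust which fields are kept.

```typescript
type Json = Record<string, unknown>;

// Spread fields named in the PR summary; the list used in the PR may differ.
const SPREAD_FIELDS = ['evalId', 'promptIdx', 'testIdx'];

// Hypothetical helper: copy only the listed keys that are present on the source.
function pick(source: Json | undefined, keys: string[]): Json | undefined {
  if (!source) {
    return undefined;
  }
  const out: Json = {};
  for (const key of keys) {
    if (key in source) {
      out[key] = source[key];
    }
  }
  return out;
}

// Returns a trimmed copy of the cell and leaves the input untouched.
function trimTableCellSketch(cell: Json): Json {
  const trimmed: Json = { ...cell };
  for (const field of SPREAD_FIELDS) {
    delete trimmed[field];
  }
  trimmed.prompt = ''; // rendered prompt is fetched on demand instead
  trimmed.response = pick(cell.response as Json | undefined, ['cached', 'tokenUsage', 'prompt']);
  trimmed.testCase = pick(cell.testCase as Json | undefined, ['provider']);
  return trimmed;
}
```

Returning a copy rather than mutating the cell keeps the full data available to callers that still need it, such as the export paths.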
The table endpoint now strips per-cell prompt content to reduce payload
size. Update the frontend to fetch full prompt data on demand when the
user opens the prompt dialog.

- Add fetchCellDetail() API function
- Add lazy-loading state in EvalOutputCell with on-demand fetch
- Always show prompt dialog button (prompt data loads on click)
- Pass testVars prop from ResultsTable for variable display
- Use output.text as fallback for image alt text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
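
A minimal sketch of what an on-demand fetch helper along these lines might look like, assuming it returns `null` on any failure. The `FetchLike` type, the `CellDetail` shape, and the injected `fetchImpl` parameter are illustrative assumptions so the sketch can run without a network; they are not taken from the PR.

```typescript
// Assumed response shape, based on the fields the detail endpoint is described as returning.
interface CellDetail {
  prompt: string;
  response?: unknown;
  testCase?: unknown;
}

// Minimal fetch-like signature (hypothetical) so a stub can be passed in.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

async function fetchCellDetailSketch(
  fetchImpl: FetchLike,
  evalId: string,
  resultId: string,
): Promise<CellDetail | null> {
  const url = `/api/eval/${encodeURIComponent(evalId)}/results/${encodeURIComponent(resultId)}/detail`;
  try {
    const res = await fetchImpl(url);
    if (!res.ok) {
      return null; // non-ok responses (such as 404) are treated as "no detail"
    }
    return (await res.json()) as CellDetail;
  } catch {
    return null; // network failures are swallowed so the caller only checks for null
  }
}
```

In a browser, the first argument could be `(url) => fetch(url)`.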
getPromptsWithPredicate and getTestCasesWithPredicate called
eval_.toResultsFile() which loads all results into memory, but only
config was ever accessed. Use { config: eval_.config } directly.

Also catch RangeError in CLI export when outputting large evals to
console, showing the exact command to export to a file instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
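
The RangeError handling described above could be sketched as follows. The function name and the wording of the hint are assumptions; only the idea of catching `RangeError` from `JSON.stringify` and pointing the user at the `-o` flag comes from the commit message.

```typescript
type StringifyResult = { ok: true; json: string } | { ok: false; hint: string };

// Serialize for console output; on RangeError return a hint instead of crashing.
function stringifyForConsole(value: unknown): StringifyResult {
  try {
    return { ok: true, json: JSON.stringify(value, null, 2) };
  } catch (error) {
    if (error instanceof RangeError) {
      return {
        ok: false,
        hint: 'Output is too large to print to the console. Write it to a file with the -o flag instead.',
      };
    }
    throw error; // other failures (for example circular references) are not masked
  }
}
```

Rethrowing anything that is not a `RangeError` keeps unrelated serialization bugs visible.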
- Test that table cells have prompt stripped, spread fields removed,
  and response trimmed to essential fields only
- Test that config.tests is stripped from table response
- Test detail endpoint returns full prompt/response/testCase
- Test detail endpoint returns 404 for missing or wrong-eval results
- Unit tests for trimTableCellForApi covering all field handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@promptfoo-scanner promptfoo-scanner bot left a comment


👍 All Clear

I reviewed this PR for LLM security vulnerabilities focusing on the six critical vulnerability classes (Prompt Injection, Data Exfiltration, PII/Secrets in Prompts, Insecure Output Handling, Excessive Agency, and Jailbreak Risks). This PR is a performance optimization that reduces API payload sizes and adds lazy-loading for evaluation results. No LLM security vulnerabilities were identified.

Minimum severity threshold: 🟡 Medium | To re-scan after changes, comment @promptfoo-scanner

@mldangelo mldangelo changed the title from "perf(eval): trim redundant data from table endpoint" to "fix(eval): prevent RangeError on large eval table payloads" Feb 16, 2026
mldangelo and others added 4 commits February 16, 2026 12:11
1. Remove eval ownership check from detail endpoint — in comparison
   mode the frontend passes the base eval ID but comparison results
   belong to a different eval, causing 404. Result IDs are unique UUIDs.

2. Reset cellDetail state when output changes to prevent stale prompt
   content when React reuses a component instance for a different row.

3. Auto-fetch prompt when "Show Prompts" is toggled on and prompt was
   stripped, so inline prompts appear without clicking each cell.

4. Replace `{} as any` with `{} as AtomicTestCase` for proper typing
   (AtomicTestCase has all optional fields so `{}` is valid).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Restore eval ownership check in detail endpoint. Pass evalId through
   trimmed cells (from ...result spread) so the frontend uses the correct
   evalId for comparison-mode cells instead of the base eval ID.

2. Strip response.prompt from trimmed cells — for multimodal providers
   this can contain base64 images duplicated across every cell. The
   providerPrompt in the dialog now reads from cellDetail?.response.

3. Add stale-request cancellation to the showPrompts auto-fetch effect
   using the cleanup pattern (let cancelled = false; return () => ...).
   Prevents older responses from winning races during rapid updates.

4. Update tests: restore wrong-eval 404, verify evalId is preserved,
   verify response.prompt is stripped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
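
To illustrate why the ownership check interacts with comparison mode, here is a small sketch using an in-memory lookup. `StoredResult`, `findOwnedResult`, and the sample IDs are hypothetical; the PR itself looks results up through `EvalResult.findById`.

```typescript
interface StoredResult {
  id: string;
  evalId: string;
  prompt: string;
}

// Returns the result only when it belongs to the requested eval; callers map null to a 404.
function findOwnedResult(
  results: Record<string, StoredResult>,
  evalId: string,
  resultId: string,
): StoredResult | null {
  const result = results[resultId];
  if (!result || result.evalId !== evalId) {
    return null;
  }
  return result;
}

// Comparison mode: the page shows base eval "eval-a" next to a result from "eval-b".
const sampleResults: Record<string, StoredResult> = {
  'r-base': { id: 'r-base', evalId: 'eval-a', prompt: 'base prompt' },
  'r-cmp': { id: 'r-cmp', evalId: 'eval-b', prompt: 'comparison prompt' },
};

const withBaseId = findOwnedResult(sampleResults, 'eval-a', 'r-cmp'); // null: wrong eval
const withCellId = findOwnedResult(sampleResults, 'eval-b', 'r-cmp'); // the stored result
```

This is why the trimmed cell needs to carry its own evalId: with only the page-level ID, every comparison cell would fail the check.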
TypeScript's strict mode rejects direct cast from a typed interface to
Record<string, unknown>. Cast through `unknown` first.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The providerPrompt now reads from cellDetail?.response but the state
type was missing the response field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@use-tusk
Contributor

use-tusk bot commented Feb 16, 2026

✅ Generated 14 tests - 14 passed (e4c52cb) View tests ↗

Test Summary

  • EvalOutputCell (cellDetail state reset) - 1 ✅
  • EvalOutputCell (cellEvalId, detailEvalId logic) - 2 ✅
  • EvalOutputCell (dialog prompt content) - 1 ✅
  • EvalOutputCell (fetchCellDetail null handling) - 1 ✅
  • EvalOutputCell (lazy-loading cell details) - 1 ✅
  • EvalOutputCell (loading state) - 1 ✅
  • EvalOutputCell (prompt presence check) - 1 ✅
  • EvalOutputCell (stale request cancellation) - 1 ✅
  • fetchCellDetail - 5 ✅

Results

Tusk's tests are all passing and validate the core changes in this PR: lazy-loading of cell details via the new GET /api/eval/:evalId/results/:resultId/detail endpoint and frontend integration. The EvalOutputCell tests confirm that the component correctly fetches full cell details on demand, handles the cellEvalId vs page-level evaluationId logic (critical for comparison mode), and gracefully manages edge cases like missing data or rapid output changes. The fetchCellDetail API tests verify the new endpoint integration, including error handling for network failures and non-ok responses. Together, these tests support the fix for RangeError on large payloads, which strips redundant fields upfront and defers full data retrieval until the user opens the detail dialog, addressing the root cause rather than just catching the error.

📈 Coverage gains

Line coverage - avg 11% gain for 2 files

| Source file | Original | After Tusk | Gain |
| --- | --- | --- | --- |
| src/app/src/pages/eval/components/EvalOutputCell.tsx | 79.84% | 82.17% | +2.33% |
| src/app/src/utils/api.ts | 16.12% | 35.48% | +19.36% |

Branch coverage - avg 9% gain for 2 files

| Source file | Original | After Tusk | Gain |
| --- | --- | --- | --- |
| src/app/src/pages/eval/components/EvalOutputCell.tsx | 73.33% | 75.07% | +1.74% |
| src/app/src/utils/api.ts | 38.46% | 53.84% | +15.38% |

Coverage is calculated by running tests directly associated with each source file.


View check history

| Commit | Status | Output | Created (UTC) |
| --- | --- | --- | --- |
| 1b405ec | ⏩ No tests generated | Output | Feb 16, 2026 4:35PM |
| 041f902 | ⏩ No tests generated | Output | Feb 16, 2026 5:11PM |
| 1792078 | ⏩ Skipped due to new commit on branch | Output | Feb 16, 2026 5:26PM |
| 4d72f72 | ⏩ No tests generated | Output | Feb 16, 2026 5:39PM |
| e4c52cb | ✅ Generated 14 tests - 14 passed | Tests | Feb 16, 2026 5:55PM |


@mldangelo mldangelo marked this pull request as ready for review February 16, 2026 19:02
@github-actions
Contributor

Security Review ✅

No critical issues found. The changes properly validate inputs via Zod schemas, use parameterized Drizzle ORM queries (no SQL injection risk), and enforce eval ownership on the new detail endpoint to prevent IDOR.

🟡 Minor Observations (3 items)
  • src/server/routes/eval.ts:363-378 - The new /:evalId/results/:resultId/detail endpoint lacks a try/catch around EvalSchemas.ResultDetail.Params.parse(). If Zod validation fails (though unlikely given Express route params), it will throw an unhandled error resulting in a 500 instead of a 400. Other endpoints in this file (e.g., /:id/metadata-keys) catch ZodError and return structured 400 responses.
  • src/app/src/pages/eval/components/EvalOutputCell.tsx:186 - The fetchCellDetail promise in the auto-fetch useEffect has no .catch() handler. While fetchCellDetail itself catches errors internally, if it were ever refactored to throw, the effect would produce an unhandled promise rejection. React error boundaries do not catch errors from event handlers or asynchronous code, so neither the handlePromptOpen path (which uses await) nor the effect path would be covered by one; both would need explicit error handling.
  • src/app/src/pages/eval/components/EvalOutputCell.tsx:70 - The cellEvalId extraction uses (output as unknown as Record<string, unknown>).evalId — a double type assertion to access a runtime property not in the EvaluateTableOutput type. This works but is fragile; consider extending the type or using a typed helper if this pattern spreads further.
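
One possible shape for the typed helper mentioned in the last observation is a type guard, sketched below. `TableOutputLike` is a hypothetical stand-in for `EvaluateTableOutput`, and `hasEvalId` and `resolveDetailEvalId` are illustrative names that do not appear in the PR.

```typescript
// Hypothetical stand-in for EvaluateTableOutput, which does not declare evalId.
interface TableOutputLike {
  id?: string;
  text: string;
}

// Type guard: true when the object carries a string evalId at runtime.
function hasEvalId(output: object): output is { evalId: string } {
  const candidate = (output as { evalId?: unknown }).evalId;
  return typeof candidate === 'string';
}

// Prefer the cell's own evalId (comparison mode) and fall back to the page-level ID.
function resolveDetailEvalId(output: TableOutputLike, pageEvalId: string | null): string | null {
  return hasEvalId(output) ? output.evalId : pageEvalId;
}
```

The guard confines the single unavoidable cast to one place, so call sites read `output.evalId` with normal type checking.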

Last updated: 2026-02-16 | Reviewing: e4c52cb

Contributor

@promptfoo-scanner promptfoo-scanner bot left a comment


👍 All Clear

This PR introduces lazy loading for evaluation result data to optimize payload sizes. The security review found no LLM-specific vulnerabilities - the changes are purely infrastructure-level optimizations for data retrieval and display that don't affect LLM interactions.

Minimum severity threshold: 🟡 Medium | To re-scan after changes, comment @promptfoo-scanner

@coderabbitai
Contributor

coderabbitai bot commented Feb 16, 2026

📝 Walkthrough


The PR introduces lazy-loaded cell detail retrieval to reduce API payload sizes. A new GET endpoint returns full prompt, response, and testCase details for specific result cells on demand. The table API response is trimmed via a new trimTableCellForApi utility, which strips prompts and large fields while preserving essential metadata. The EvalOutputCell component now accepts per-cell test variables via a new testVars prop and fetches full details asynchronously when prompts are displayed. Console export errors resulting from oversized JSON payloads are caught and a guidance message is provided. Comprehensive tests validate the trimming behavior and new detail endpoint functionality.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

  • RangeError crash on GET /eval/:id/table for large evals #7649: The changes directly address the RangeError crash when exporting large evaluation results by introducing trimTableCellForApi to strip oversized prompt/response content from API responses, and adding error handling in export.ts to gracefully handle console output failures with guidance to export to file.
🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 18.18%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Merge Conflict Detection | ⚠️ Warning | ❌ Merge conflicts detected (18 files). These conflicts must be resolved before merging into main. | Resolve conflicts locally and push changes to this branch. |

Conflicting files:

⚔️ biome.jsonc (content)
⚔️ code-scan-action/package-lock.json (content)
⚔️ code-scan-action/package.json (content)
⚔️ package-lock.json (content)
⚔️ package.json (content)
⚔️ renovate.json (content)
⚔️ site/package.json (content)
⚔️ src/app/package.json (content)
⚔️ src/app/src/pages/eval/components/EvalOutputCell.tsx (content)
⚔️ src/app/src/pages/eval/components/ResultsTable.tsx (content)
⚔️ src/app/src/utils/api.ts (content)
⚔️ src/commands/export.ts (content)
⚔️ src/server/routes/eval.ts (content)
⚔️ src/types/api/eval.ts (content)
⚔️ src/util/database.ts (content)
⚔️ src/util/exportToFile/index.ts (content)
⚔️ test/server/eval.test.ts (content)
⚔️ test/util/exportToFile/index.test.ts (content)
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: preventing RangeError on large eval table payloads through field trimming optimization. |
| Description check | ✅ Passed | The description is well-structured and directly related to the changeset, covering the motivation, implementation approach, testing, and supersession of a prior issue. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/app/src/pages/eval/components/EvalOutputCell.tsx`:
- Around line 186-211: The useEffect that auto-fetches prompt details
(React.useEffect) can leave loadingDetail stuck because the cleanup only sets
cancelled=true, so the guarded .then callback never runs setLoadingDetail(false);
update the effect to use async/await inside an inner async function (e.g.,
fetchDetailAsync) that awaits fetchCellDetail(detailEvalId, output.id), and in
the cleanup always call setLoadingDetail(false) as well as flip the cancelled
flag; ensure you still only call setCellDetail(detail) when not cancelled and
that the effect dependencies remain [showPrompts, cellDetail, loadingDetail,
output.prompt, output.id, detailEvalId].

In `@src/server/routes/eval.ts`:
- Around line 353-372: Wrap the async route handler for
evalRouter.get('/:evalId/results/:resultId/detail') in a try/catch; validate
params using EvalSchemas.ResultDetail.Params.parse as currently done, but if
parse throws return res.status(400).json({ success: false, error: '<validation
message>' }); after DB lookup (EvalResult.findById) return 404 via
res.status(404).json({ success: false, error: 'Result not found' }) when missing
or mismatched, and on success return res.json({ success: true, data: { prompt:
result.prompt.raw, response: result.response, testCase: result.testCase } });
catch any other errors and return res.status(500).json({ success: false, error:
String(err) }) to ensure all responses follow the {success, data/error} contract
and Zod/DB errors are handled.

Comment on lines +186 to +211
// Auto-fetch prompt when "Show Prompts" is toggled on and prompt was stripped.
// Uses cleanup function to cancel stale responses during rapid cell/toggle changes.
React.useEffect(() => {
  if (
    showPrompts &&
    !cellDetail &&
    !loadingDetail &&
    !output.prompt &&
    output.id &&
    detailEvalId
  ) {
    let cancelled = false;
    setLoadingDetail(true);
    fetchCellDetail(detailEvalId, output.id).then((detail) => {
      if (!cancelled) {
        if (detail) {
          setCellDetail(detail);
        }
        setLoadingDetail(false);
      }
    });
    return () => {
      cancelled = true;
    };
  }
}, [showPrompts, cellDetail, loadingDetail, output.prompt, output.id, detailEvalId]);
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and read the specific lines
wc -l src/app/src/pages/eval/components/EvalOutputCell.tsx

Repository: promptfoo/promptfoo

Length of output: 120


🏁 Script executed:

# Read the context around lines 186-211
sed -n '180,220p' src/app/src/pages/eval/components/EvalOutputCell.tsx

Repository: promptfoo/promptfoo

Length of output: 1624


🏁 Script executed:

# Find the fetchCellDetail function to understand its signature
rg -n "fetchCellDetail" src/app/src/pages/eval/components/EvalOutputCell.tsx -A 2 -B 2

Repository: promptfoo/promptfoo

Length of output: 714


🏁 Script executed:

# Search for fetchCellDetail definition in the codebase
rg -n "const fetchCellDetail|function fetchCellDetail|export.*fetchCellDetail" --type ts --type tsx

Repository: promptfoo/promptfoo

Length of output: 90


🏁 Script executed:

# Search for fetchCellDetail definition more broadly
rg "export.*fetchCellDetail|const fetchCellDetail\s*=" --type ts -A 5

Repository: promptfoo/promptfoo

Length of output: 419


🏁 Script executed:

# Also check the api utils file directly
fd -t f "api.ts" src/app/src/utils/

Repository: promptfoo/promptfoo

Length of output: 87


🏁 Script executed:

# Read the api utility file
cat -n src/app/src/utils/api.ts | head -150

Repository: promptfoo/promptfoo

Length of output: 3063


🏁 Script executed:

# Search for fetchCellDetail in api.ts
rg -n "fetchCellDetail" src/app/src/utils/api.ts -A 10 -B 2

Repository: promptfoo/promptfoo

Length of output: 400


Reset loadingDetail in cleanup to prevent stuck loading state.

When the effect cleanup runs before the async request resolves (e.g., showPrompts toggles off), the if (!cancelled) guard prevents setLoadingDetail(false) from executing. This leaves the dialog in a perpetual loading state and blocks subsequent fetch attempts. Reset loadingDetail in the cleanup function and use async/await for consistency with coding guidelines.

🛠️ Suggested fix
 React.useEffect(() => {
-  if (
-    showPrompts &&
-    !cellDetail &&
-    !loadingDetail &&
-    !output.prompt &&
-    output.id &&
-    detailEvalId
-  ) {
-    let cancelled = false;
-    setLoadingDetail(true);
-    fetchCellDetail(detailEvalId, output.id).then((detail) => {
-      if (!cancelled) {
-        if (detail) {
-          setCellDetail(detail);
-        }
-        setLoadingDetail(false);
-      }
-    });
-    return () => {
-      cancelled = true;
-    };
-  }
+  if (
+    !showPrompts ||
+    cellDetail ||
+    loadingDetail ||
+    output.prompt ||
+    !output.id ||
+    !detailEvalId
+  ) {
+    return;
+  }
+
+  let cancelled = false;
+  const loadDetail = async () => {
+    setLoadingDetail(true);
+    try {
+      const detail = await fetchCellDetail(detailEvalId, output.id);
+      if (!cancelled && detail) {
+        setCellDetail(detail);
+      }
+    } finally {
+      if (!cancelled) {
+        setLoadingDetail(false);
+      }
+    }
+  };
+  loadDetail();
+
+  return () => {
+    cancelled = true;
+    setLoadingDetail(false);
+  };
 }, [showPrompts, cellDetail, loadingDetail, output.prompt, output.id, detailEvalId]);
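
The cancelled-flag pattern with a loading reset on cleanup can also be sketched outside React, which makes it easier to exercise in isolation. `LoadState` and `startLoad` are hypothetical names; the returned function plays the role of the effect's cleanup. This is an illustration of the pattern only, not a drop-in replacement for the component code.

```typescript
interface LoadState<T> {
  loading: boolean;
  detail: T | null;
}

// Starts a load and returns a cleanup function, mirroring an effect body and its cleanup.
function startLoad<T>(state: LoadState<T>, fetcher: () => Promise<T | null>): () => void {
  let cancelled = false;
  state.loading = true;

  const run = async (): Promise<void> => {
    try {
      const detail = await fetcher();
      if (!cancelled && detail) {
        state.detail = detail; // only the latest, uncancelled request may write
      }
    } catch {
      // a failed fetch is treated like a missing detail
    } finally {
      if (!cancelled) {
        state.loading = false;
      }
    }
  };
  void run();

  return () => {
    cancelled = true;
    state.loading = false; // reset here so a cancelled request cannot leave loading stuck
  };
}
```

Calling the returned cleanup before the fetch settles discards the stale result and clears the loading flag immediately.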

Comment on lines +353 to +372
// Returns the full prompt, response, and testCase for a single result cell.
// The table endpoint strips these fields to keep payloads small; the frontend
// fetches them on demand when the user clicks "Show Prompt".
evalRouter.get(
  '/:evalId/results/:resultId/detail',
  async (req: Request, res: Response): Promise<void> => {
    const { evalId, resultId } = EvalSchemas.ResultDetail.Params.parse(req.params);

    const result = await EvalResult.findById(resultId);
    if (!result || result.evalId !== evalId) {
      res.status(404).json({ error: 'Result not found' });
      return;
    }

    res.json({
      prompt: result.prompt.raw,
      response: result.response,
      testCase: result.testCase,
    });
  },
Contributor


⚠️ Potential issue | 🟠 Major

Wrap the result-detail handler with try/catch and {success,data/error} responses.

Uncaught Zod parse or DB errors will bubble, and the response shape doesn’t match the API contract.

🛠 Suggested fix
evalRouter.get(
  '/:evalId/results/:resultId/detail',
  async (req: Request, res: Response): Promise<void> => {
-    const { evalId, resultId } = EvalSchemas.ResultDetail.Params.parse(req.params);
-
-    const result = await EvalResult.findById(resultId);
-    if (!result || result.evalId !== evalId) {
-      res.status(404).json({ error: 'Result not found' });
-      return;
-    }
-
-    res.json({
-      prompt: result.prompt.raw,
-      response: result.response,
-      testCase: result.testCase,
-    });
+    try {
+      const { evalId, resultId } = EvalSchemas.ResultDetail.Params.parse(req.params);
+      const result = await EvalResult.findById(resultId);
+      if (!result || result.evalId !== evalId) {
+        res.status(404).json({ success: false, error: 'Result not found' });
+        return;
+      }
+
+      res.json({
+        success: true,
+        data: {
+          prompt: result.prompt.raw,
+          response: result.response,
+          testCase: result.testCase,
+        },
+      });
+    } catch (error) {
+      if (error instanceof z.ZodError) {
+        res.status(400).json({ success: false, error: z.prettifyError(error) });
+        return;
+      }
+      logger.error('[GET /:evalId/results/:resultId/detail] Failed to fetch result detail', {
+        error,
+        evalId: req.params.evalId,
+        resultId: req.params.resultId,
+      });
+      res.status(500).json({ success: false, error: 'Failed to fetch result detail' });
+    }
  },
);

As per coding guidelines, "src/server/routes/**/*.{ts,tsx}: Validate requests with Zod schemas from src/types/api/, wrap all responses in { success, data/error } format, handle errors with try-catch blocks in async route handlers."
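
The `{ success, data/error }` contract and the status mapping discussed here can be sketched without Express or Zod. The `Outcome` type and `toHttpResult` function are hypothetical stand-ins for schema validation and the database lookup; the error strings mirror those in the suggestion above.

```typescript
// Envelope shape quoted in the guideline: { success, data } or { success, error }.
type ApiResponse<T> = { success: true; data: T } | { success: false; error: string };

interface HttpResult<T> {
  status: number;
  body: ApiResponse<T>;
}

// Hypothetical outcome of validating the params and looking up the result.
type Outcome<T> =
  | { kind: 'ok'; data: T }
  | { kind: 'invalid'; message: string }
  | { kind: 'not_found' };

function toHttpResult<T>(work: () => Outcome<T>): HttpResult<T> {
  try {
    const outcome = work();
    if (outcome.kind === 'ok') {
      return { status: 200, body: { success: true, data: outcome.data } };
    }
    if (outcome.kind === 'invalid') {
      return { status: 400, body: { success: false, error: outcome.message } };
    }
    return { status: 404, body: { success: false, error: 'Result not found' } };
  } catch {
    // Unexpected failures (for example a database error) become a generic 500.
    return { status: 500, body: { success: false, error: 'Failed to fetch result detail' } };
  }
}
```

Keeping the mapping in one function means every branch, including the unexpected-error path, returns the same envelope shape.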

