fix(coprocessor): scheduler, catch panic#2170

Merged
mergify[bot] merged 1 commit into main from rudy/fix/scheduler/panic
Mar 26, 2026
Conversation

@rudy-6-4
Contributor

No description provided.

@rudy-6-4 rudy-6-4 requested a review from a team as a code owner March 25, 2026 09:52
@cla-bot cla-bot bot added the cla-signed label Mar 25, 2026
@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 25, 2026

Claude finished @rudy-6-4's request.

claude bot previously approved these changes Mar 25, 2026

@claude claude bot left a comment

[AUTOMATED]

Code Review

PR: fix(coprocessor): scheduler, catch panic

Summary

This PR adds resilience to the FHE scheduler by wrapping run_computation() in std::panic::catch_unwind(), converting panics (e.g., from division/remainder overflow in TFHE operations) into structured SchedulerError::ExecutionPanic(String) errors. The changes are well-structured:

  1. scheduler.rs: Wraps the computation call with panic catching, extracts the panic message (handling &str, String, and unknown payloads), and properly logs + reports via telemetry.
  2. types.rs: Adds ExecutionPanic(String) variant, correctly removes Copy (since String isn't Copy), and adds Display formatting.
  3. tfhe_worker.rs: Updates from copy-dereference (*err.downcast_ref()) to .cloned() — necessary and correct change since SchedulerError is no longer Copy.
  4. Tests: Comprehensive test that triggers panics via div/rem overflow, verifies errors are properly caught and recorded in the DB with ExecutionPanic messages.
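
Items 2 and 3 can be illustrated with a minimal sketch. It assumes a simplified error enum and a plain `dyn Error` in place of whatever error wrapper the worker actually uses; the `MissingInputs` variant, the message wording, and the `as_scheduler_error` helper are hypothetical and are not taken from the PR. Only `ExecutionPanic(String)` and the switch to `.cloned()` come from the review.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for the scheduler's error type. Only the
// ExecutionPanic(String) variant is taken from the review; the String
// field is what prevents the enum from deriving Copy.
#[derive(Debug, Clone, PartialEq)]
pub enum SchedulerError {
    MissingInputs, // assumed variant, for illustration only
    ExecutionPanic(String),
}

impl fmt::Display for SchedulerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SchedulerError::MissingInputs => write!(f, "missing inputs"),
            SchedulerError::ExecutionPanic(msg) => write!(f, "execution panic: {msg}"),
        }
    }
}

impl Error for SchedulerError {}

// With a Copy type the caller could dereference the borrowed value
// directly; once the type is only Clone, the value has to be cloned
// out of the Option<&SchedulerError> that downcast_ref returns.
pub fn as_scheduler_error(err: &(dyn Error + 'static)) -> Option<SchedulerError> {
    err.downcast_ref::<SchedulerError>().cloned()
}
```

Calling `as_scheduler_error` on a boxed `SchedulerError` yields `Some(..)` with an owned copy of the message, and `None` for any other error type.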

Review Results

Checked for bugs, security issues, guideline compliance, and error handling quality. No issues found.

Key observations:

  • The catch_unwind closure correctly satisfies UnwindSafe without needing AssertUnwindSafe — all captured values (i32, Vec, usize, &Vec<u8>) are inherently unwind-safe.
  • Panic payload extraction covers the two standard panic types (&str from literal panics, String from formatted panics) with a sensible fallback.
  • The ExecutionPanic error correctly falls through to set_computation_error() in the downstream worker — this is the desired behavior since panics during execution represent actual operation failures.
  • No panic = "abort" profile is configured, so catch_unwind will work as expected with the default unwind strategy.
  • The test infrastructure addition (errors_on_allowed_handles) is clean and reusable.
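
The first two observations can be sketched as follows. This is an illustration under stated assumptions, not the code from scheduler.rs: `run_computation` here is a hypothetical stand-in that performs plain `i32` division (which panics on a zero divisor and on `i32::MIN / -1`), and `panic_message`, `run_guarded`, and the fallback text are made-up names and wording.

```rust
use std::any::Any;
use std::panic;

// Turn a panic payload into a readable message. `panic!("literal")`
// carries a &'static str and `panic!("{}", x)` carries a String; any
// other payload type (for example from `panic_any`) gets a fallback.
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&'static str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic payload".to_string()
    }
}

// Hypothetical stand-in for the real computation: integer division
// panics on a zero divisor and on i32::MIN / -1 overflow.
fn run_computation(lhs: i32, rhs: i32) -> i32 {
    lhs / rhs
}

// Run the computation and convert a panic into an Err instead of
// letting it unwind through the caller. The closure captures only i32
// values, so it is UnwindSafe without an AssertUnwindSafe wrapper.
pub fn run_guarded(lhs: i32, rhs: i32) -> Result<i32, String> {
    panic::catch_unwind(move || run_computation(lhs, rhs))
        .map_err(|payload| panic_message(&*payload))
}
```

With this sketch, `run_guarded(7, 2)` returns `Ok(3)`, while `run_guarded(1, 0)` and `run_guarded(i32::MIN, -1)` return `Err` with the panic text. The default panic hook still prints the panic to stderr before the error is returned, and none of this applies if the crate is built with `panic = "abort"`.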

@claude

claude bot commented Mar 25, 2026

The review has been posted. Here's a summary of what was done:

PR Review Complete: #2170 — fix(coprocessor): scheduler, catch panic

Review Process

  • Pre-review validation: PR is open, non-trivial — review proceeded
  • Bug Hunter (2x opus agents): No bugs found
  • Guideline Compliance: No CLAUDE.md or style guides found in repo — no violations
  • Error Handling Audit: 3 issues raised, all validated as by-design behavior or suggestions (not actual defects)
  • Validation: No issues passed the >= 80 confidence + validation threshold

Verdict: APPROVED

The PR correctly wraps FHE computation in catch_unwind, properly extracts panic messages, adds a new error variant, updates downstream code for the Copy to Clone change, and includes comprehensive tests. No bugs, security issues, or error handling defects were found.

@github-actions

github-actions bot commented Mar 25, 2026

Changed Lines Coverage

Coverage of added/modified lines: N/A

@Eikix
Contributor

Eikix commented Mar 26, 2026

clippy fails but otherwise, nice! lgtm

Collaborator

@antoniupop antoniupop left a comment

LGTM

@rudy-6-4
Contributor Author

@Mergifyio queue

@mergify

mergify bot commented Mar 26, 2026

Merge Queue Status

This pull request spent 1 hour 40 minutes 15 seconds in the queue, including 1 hour 39 minutes 43 seconds running CI.

mergify bot added a commit that referenced this pull request Mar 26, 2026
@mergify mergify bot merged commit 884690e into main Mar 26, 2026
63 of 64 checks passed
@mergify mergify bot deleted the rudy/fix/scheduler/panic branch March 26, 2026 15:09
4 participants