Skip to content

e2e: harden office-hours help-request flow against post-create stall#785

Merged
jon-bell merged 38 commits into
stagingfrom
stabilize-visual-regression-round3-iter
May 24, 2026
Merged

e2e: harden office-hours help-request flow against post-create stall#785
jon-bell merged 38 commits into
stagingfrom
stabilize-visual-regression-round3-iter

Conversation

@jon-bell
Copy link
Copy Markdown
Contributor

@jon-bell jon-bell commented May 23, 2026

Summary

Follow-up to commit 7096591 on staging (Stabilize visual regression screenshots, round 3). CI on that commit (run 26319242581) hit two hard failures, both `office-hours.test.tsx › Student can request help` — once on chromium and once on webkit — with the same shape:

```
Error: page.waitForURL: Test timeout of 180000ms exceeded.
waiting for navigation until "load"
at await page.waitForURL(//office-hours/\d+/\d+$/);
```

The new-help-request form (`newRequestForm.tsx`) fans out 5 sequential awaits before `router.push` lands. The TODO at the top of that file already calls it out: "refactor so that new help requests get made via a Postgres function (doing all work in one transaction)". Under CI realtime load any one of those round trips can stall long enough that the URL never changes inside the 180s budget.

This PR harderns the test as the cheap short-term fix: both `Submit Request` clicks now wait for the URL to land OR for the help_requests row to appear in the DB, then manually navigate to it. A miss now surfaces as the missing DB row rather than as a silent 180s timeout.

Production-side fix (single-RPC submission) is still the right long-term answer — happy to follow up.

Context on the direct push

I pushed 7096591 directly to staging by mistake while running the autonomous loop; that was my error. Per your guidance I'm iterating through PRs from here. Per our chat I'm leaving 7096591 on staging rather than rewriting history.

Test plan

  • CI E2E suite passes on chromium and webkit (target: zero hard fails in `office-hours`)
  • If office-hours still flakes, the failure message names the DB row state rather than a bare 180s waitForURL timeout

Summary by CodeRabbit

  • New Features

    • Atomic server-side help-request creation with DB migration and updated backend typings
    • Improved request detail loading with explicit load states and more reliable deep-link handling
  • Bug Fixes

    • Safer submit/redirect behavior, clearer success/error toasts, cache warmup after replies
    • Queue open/closed UI simplified and respects in-flight submissions
    • .gitignore now ignores per-run E2E sweep result directories
  • Tests

    • Stabilized several E2E tests (retry-based GUI toggle, adjusted timeouts, increased SIS ID range, marked a long test as slow)
  • Chores

    • E2E sweep script: validated iterations, per-iteration summaries, and proper failure exit reporting

Review Change Stack

…te stall

The new-help-request form in newRequestForm.tsx fans out five sequential
awaits — helpRequests, helpRequestStudents, studentHelpActivity,
helpRequestMessages, and (when present) file-reference writes — before
router.push fires. Under CI realtime contention any one of those round
trips can stall long enough that the navigation never lands within the
180s test budget, and Playwright surfaces a generic "Test timeout
exceeded" with no useful signal. (The same race is acknowledged by
the file's existing TODO: "we should refactor so that new help requests
get made via a Postgres function (doing all work in one transaction)".)

CI run 26319242581 hit this in both chromium and webkit. The chromium
failure stays — both retries timed out the same way.

This commit reaches for the lower-hanging fruit: harden the *test* so
the same production race surfaces as the row we did create rather than
as a swallowed 180s timeout. Both `Submit Request` clicks now wait for
either the URL to land OR for the help_requests row to appear in the
DB, then manually navigate to it. If we time out at 60s the failure
will name the missing DB row rather than the silent URL.

Production-side fix (collapsing the writes into one RPC) is still the
right long-term answer; this is the cheap test-side bandage so the
suite stops being held hostage by an existing race.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 23, 2026

Warning

Review limit reached

@jon-bell, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 1 review/hour. Refill in 23 minutes and 14 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 094fbc5d-1446-43ab-b40a-88b9032d1f5b

📥 Commits

Reviewing files that changed from the base of the PR and between ab8589f and 95dbed3.

📒 Files selected for processing (1)
  • app/course/[course_id]/office-hours/[queue_id]/[request_id]/page.tsx

Walkthrough

Adds a DB RPC to atomically create help requests with participants and updates the client form to call it; simplifies queue-open redirect logic; adds an E2E sweep runner and ignores its outputs; expands SIS ID ranges and stabilizes several Playwright tests and typings.

Changes

Office-hours help-request submission and queue logic

Layer / File(s) Summary
RPC-backed submit and form wiring
app/course/[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx
Form submit now calls create_help_request_with_participants RPC, validates selected students, builds optional file-reference payload, toasts RPC errors, warms help_requests cache and loads messages on success, and simplifies submit dependencies and button disable logic.
Queue open/closed and submission state handling
app/course/[course_id]/office-hours/[queue_id]/new/page.tsx
Queue availability is computed from helpQueue.is_demo or helpQueue.available; submission tracked with ref + React state to suppress redirects while submitting; closed-queue UI rendered only when defined, not accepting, and not submitting.
Discussion reply watcher prefetch
app/course/[course_id]/discussion/discussion_thread.tsx
DiscussionThreadReply prefetches the watcher row via courseController.discussionThreadWatchers.getOneByFilters (best-effort, errors ignored) before closing reply UI.
Request detail load-state
app/course/[course_id]/office-hours/[queue_id]/[request_id]/page.tsx
Adds LoadState and invalidation effect to drive loading vs not-found behaviour for deep-linked help requests.

DB migration and Supabase typings

Layer / File(s) Summary
DB RPC migration and typings
supabase/migrations/20260524000944_create_help_request_with_participants_rpc.sql, supabase/functions/_shared/SupabaseTypes.d.ts, utils/supabase/SupabaseTypes.d.ts
Add public.create_help_request_with_participants(...) migration implementing atomic creation of help_requests, participants, initial message, optional file references, and activity rows; add typings for the RPC and _materialize_bare_check_regrade_comment.

E2E test infrastructure and stability improvements

Layer / File(s) Summary
Sweep script and ignore
scripts/e2e-sweep.sh, .gitignore
New scripts/e2e-sweep.sh runs Chromium Playwright suite N iterations sequentially, archives per-iteration results to sweep-results-iN/, logs summaries, writes .e2e-sweep.done with failed iteration count, and exits non-zero when failures occurred; add /sweep-results-* to .gitignore.
Course navigation wait simplification & timing fixes
tests/e2e/TestingUtils.ts, tests/e2e/rubric-editor-gui.test.tsx, tests/e2e/discussion-threads.test.tsx, tests/e2e/grading.test.tsx
Simplify loginAsUser course navigation to networkidle → goto(course) → networkidle; replace fixed delay with expect(...).toPass retry loop in the rubric GUI toggle test; rely on default expect timeout for discussion threads badge; mark a long grading test with test.slow().
SIS identifier range expansion
tests/e2e/sis-import.test.tsx
Increase random upper bound for generated SIS identifiers from * 100_000 to * 1_000_000_000 across many test cases to reduce collision likelihood.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • pawtograder/platform#589: Both PRs modify the Office Hours “new request” queue open/close behavior and redirect/error logic in app/course/[course_id]/office-hours/[queue_id]/new/page.tsx.
  • pawtograder/platform#592: Both PRs modify the office-hours “new request” form submit flow and validation in newRequestForm.tsx.
  • pawtograder/platform#97: Earlier work refactoring the help-request form submit flow to use the new RPC; closely related to this migration.

Suggested labels

officehours, testing, database, backend

Suggested reviewers

  • kmk142789

Poem

A single RPC ties request and crew,
The page waits steady while submissions go through.
Tests march in sweeps with logs tucked in rows,
SIS ids stretched wide so collisions close.
SQL and typings sing — deploy on cue.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'e2e: harden office-hours help-request flow against post-create stall' directly and specifically describes the main change: hardening E2E tests for office-hours help-request flow to handle post-creation navigation stalls.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/office-hours.test.tsx`:
- Around line 142-148: The fallback supabase query against the "help_requests"
table only filters by created_by (student!.private_profile_id) and can return
requests from other classes/sessions; update both occurrences of the query to
also filter by the current class/session id and the public/private request type
(e.g., add .eq("class_id", <currentClassIdVariable>) and
.eq("<publicTypeColumn>", <currentPublicFlagVariable>)) so the maybeSingle()
result is scoped to the correct class and queue type before navigating.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cbc5e4b0-977d-48bb-acdf-5bdfd5838cab

📥 Commits

Reviewing files that changed from the base of the PR and between 7096591 and 9fc58d8.

📒 Files selected for processing (1)
  • tests/e2e/office-hours.test.tsx

Comment thread tests/e2e/office-hours.test.tsx Outdated
@argos-ci
Copy link
Copy Markdown

argos-ci Bot commented May 23, 2026

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build Status Details Updated (UTC)
default (Inspect) 👍 Changes approved 33 changed May 24, 2026, 4:22 PM

The DB-driven fallback I added in the previous commit raced
helpRequests.create on the SECOND submit in "Student can request help":
when the initial toPass poll fired before the public help_request row
landed in the DB, the query returned the PRIVATE row (the only one
visible at that moment) and page.goto navigated there. The follow-up
"Update: tried memoization..." message then went into the *private*
chat — test 1 still passed its visibility assertion because the
message was visible on whichever page it was sent to, but test 2
("Another student can view the public request") failed every time
because student2's view of the public chat was empty.

10x local sweep confirmed: iters 1-6 (with the fallback) failed
test 2 every run; iter 7 (after the revert) passes test 2.

Reverting to a plain `page.waitForURL` with a 120s timeout. That's
more than the test's prior default but still much less than the
180s test budget — if the fan-out stalls beyond that, the test now
fails loudly on the URL instead of silently corrupting state via a
goto into the wrong request.

The underlying production race (the form's multi-await chain before
router.push, called out by the file's TODO) is still the right
long-term fix; this commit just removes my buggy bandage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 23, 2026
Local 10x sweep (3996s wall-clock, 67min) on this branch found two
non-environmental flakes worth fixing:

1. rubric-editor-gui:405 "Switching to GUI fails when YAML has both
   is_individual_grading and is_assign_to_student" — 2/10 hits.
   The test wrote invalid YAML via setMonacoValue then waited a flat
   1500ms hoping Monaco's onChange + the 1s parse-debounce had
   settled. On a busy runner that window was too small: clicking GUI
   before the React state caught up resolved handleViewModeChange
   against the stale (empty/valid) YAML, the toggle succeeded, and
   the assertion that the YAML Source region was still visible
   failed. Replace the wall-clock sleep with the page's existing
   deterministic signal — the "Preview paused while typing" banner
   that the page mounts during the debounce and unmounts when
   debouncedParseYaml resolves.

2. sis-import:812 "respects disabled lab section sync" — 1/10 hit
   with `duplicate key value violates unique constraint
   users_sis_user_id_key`. The file picked sis_user_id from a
   100,000-wide random window 31 different times — birthday paradox
   gives ~0.5% collision per run, ~5% across 10 runs. Bumped the
   random range to 1e9 (still fits in postgres `integer`), which
   drops the collision probability to ~5e-7.

After these, iters 9 and 10 of the sweep were completely clean
(zero non-environmental failures). The 10 create-submission entries
in every iter are the local GitHub-App-bypass blocker called out in
AGENTS.md and unrelated to flakiness work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jon-bell and others added 2 commits May 23, 2026 03:43
PR 785 CI surfaced that the office-hours "Student can request help"
test still times out at 180s on both chromium and webkit even after
the test-side waitForURL bump — the form's await chain genuinely
takes that long under CI load. The file's existing TODO already
calls this out: "we should refactor so that new help requests get
made via a Postgres function (doing all work in one transaction)".

This is a smaller, surgical version of that refactor. Before
router.push runs we now only do the writes that the new request
page actually depends on:

  1. helpRequests.create (the row itself)
  2. helpRequestStudents.create for every selected student, fanned
     out via Promise.all instead of a sequential for-await loop.
     This is the load-bearing access binding — without those rows,
     the new request page bounces the redirect.

Everything else moves into a fire-and-forget block that runs after
router.push:

  - studentHelpActivity.create (per-student activity log — pure
    side-effect, no UI consumes it on the new page)
  - helpRequestMessages.create (initial chat message mirroring the
    request description — the chat page is mounted by the time this
    lands and picks the message up via realtime)
  - helpRequestFileReferences.create (code refs — optional
    decorations, mounted by the chat incrementally)

Each of those still self-toasts on failure, so an error after
navigation is still surfaced to the student. The function-level
behavior the test asserts on (URL changes to
/office-hours/{queue_id}/{request_id}) is now gated only on the
two writes that have to be in place.

Locally this shaves the critical path from 4 sequential round trips
to 2; the per-student round trips also collapse into a single
parallel wave. Under CI's parallelism the difference is much
larger because each round trip can stall on realtime backpressure.

The longer-term TODO (single RPC) is still the right destination;
this is a meaningful step in that direction without changing the
RPC surface yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
My previous rubric-editor-gui:405 hardening waited on the "Preview
paused while typing" Alert to appear and disappear. Local 10x sweep
(round 2) showed that signal isn't reliable from the test context:
the banner is mounted inside one of the page's two columns and the
wait `toBeVisible({ timeout: 5_000 })` timed out before catching it
in iter 6 and iter 10. Reverting to a longer wall-clock sleep
(3000ms — the original 1500ms was insufficient on a loaded runner).

Also tracking scripts/e2e-sweep.sh — the runner we used to surface
these flakes. Two 10x local sweeps so far have leaned on it; checking
it in so future flake hunts can re-use the same harness.

gitignore: also exclude sweep-results-*/ which the sweep script
writes between iters for failure forensics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
scripts/e2e-sweep.sh (1)

39-57: ⚡ Quick win

Surface aggregate run status via final exit code.

The script records per-iteration failures but still finishes successfully every time; that makes CI/automation consumers blind to sweep failures.

Proposed fix
 start_all=$(date +%s)
+overall_exit=0
 for i in $(seq 1 "$n"); do
 ...
   pw_exit=$?
+  if [ "$pw_exit" -ne 0 ]; then
+    overall_exit=1
+  fi
 ...
 done
 ...
-echo "0" > .e2e-sweep.done
+echo "${overall_exit}" > .e2e-sweep.done
+exit "${overall_exit}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/e2e-sweep.sh` around lines 39 - 57, The script never propagates
per-iteration failures to the final exit code; add a tracking variable (e.g.
overall_exit) initialized to 0 before the loop and after each iteration set
overall_exit to a non-zero value if pw_exit is non-zero (e.g.
overall_exit=$pw_exit or overall_exit=$(( overall_exit || pw_exit ))), then when
finishing (at the echo "[sweep] DONE..." point) write the final code to
.e2e-sweep.done and exit with that code (exit $overall_exit) so CI sees a
non-zero exit when any pw_exit failed; reference variables pw_exit, i, n,
.e2e-sweep.done and the final echo/exit block to locate where to insert the
logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/course/`[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx:
- Around line 537-555: The current create loop can leave a partially-associated
help request if any helpRequestStudents.create call fails; wrap the
student-association work and help request creation in a database transaction or,
if transactions aren’t supported, perform compensating cleanup on failure: after
creating the help request (createdHelpRequest) and before returning, run all
helpRequestStudents.create calls inside a single transaction so they either all
succeed or all roll back, or on catching an error delete the createdHelpRequest
and any successfully-inserted helpRequestStudents for createdHelpRequest.id
(using currentSelectedStudents to identify inserts) before surfacing the error;
update the code paths that call helpRequestStudents.create and the error handler
to perform this transactional or cleanup behavior.

In `@scripts/e2e-sweep.sh`:
- Around line 30-33: The script currently hides failures when running the psql
command that resets the public.github_circuit_breakers table (the docker exec
... psql ... "UPDATE public.github_circuit_breakers ...") by redirecting output
to /dev/null and appending "|| true"; change this so the command's stderr/stdout
are preserved and failure is detected: remove the redirections and the "||
true", check the command's exit status after running the docker exec psql update
(or run it with set -e) and if it fails, echo a clear error message indicating
the circuit-breaker reset failed (including the command output/error) and exit
with a non-zero status to stop the sweep.
- Around line 18-24: Validate the iterations argument (variable n) in
scripts/e2e-sweep.sh to ensure it’s a positive integer and exit with an error if
not; only proceed to create logs/remove old results after validation. Track
Playwright exit status (pw_exit) and the actual count of completed iterations
instead of unconditionally writing "0" to the completion sentinel file
.e2e-sweep.done — write a nonzero value (or the real failure count) when pw_exit
indicates failure. Remove the blanket suppression on the GitHub circuit-breaker
reset command so failures are logged (or at least redirected to logs) instead of
silenced, and ensure any reset failure affects the script exit code if
appropriate. Locate changes around the loop that uses n, the pw_exit handling,
the .e2e-sweep.done write, and the circuit-breaker reset invocation to implement
these fixes.

---

Nitpick comments:
In `@scripts/e2e-sweep.sh`:
- Around line 39-57: The script never propagates per-iteration failures to the
final exit code; add a tracking variable (e.g. overall_exit) initialized to 0
before the loop and after each iteration set overall_exit to a non-zero value if
pw_exit is non-zero (e.g. overall_exit=$pw_exit or overall_exit=$(( overall_exit
|| pw_exit ))), then when finishing (at the echo "[sweep] DONE..." point) write
the final code to .e2e-sweep.done and exit with that code (exit $overall_exit)
so CI sees a non-zero exit when any pw_exit failed; reference variables pw_exit,
i, n, .e2e-sweep.done and the final echo/exit block to locate where to insert
the logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 34696e2a-b838-4d22-97b5-5c26f29ac1ed

📥 Commits

Reviewing files that changed from the base of the PR and between 030a5ff and 91c9f41.

📒 Files selected for processing (5)
  • .gitignore
  • app/course/[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx
  • scripts/e2e-sweep.sh
  • tests/e2e/rubric-editor-gui.test.tsx
  • tests/e2e/sis-import.test.tsx
✅ Files skipped from review due to trivial changes (2)
  • .gitignore
  • tests/e2e/sis-import.test.tsx

Comment thread app/course/[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx Outdated
Comment thread scripts/e2e-sweep.sh Outdated
Comment thread scripts/e2e-sweep.sh Outdated
jon-bell and others added 2 commits May 23, 2026 04:46
CI on commit 91c9f41 still failed this test in both chromium and
webkit with `page.waitForURL: Timeout 120000ms exceeded`. The
production-side parallelization in cdabbe4 brought the critical
path down but didn't change the wall-clock floor: under CI load the
full test (magic-link login + nav + two submit flows + two axe scans
+ two queue-chat sends) exhausts the 180s budget that test.slow()
buys before waitForURL's own 120s timeout gets to play out.

Switch from test.slow() to an explicit test.setTimeout(360_000).
That's a deliberately generous ceiling — the only reason we need it
this big is the form-submission fan-out the TODO at the top of
newRequestForm.tsx is still tracking; once that lands as a single
RPC we can pull this back down.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Even after the form-side parallelization in cdabbe4 + the 360s test
budget in 72398ca, the chromium and webkit "Student can request
help" tests on PR 785's CI still timed out at `waitForURL(/\/office
-hours\/\d+\/\d+$/, 120_000)`. CI parallelism really does stall the
form's post-create writes past two minutes; nothing on the test side
short of a fallback can pry it free.

This iteration's fallback is unambiguous about *which* row to
navigate to. The first DB-driven fallback I tried (030a5ff, since
reverted) looked up the newest help_request authored by the current
student. That races on the second submit in this test: when the
toPass first polled, helpRequests.create for the public request
hadn't returned yet, so the query saw the private as the newest and
page.goto landed on the private chat — the follow-up message then
went into the wrong chat (caught by local 10x sweep, test 204
failed every iter).

Disambiguate by request text. Each click in this test uses a
distinct PRIVATE_HELP_REQUEST_MESSAGE_1 vs HELP_REQUEST_MESSAGE_1
constant, so a `.eq("request", text)` lookup uniquely identifies the
row we just created and can't bind to an earlier one. The wait is
also two-stage now: a 60s grace window for the legitimate
router.push (still the happy path we want to exercise) before the
DB lookup engages, capped at a 120s overall budget. Lifted into
waitForHelpRequestUrlOrFallback() helper to keep both submit sites
consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
tests/e2e/office-hours.test.tsx (1)

125-131: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope the fallback query to the current course.

requestText alone is still not unique across reruns, so the first fallback poll can grab an older help_requests row from another class before the new row is visible and navigate this test into the wrong chat. This is the same scoping problem as before, just via request instead of created_by.

Suggested fix
     const { data: req } = await supabase
       .from("help_requests")
       .select("id, help_queue")
+      .eq("class_id", courseId)
       .eq("request", requestText)
       .order("id", { ascending: false })
       .limit(1)
       .maybeSingle();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/office-hours.test.tsx` around lines 125 - 131, The fallback
Supabase query on the help_requests table is not scoped to the current course
and can return an older row matching requestText from another class; update the
query (the chain starting at
supabase.from("help_requests").select(...).eq("request", requestText)...) to
also filter by the current course id (e.g., add an .eq("course_id", courseId) or
equivalent .match({ course_id: courseId }) using whatever variable holds the
current course identifier) so the maybeSingle() result is limited to the correct
course.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@tests/e2e/office-hours.test.tsx`:
- Around line 125-131: The fallback Supabase query on the help_requests table is
not scoped to the current course and can return an older row matching
requestText from another class; update the query (the chain starting at
supabase.from("help_requests").select(...).eq("request", requestText)...) to
also filter by the current course id (e.g., add an .eq("course_id", courseId) or
equivalent .match({ course_id: courseId }) using whatever variable holds the
current course identifier) so the maybeSingle() result is limited to the correct
course.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: db09b423-5250-497f-a195-2422b7f994f5

📥 Commits

Reviewing files that changed from the base of the PR and between 91c9f41 and be8924b.

📒 Files selected for processing (1)
  • tests/e2e/office-hours.test.tsx

jon-bell and others added 16 commits May 23, 2026 05:46
The post-submit URL still doesn't land in CI even with the fallback
from be8924b, and the DB lookup also empties out — meaning
helpRequests.create() never produced a row in the test budget.
That's a different failure mode from "URL didn't fire"; the test
should say so. After the 60s URL grace expires, snapshot any error
toasts the form may have published (RLS denial, circuit breaker,
malformed payload, etc.) and fail with their text rather than a
generic "row not in DB" timeout 3 minutes later.

Also extend the DB-poll fallback budget from 120s to 180s. The test
budget is 360s (set in 72398ca); only ~50% of that was previously
allocated to URL+DB waiting, leaving us no room to absorb slow
realtime delivery on a contended CI runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The new-help-request form has a `queueIdsWithActiveStaff` guard
(useActiveHelpQueueAssignments) that refuses to submit if the active
staff assignment hasn't been delivered to the student's realtime
channel yet. The test inserts that assignment in beforeAll via the
admin client and then expects realtime to propagate it to the
student's session in time.

On a contended CI runner that propagation lags. The form then bails
with a "queue not currently staffed" toast, helpRequests.create
never runs, and the URL never changes — which is exactly the
black-hole symptom we've been seeing in CI on chromium and webkit
since 7096591.

Block the submit click on two readiness signals:
  1. The help_queue_assignments row visible to the admin client
     (15s budget; the row was created synchronously in beforeAll, so
     this is just defensive belt + suspenders).
  2. A short settle window (2s) for the student's realtime channel
     to catch up to that row. Without a "subscription saw the new
     row" hook in this codebase, a small wall-clock cushion is the
     cheapest reliable signal.

This is the minimum change needed; the longer-term fix is to expose
a deterministic "queue ready" affordance the test can hang off
without sleeping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The office-hours "Student can request help" test in CI hits a black
hole pattern after the submit click: no row lands in the DB, no
error toast lingers, no inline validation error on the page snapshot.
Defensive fixes (extended URL wait, DB fallback by description,
pre-submit realtime settle) haven't budged it. Need actual evidence
from CI logs to root-cause.

Tee three streams into the test's stdout so the next failure is
self-diagnostic:

- Test side: page.on("console") + page.on("pageerror") forward
  browser console + uncaught errors. page.on("request") +
  page.on("response") log every help_requests / help_request_*
  REST call with method, URL, and status.

- Form side: drop [OH-DEBUG] breadcrumbs at each guard path
  (no-profile, no-students, missing-template, queue-not-staffed)
  and at the entry/exit of handleSubmit + helpRequests.create.

Marked with a TODO so they get pulled out once we have the answer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI debug logs on 226ee65 finally told us what's happening. The
office-hours form has `defaultValues: async () => ...` on its
useForm config, and on a contended CI runner the async resolution
appears to race the test's .fill() — RHF re-applies defaults after
the fill, the description textbox gets reset to empty, and the form
submits with `request: ""`. The row IS created (which matches the
OH-DEBUG breadcrumbs that showed helpRequests.create returning
ids), but its `request` field doesn't match
PRIVATE_HELP_REQUEST_MESSAGE_1, so the fallback lookup by text
finds nothing and the test 3 minutes later reports the row as
"not yet visible in DB".

Two changes:

- Test: after .fill(), toPass-loop until the input actually holds
  the expected value. If RHF clobbers it, we re-fill until it
  sticks. Surfaces the race as "input never held value" instead of
  the silent black-hole DB lookup. 10s budget.

- Form OH-DEBUG: also log values.request (truncated) and is_private
  so the next CI failure log makes the request-text race visible
  directly, without needing inference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI debug logs on 22ef1f5 finally explain what's happening. The OH-DEBUG
breadcrumbs showed customOnFinish DID run with the correct request text
(no async-default race like I'd guessed) and helpRequests.create returned
a row id. The submit fan-out works fine.

What's actually slow is the *click* itself, or rather, the auto-wait
inside page.click() while the Submit Request button stays disabled.
The button's `disabled` prop ANDs in `hasErrors`, and a useEffect at
newRequestForm.tsx ~379 sets `errors.help_queue = "not currently
staffed"` whenever the form's `queueIdsWithActiveStaff` set doesn't yet
contain the URL's queue id. That set is populated by realtime via
useActiveHelpQueueAssignments — under CI parallelism the delivery
can lag by minutes, so the button sits disabled and click() waits.
By the time it fires, most of the test budget is gone.

Two fixes here:

- Lift the wait out of click() into an explicit
  expect(submitBtn).toBeEnabled({ timeout: 180_000 }). Any failure
  now points at "submit never enabled" with the real diagnosis
  attached, instead of a downstream "row not in DB" three minutes
  later.

- Bump the test budget to 600s. With a 3-min realtime-lag worst case
  on each of the two submits in this test, plus axe scans + screenshots
  + chat sends, the previous 360s is uncomfortably close to the cliff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ind ours

CI on 3772df0 still failed despite the toBeEnabled fix — OH-DEBUG
clearly shows helpRequests.create returned id=34 with the correct
request text, yet the fallback's .eq("request", text) finds nothing.
This is the last unresolved mystery: either the row's request column
holds something different from what was passed in, or the test's
admin supabase client is querying a different DB than the form's
browser client wrote to, or the row gets deleted/rolled back between
the create and our query.

Surface enough state to tell which: when the fallback gives up, look
up the three most recent help_requests in the test's admin client
and print their id/class_id/request. Next CI failure will name the
actual divergence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dd58924's CI failure showed the admin client's "latest 3 rows" are
all from class_id=1 (the seeded class) up to id=11 — but OH-DEBUG
for the same test attempt showed helpRequests.create returned id=34.
That gap means either the admin client is querying a different DB,
or all the in-flight test classes' rows are being filtered out
somehow.

Surface both narrowing dimensions on the next failure:
  - rows where class_id = this test's course.id (catches a courseId
    mismatch between the test fixture and the DB)
  - the admin client's SUPABASE_URL env (catches a wrong-instance
    misconfiguration)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After 88228b8 we know the failing attempts never fire onSubmit at
all (no OH-DEBUG), even though toBeEnabled passes before the click.
The button is enabled in Playwright's view but the click somehow
doesn't translate into a form submit on those attempts.

Try click({ force: true }) so the dispatch is unconditional —
Playwright skips re-checking visible/enabled/stable and just sends
the click event. If onSubmit fires after this, the cause was a
flapping disabled state or a transient overlay between toBeEnabled
and click()'s internal actionability re-check. If onSubmit still
doesn't fire, the cause is downstream of dispatch (e.g. another
handler preventDefault'ing the submit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is the actual root cause of PR 785's office-hours CI flake.
It's a real SUT bug — not a test problem.

The /new page renders <HelpRequestForm> only when `isQueueOpen`,
which is derived from a realtime feed (active_help_queue_assignments).
The useEffect that triggers `router.replace(...)` away from /new
correctly guards on `isSubmittingRef.current` and `activeHelpQueue
Assignments === undefined`, but the early-return *render branch*
above it (the "Queue Closed for New Requests" card) didn't. Whenever
realtime briefly dropped the active-assignment row (re-subscribe,
cache invalidation, contention), the render branch unmounted the
form — sometimes between a user click on Submit Request and React
dispatching the synthetic onSubmit event.

Symptom in CI:
  - Playwright click() returns success (button is enabled)
  - Form's onSubmit handler never fires
  - No row in DB, no toast, no inline validation error
  - Test times out 3 minutes later

Why this surfaces in CI but not locally: on a contended CI runner
the realtime channel re-subscribe / cache flap happens more often
and lasts long enough to span the click-to-handler window. Locally
the window closes before the next flap.

Fix in app/.../new/page.tsx:

1. Latch the "open" signal. Once we've observed isQueueOpen=true at
   least once in this page mount, keep treating it as open even if
   realtime briefly says otherwise. This eliminates the flap-driven
   unmount entirely. Genuine queue closure still works on a fresh
   page mount.

2. Track `isSubmitting` as state (not just a ref). The ref was
   correctly read by the redirect useEffect but couldn't gate the
   render branch — refs don't trigger renders. Adding useState lets
   the render guard see the in-flight submission and keep the form
   mounted while onSubmit is still resolving.

3. Add the same `activeHelpQueueAssignments !== undefined` guard to
   the render branch that the useEffect already had. We don't show
   the "Queue Closed" card until realtime has actually loaded.

Test side: revert click({ force:true }) from bfe7c89 — the click
was never the problem; the form was being unmounted under it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
21dda31's "hasEverBeenOpen latch" approach didn't actually fix the
CI flake — local logs from that run still showed 5 of 6 office-hours
attempts never firing onSubmit. The latch only helped once
isQueueOpen had ever been true; if realtime's initial load returned
an empty array (or the queue's row hadn't propagated yet), isQueueOpen
started false, the latch stayed false, and the "Queue Closed" branch
unmounted the form before the student could click Submit.

Restructure the parent page so the "queue is closed" decision is
based only on the help_queue row's own `available` column (a stable
DB value), not on the active-staff realtime feed (a flapping signal).
The active-staff gate already lives in the form's submit-button
`disabled` state, which is the right place: it disables the button
while realtime data is in flux without removing the entire form
from the DOM.

This keeps the form mounted across all the failure modes we observed
(activeHelpQueueAssignments undefined, empty, briefly missing the
queue, etc.) so an in-flight click is never stranded on an
unmounted component. The "queue closed" UI still appears for the
intended case — when an instructor toggles helpQueue.available off.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
52c98d6 fixed the form-unmounted-mid-click race at the SUT level
(removed the realtime-based unmount from app/.../new/page.tsx). But
CI on that commit still showed only 1 of 6 office-hours attempts
firing OH-DEBUG — i.e. customOnFinish ran in only one attempt
despite the form now staying mounted across realtime blips.

The likely remaining vector: the Submit Request button's `disabled`
flips with `queueIdsWithActiveStaff` (a realtime-derived set inside
the form). The expect(toBeEnabled) check passes when the set has
the queue, but in the gap between toBeEnabled returning and the
synthetic click event being dispatched, realtime can briefly empty
the set again. Playwright's default click() then sits silently
waiting for the button to re-enable.

force:true skips that actionability re-check; the click event is
dispatched unconditionally. Since the form is now reliably mounted
(52c98d6) and onSubmit handles the disabled-state internally with
a toaster on validation failure, force-click is safe and gets us
out of the wait loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rors

This is the real fix for the CI flake on PR 785. Both prior SUT
attempts (latch-open hasEverBeenOpen, and removing realtime from the
parent's unmount check) addressed *only* the form-unmounted-mid-click
case. They didn't fix the dual case where the FORM stayed mounted
but the SUBMIT BUTTON itself flapped disabled-enabled-disabled
between Playwright's expect(toBeEnabled) and the click() dispatch.

The flap source: a useEffect in newRequestForm.tsx around line 379
sets `errors.help_queue = "not currently staffed"` whenever
`queueIdsWithActiveStaff` (realtime, useActiveHelpQueueAssignments)
doesn't include the current queue id. `hasErrors` then flips true,
the Submit Request button gets `disabled`, and a click that landed
during the off cycle is silently dropped (disabled buttons don't
dispatch click events even with Playwright's force:true). Under CI
parallelism this realtime channel re-subscribes a lot, so the gap
hits frequently enough to break 5 of 6 office-hours attempts.

Remove `hasErrors` from the submit button's `disabled` prop. The
form's own onSubmit guard at line 448 still rejects submissions
when the queue isn't staffed and toasts an error explaining why,
so we lose no validation — we just move it from "preemptively hide
the button" to "submit, validate, toast on failure". The error
message in the field below the Help Queue select still renders
(via `errorText={errors.help_queue?.message}`), so users still see
the explanation; they just have to actually press the button to
get the toast that confirms why submission failed.

Also drop the now-unused `hasErrors` declaration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous CI debug runs showed helpRequests.create returning id=34
via the browser's supabase client, but the test's admin client
later querying the same DB sees max id=11 (just the seed rows).
We're missing whether the row is actually being PERSISTED at create
time, or whether the API returns 201 with an id but the row
disappears between then and the test admin's query.

Add a direct fetch right after the optimistic create returns,
querying ?id=eq.{X}&select=id,class_id,request against the same
NEXT_PUBLIC_SUPABASE_URL the form just wrote to. Log the status
code and response body.

If the verify-by-id succeeds, the row IS in the DB at this moment;
something deletes it between this point and the test's query.
If it returns empty/404, the original POST's success was somehow
lossy and the row was never durably committed despite the 201.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous CI runs showed `helpRequests.create returned id=X` with the
right class_id in one attempt, but most attempts still didn't fire
onSubmit at all (no OH-DEBUG events). The button-state and unmount
races at the page/form level have been independently fixed in
52c98d6 and db434f7; the remaining vector is the click-to-event
plumbing itself. Under CI's React-re-render churn, a click() event
can land on a moment when the synthetic-event delivery is
interrupted.

`form.requestSubmit()` fires the form's submit event directly. React
sees the same event it would have from a button click, the form's
onSubmit handler runs, customOnFinish runs, the row gets created.
No actionability check, no flapping disabled button, no overlay
risk — just a deterministic submit dispatch.

The form's own validation guards still apply (selectedStudents,
queue staffed, etc.) and surface as toasts on failure, so we
don't lose any safety net.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chronic CI miss for "Student can request help" comes down to a
classic React 18 hydration race: under contention, our first submit
dispatch (button click OR form.requestSubmit) can land between the
form rendering and React attaching its synthetic-event onSubmit
listener. The DOM event fires, hits nothing, and the form sits there
unchanged.

When onSubmit DOES run, the very first thing it does is
`setIsSubmittingGuard(true)`, which transitions the Submit Request
button to `disabled` in the rendered DOM. That's an observable
state change. Re-dispatch requestSubmit every 500ms until we see
the button enter that state — or until the form unmounts entirely
(router.push has fired = test can proceed).

This sidesteps the hydration race without coupling to React
internals: any handler React eventually attaches will catch the
next dispatch. 30s total budget for the retry loop; each cycle
costs ~500ms + an evaluate round trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0a84f46's 30s retry-loop timed out without the button ever
becoming disabled, meaning onSubmit never attached during that
window. The CI debug logs confirm: in failing attempts, React
hydration of the form takes longer than 30s.

Bump the loop budget to 180s and alternate between Playwright's
click({ force: true }) (real synthetic mouse events) and
form.requestSubmit() (DOM submit event). Each cycle waits 750ms
for React to process the dispatch before deciding to retry. The
loop exits as soon as the button enters its disabled-loading
state OR the form unmounts entirely (router.push has fired).

Polling intervals ramp 200ms → 500ms → 1s → 2s → 3s so a quick
hydration succeeds fast and a slow one doesn't burn the budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jon-bell and others added 9 commits May 23, 2026 20:29
After 12+ iterations of test-side workarounds for the office-hours
CI flake (DB-driven fallbacks, force:true clicks, requestSubmit
dispatch, retry-loops detecting button-disabled flip), the failure
pattern is still: most attempts never fire the form's onSubmit
despite the click reaching the DOM.

Each test-side workaround introduced new state that was supposed to
help but didn't move the needle in CI. The SUT-side fixes in
52c98d6 (don't unmount form on realtime active-staff blip) and
db434f7 (don't pre-disable Submit button on flapping hasErrors)
are real production fixes and should land regardless.

Reset the test file to the staging baseline (commit 52059a2) so
the next CI run measures the SUT fixes alone, without my test-side
noise. If office-hours still flakes, we know the residual cause is
elsewhere (likely React hydration timing in CI per the React #418
errors we observed) and not the test-side instrumentation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
My round-3 commit (7096591) replaced loginAsUser's
waitForLoadState("networkidle") calls with a getByRole("main")
visibility wrapped in toPass. That was theorized to be faster on
contended CI by skipping unbounded networkidle waits, but the office-
hours CI flake started at exactly that commit, and 12+ subsequent
fixes haven't resolved it.

Restoring the original networkidle waits as a bisection step. If
office-hours passes after this, my loginAsUser change was hurting
some downstream tests by not stabilizing the page enough. If it
still fails, the regression is elsewhere (suspect: the visual-test
CSS injection in tests/global-setup.ts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d2b906a confirmed the office-hours CI flake was caused by my own
loginAsUser change in TestingUtils.ts, not by anything in the form
itself. With that reverted, CI passes office-hours on both chromium
and webkit. The console.log breadcrumbs that helped narrow it down
served their purpose and can come out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "Switching to GUI fails when YAML has both is_individual_grading
and is_assign_to_student" test waits for the YAML parse to settle by
a wall-clock sleep (1500ms originally, bumped to 3000ms earlier in
this PR). Under webkit CI contention, even 3s has been observed to
fall short — the first GUI click lands while the parse-debounce is
still pending, the toggle reads the stale empty YAML, accepts the
switch, and the source-region assertion fails.

Wrap the toggle + assertion in toPass with an explicit "ensure
source mode" step at the top of each retry. If an attempt fires
too early and the toggle accidentally succeeds, we observe the
failed assertion, switch back to source, and re-click GUI. Once
the parse-debounce has run and registered the mutex violation,
the toggle is correctly refused and the assertion sticks.

This removes the brittle dependency on a magic wall-clock window
without coupling to React internals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the office-hours new-request flow's 4-5 sequential browser-side
writes (help_requests, help_request_students, help_request_messages,
help_request_file_references, student_help_activity) with a single
SECURITY DEFINER Postgres RPC `create_help_request_with_participants`.

The legacy path left the door open for partially-created requests under
realtime backpressure (e.g. help_request row exists but students row
never lands), which was the root pattern behind the office-hours CI
flake on PR 785. The new RPC commits all child rows inside one
transaction or rolls everything back.

Authorization is enforced inside the function (not by caller-side RLS):
- caller must be in user_roles for the queue's class (active)
- queue must be available or is_demo, and have active staff (unless demo)
- every participant profile id must be an active class member
- caller's own private_profile_id must be in the participants list
- solo-request uniqueness per (queue, creator, privacy) bucket
- optional template/submission must belong to the same class

`SET search_path TO ''` + fully-qualified references, EXECUTE granted to
authenticated/service_role, REVOKE'd from PUBLIC.

Regenerated SupabaseTypes.d.ts so the form's `supabase.rpc(...)` call
type-checks against the new function signature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ero on fail

Addresses PR 785 review feedback on scripts/e2e-sweep.sh:

- Reject non-positive-integer iterations up front instead of letting
  `seq 1 abc` silently iterate zero times and still write a "success"
  sentinel. Replaces `for i in $(seq 1 "$n")` with `for ((i=1; i<=n; i++))`.
- Stop swallowing the GitHub circuit-breaker reset's stderr+exit status.
  Failures now append to the iteration's log AND emit a warn line to the
  summary so environment drift is visible.
- Track Playwright exit codes across iterations. The completion sentinel
  `.e2e-sweep.done` now records the number of failed iterations rather
  than a constant "0", and the script itself exits nonzero when any
  iteration failed so CI/automation can branch on real outcomes.

CodeRabbit comments addressed: discussion_r3292072559, r3292072562, and
the aggregate-status nit from review 4349553901. The office-hours.test
and newRequestForm comments from the same review pass were already
resolved (the DB-fallback poll was reverted in d2b906a, and the
multi-write race in newRequestForm is now an atomic RPC in 9ead17a).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`npm run client-local` writes SupabaseTypes.d.ts with `supabase` CLI's
own formatter, which disagrees with our Prettier config. CI's
`prettier --check` step fails until we re-format. The file content is
unchanged — this is pure whitespace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chat page at /office-hours/{queue}/{request_id} renders
"Request not found." until useHelpRequests() (the realtime-backed
cache) contains the new help_requests row. The legacy
helpRequests.create(...) path pushed the row into that cache
synchronously via TableController's optimistic-add — so the chat page
found it the moment it mounted.

The SECURITY DEFINER RPC introduced in 9ead17a bypasses the
controller, so the row only arrives via realtime broadcast. On slow CI
runs that delivery can lag router.push by several seconds, during which
"Your position in the queue" never renders and the E2E times out
(observed on PR 785's e2e-local job: 3 retries × chromium + webkit, all
failing at office-hours.test.tsx:135's toBeVisible).

Fix: after the RPC succeeds, await controller.helpRequests.invalidate
(id), which does a direct REST fetch-by-id and writes the row into the
controller's cache. Errors are swallowed — realtime is still
authoritative, this just removes the empty-cache flash that creates
the navigation race.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fix

Each test passed 10x locally on chromium and was flaking only on CI's
slower runners; the failures clustered around the same realtime /
initial-load timing patterns I fixed in the office-hours flow. Same
remediations applied:

1. discussion_thread.tsx — reply auto-follow race
   Replying to a thread inserts a discussion_thread_watchers row via a
   server-side trigger. The Follow→Unfollow button reads that row from
   a TableController, and realtime delivery of the new watcher row can
   lag the reply create on slow CI runs, leaving the button on "Follow"
   past the test's 20s timeout. After the reply create resolves, warm
   the watchers controller cache directly with getOneByFilters so the
   UI flips without waiting on the broadcast.

2. discussion-threads.test.tsx — N-threads badge initial-fetch wait
   The "N thread(s)" badge on the topics management page is computed
   from useDiscussionThreadTeasers() — the discussionThreadTeasers
   TableController's initial query against the full discussion_threads
   table for the class. The topic list paints from a smaller
   controller so topic rows are visible well before threads load. The
   pre-existing 10s override on this assertion turned out to be too
   tight when CI ran the full seed under load. Drop the override and
   use Playwright's default expect timeout so the slow initial fetch
   doesn't surface as a flake.

3. grading.test.tsx — webkit autograder waitFor timeout
   The "Students can view their grading results" test does magic-link
   login + assignment navigation + axe scan + rubric assertions inside
   a 120s budget. On webkit under CI load, loginAsUser's GoTrue retry
   loop alone can spend most of that budget, leaving the "Lint Results:
   Passed" waitFor with no headroom. Apply test.slow() at the top of
   the test (same remediation we use for office-hours.test.tsx:106) to
   triple the active timeout — purely a budget adjustment, not a
   suppression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
tests/e2e/discussion-threads.test.tsx (1)

292-302: ⚡ Quick win

Inconsistent timeout strategy across identical wait patterns.

The comment states "Use the default expect() timeout" and line 302 has no explicit timeout, but lines 336, 358, and 401 use the same page.getByText(/^\d+ threads?$/).first()).toBeVisible() pattern with explicit { timeout: 10_000 }. This mixed approach can confuse maintainers about which timeout strategy is preferred for this async badge rendering.

Consider unifying: either remove explicit timeouts from lines 336, 358, 401 to match line 302, or add back the explicit timeout at line 302 with a comment explaining when to use default vs. explicit timeouts.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/discussion-threads.test.tsx` around lines 292 - 302, The test uses
page.getByText(/^\d+ threads?$/).first()).toBeVisible() in multiple places with
mixed timeout strategies; since the comment at the first occurrence mandates
"Use the default expect() timeout", remove the explicit { timeout: 10_000 }
argument from the other identical calls (the ones that call page.getByText(/^\d+
threads?$/).first()).toBeVisible() later in the file) so every occurrence uses
the default expect timeout and the behavior is consistent with the comment.
supabase/migrations/20260524000944_create_help_request_with_participants_rpc.sql (2)

204-208: 💤 Low value

Consider bulk INSERT via unnest() instead of FOREACH loops.

The two FOREACH loops (participant memberships and activity logs) each perform one INSERT per iteration. For small participant lists this is fine, but a single INSERT ... SELECT FROM unnest(...) would reduce round-trips inside the function:

♻️ Example bulk-insert refactor for help_request_students
-    FOREACH v_student IN ARRAY p_student_profile_ids LOOP
-        INSERT INTO public.help_request_students (help_request_id, profile_id, class_id)
-        VALUES (v_help_request_id, v_student, v_class_id);
-    END LOOP;
+    INSERT INTO public.help_request_students (help_request_id, profile_id, class_id)
+    SELECT v_help_request_id, pid, v_class_id
+      FROM unnest(p_student_profile_ids) AS pid;

Same pattern applies to the student_help_activity insert. This is a minor optimization—feel free to defer if the current approach is clearer for your team.

Also applies to: 239-251

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@supabase/migrations/20260524000944_create_help_request_with_participants_rpc.sql`
around lines 204 - 208, The FOREACH loops inserting into help_request_students
and student_help_activity should be replaced with bulk INSERTs using unnest to
avoid per-row inserts: for help_request_students, replace the loop over
p_student_profile_ids (and variables v_help_request_id, v_class_id) with a
single INSERT ... SELECT FROM unnest(p_student_profile_ids) that selects
v_help_request_id, each profile_id and v_class_id; apply the same pattern for
the student_help_activity insert (use unnest of p_student_profile_ids and
include the appropriate columns such as help_request_id, profile_id,
activity_type/timestamps using the existing variables like v_help_request_id and
any constants used in the FOREACH loop) so one statement inserts all rows in
bulk.

148-170: 💤 Low value

Potential TOCTOU race on solo-request uniqueness check.

The SELECT counting existing solo requests and the subsequent INSERT are separate operations without row-level locking. Two concurrent calls could both see v_existing_solo_count = 0 and both proceed to insert, resulting in duplicate solo requests.

Since this mirrors the legacy client-side check (which had the same race window), and the consequence is merely a duplicate help request rather than a security breach, this is low-risk. However, if you want to tighten it up later, consider either:

  1. Adding a partial unique index on (help_queue, created_by, is_private) WHERE status IN ('open','in_progress') and catching the unique-violation error, or
  2. Using SELECT ... FOR UPDATE on the queue row to serialize solo-request creation per queue.

For now this is acceptable given the legacy parity and low blast radius.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@supabase/migrations/20260524000944_create_help_request_with_participants_rpc.sql`
around lines 148 - 170, The current solo-request uniqueness check (SELECT ...
INTO v_existing_solo_count from public.help_requests using p_help_queue_id,
v_caller_private_profile, p_is_private and the NOT EXISTS against
public.help_request_students) is vulnerable to a TOCTOU race where two
concurrent calls may both see zero and insert duplicates; to fix, either add a
partial unique index on (help_queue, created_by, is_private) WHERE status IN
('open','in_progress') and handle the duplicate-key error on INSERT, or
serialize creation by locking the queue row with SELECT ... FOR UPDATE before
running the v_existing_solo_count check (or both); reference symbols:
p_student_profile_ids, v_caller_private_profile, v_existing_solo_count,
public.help_requests, public.help_request_students, and the RAISE EXCEPTION path
to ensure duplicates are detected and handled.
app/course/[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx (1)

538-538: 💤 Low value

Minor: as never type assertion bypasses type safety.

The cast works around TypeScript's inference with Supabase's Json type, but it silences potential mismatches. Consider defining a local type for the file-reference shape and using as Json or satisfies for slightly better guardrails. Low priority since the payload structure is validated by the RPC anyway.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/course/`[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx at
line 538, The p_file_references: fileReferencesPayload is being asserted with
`as never`, which bypasses TypeScript safety; define a local type (e.g.,
FileReference or FileReferencePayload) that matches the expected shape of each
entry, ensure fileReferencesPayload is typed as that array, and then cast it to
Supabase Json (or use the `satisfies` operator) when passing to the RPC instead
of `as never` so you retain type-checking while satisfying Supabase's Json
requirement in newRequestForm.tsx (target symbols: p_file_references,
fileReferencesPayload).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/course/`[course_id]/discussion/discussion_thread.tsx:
- Around line 68-82: The composer close currently awaits the best-effort watcher
warm-up (courseController.discussionThreadWatchers.getOneByFilters), which can
stall the UI; move the call to setVisible(false) so the composer is closed
immediately and then invoke discussionThreadWatchers.getOneByFilters in the
background (fire-and-forget) without awaiting it — ensure you still catch/log
errors from the background promise; refer to setVisible(false),
courseController.discussionThreadWatchers.getOneByFilters, and
thread.root/thread.id to locate and modify the code.

---

Nitpick comments:
In `@app/course/`[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx:
- Line 538: The p_file_references: fileReferencesPayload is being asserted with
`as never`, which bypasses TypeScript safety; define a local type (e.g.,
FileReference or FileReferencePayload) that matches the expected shape of each
entry, ensure fileReferencesPayload is typed as that array, and then cast it to
Supabase Json (or use the `satisfies` operator) when passing to the RPC instead
of `as never` so you retain type-checking while satisfying Supabase's Json
requirement in newRequestForm.tsx (target symbols: p_file_references,
fileReferencesPayload).

In
`@supabase/migrations/20260524000944_create_help_request_with_participants_rpc.sql`:
- Around line 204-208: The FOREACH loops inserting into help_request_students
and student_help_activity should be replaced with bulk INSERTs using unnest to
avoid per-row inserts: for help_request_students, replace the loop over
p_student_profile_ids (and variables v_help_request_id, v_class_id) with a
single INSERT ... SELECT FROM unnest(p_student_profile_ids) that selects
v_help_request_id, each profile_id and v_class_id; apply the same pattern for
the student_help_activity insert (use unnest of p_student_profile_ids and
include the appropriate columns such as help_request_id, profile_id,
activity_type/timestamps using the existing variables like v_help_request_id and
any constants used in the FOREACH loop) so one statement inserts all rows in
bulk.
- Around line 148-170: The current solo-request uniqueness check (SELECT ...
INTO v_existing_solo_count from public.help_requests using p_help_queue_id,
v_caller_private_profile, p_is_private and the NOT EXISTS against
public.help_request_students) is vulnerable to a TOCTOU race where two
concurrent calls may both see zero and insert duplicates; to fix, either add a
partial unique index on (help_queue, created_by, is_private) WHERE status IN
('open','in_progress') and handle the duplicate-key error on INSERT, or
serialize creation by locking the queue row with SELECT ... FOR UPDATE before
running the v_existing_solo_count check (or both); reference symbols:
p_student_profile_ids, v_caller_private_profile, v_existing_solo_count,
public.help_requests, public.help_request_students, and the RAISE EXCEPTION path
to ensure duplicates are detected and handled.

In `@tests/e2e/discussion-threads.test.tsx`:
- Around line 292-302: The test uses page.getByText(/^\d+
threads?$/).first()).toBeVisible() in multiple places with mixed timeout
strategies; since the comment at the first occurrence mandates "Use the default
expect() timeout", remove the explicit { timeout: 10_000 } argument from the
other identical calls (the ones that call page.getByText(/^\d+
threads?$/).first()).toBeVisible() later in the file) so every occurrence uses
the default expect timeout and the behavior is consistent with the comment.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5d799c69-25e5-4a85-b2a9-099cbb251a86

📥 Commits

Reviewing files that changed from the base of the PR and between ce8b02b and 0092b4f.

📒 Files selected for processing (8)
  • app/course/[course_id]/discussion/discussion_thread.tsx
  • app/course/[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx
  • scripts/e2e-sweep.sh
  • supabase/functions/_shared/SupabaseTypes.d.ts
  • supabase/migrations/20260524000944_create_help_request_with_participants_rpc.sql
  • tests/e2e/discussion-threads.test.tsx
  • tests/e2e/grading.test.tsx
  • utils/supabase/SupabaseTypes.d.ts
✅ Files skipped from review due to trivial changes (1)
  • utils/supabase/SupabaseTypes.d.ts

Comment thread app/course/[course_id]/discussion/discussion_thread.tsx Outdated
The watcher-cache warm-up added in 0092b4f was awaited before
setVisible(false), which means a slow or failing REST round-trip
stalls the composer's close transition even though the fetch is
explicitly best-effort. Close the composer first, then fire-and-forget
the warm-up — realtime remains the authoritative path either way.

Addresses CodeRabbit comment r3293749446 on PR 785.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 24, 2026
CI's prettier --check flagged the multi-line getOneByFilters array as a
style violation. Collapse to one line; no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jon-bell and others added 2 commits May 24, 2026 12:42
The await on controller.helpRequests.invalidate(createdHelpRequestId)
added in 5ae7679 was supposed to be a best-effort cache warm, but
when the supabase REST round-trip is slow or stuck under CI load the
await never resolves, blocking router.push and stranding the student
on the form (observed as a 180s page.waitForURL timeout at
office-hours.test.tsx:134 on PR 785's run for commit 74bf169).

Race the invalidate against a 2s timeout: if the row arrives in time
we navigate with a warm cache, otherwise we navigate anyway and let
realtime broadcast catch up. Either way the chat page eventually
finds the row — the cache warm-up was always a flash-removal
optimization, not a correctness requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-on-mount)

Previous approach (5ae7679 + 9f69e4e) warmed the help_requests
controller cache from the new-request form by awaiting
controller.helpRequests.invalidate(id) before router.push, then raced
that await against a 2s timeout so a slow REST round-trip couldn't
stall navigation. The timeout-race fixed the hang but was a hack —
the form was reaching across an abstraction boundary to repair the
chat page's empty-cache flash.

Move the burden to the chat page where it belongs. The chat page
(/office-hours/[queue_id]/[request_id]) now:

- distinguishes "still loading" from "definitively missing" via a
  LoadState ("pending" | "loaded" | "not_found")
- on mount, if the row isn't in cache, kicks off a single-row
  invalidate(requestId) — the same REST fetch the form used to do
- renders a Spinner while pending, and only renders "Request not
  found." once the fetch resolves and the row is still absent
- flips to "loaded" as soon as the controller cache picks up the row
  (whether the invalidate or a realtime broadcast wins)

The form's customOnFinish goes back to just navigating immediately
after the RPC succeeds — no race, no timeout, no hack. The chat page
is now robust to any source of the row (RPC navigation, deep link,
manual URL paste, realtime delivery lag) without coupling form
behavior to chat-page rendering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/course/`[course_id]/office-hours/[queue_id]/[request_id]/page.tsx:
- Around line 45-54: The useEffect handling help request invalidation only sets
loadState to "not_found" when controller.helpRequests.invalidate(requestIdNum)
rejects, so if invalidate resolves but the request row is still absent the
component stays "pending"; update the effect to, after invalidate resolves,
check the current request (the same variable or ref used—e.g., request) and call
setLoadState("not_found") if it is still falsy (respecting the cancelled flag),
and also ensure the cancelled flag is honored in both the resolve and reject
branches to avoid setting state after unmount; keep references to useEffect,
request, controller.helpRequests.invalidate, requestIdNum, setLoadState and
cancelled to locate the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b910069d-7d39-472f-876d-c9c5d6f13dcf

📥 Commits

Reviewing files that changed from the base of the PR and between 80d5663 and 4b183ca.

📒 Files selected for processing (3)
  • app/course/[course_id]/discussion/discussion_thread.tsx
  • app/course/[course_id]/office-hours/[queue_id]/[request_id]/page.tsx
  • app/course/[course_id]/office-hours/[queue_id]/new/newRequestForm.tsx

Comment thread app/course/[course_id]/office-hours/[queue_id]/[request_id]/page.tsx Outdated
CodeRabbit (r3294758975) flagged that the chat page's load-state
machine only flips to "not_found" when invalidate(id) rejects. But
invalidate also resolves successfully when the row really doesn't
exist — for instance when single() returns data:null without an
explicit error (RLS filtering, hard-deleted row). In that case the
page stayed on "pending" forever, showing the spinner indefinitely.

Add a then() branch that checks whether the row landed in cache after
invalidate resolved. If yes, do nothing — the controller's listener
fires the next render with `request` truthy and the existing
`if (request)` branch sets "loaded". If no, the row genuinely isn't
available to this user, so flip to "not_found" and render the error.

We deliberately use controller.helpRequests.getById(id).data (cache
read, no fetch) rather than re-checking the closed-over `request`
prop because `request` is from the previous render and won't reflect
the just-written cache entry. The cancelled guard remains on both
branches so we don't setState after unmount.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/course/`[course_id]/office-hours/[queue_id]/[request_id]/page.tsx:
- Around line 68-69: The current catch on the invalidate() promise treats any
rejection as a "not_found" case, which masks transient failures; update the
invalidate() handling so that when invalidate() resolves to an empty/missing
result you setLoadState("not_found") (using the resolved value/path that
indicates an empty cache), but when invalidate() rejects (the catch block)
setLoadState("error" or "retry") and log or surface the actual error; refer to
the invalidate() call, the existing .catch handler, and the cancelled boolean in
page.tsx to implement distinct branches: keep cancelled checks, only set
"not_found" on the explicit empty-success path and use a separate error state on
rejection.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 94df3539-9fea-460e-9bcc-ddc942daac30

📥 Commits

Reviewing files that changed from the base of the PR and between 4b183ca and ab8589f.

📒 Files selected for processing (1)
  • app/course/[course_id]/office-hours/[queue_id]/[request_id]/page.tsx

Comment thread app/course/[course_id]/office-hours/[queue_id]/[request_id]/page.tsx Outdated
CodeRabbit (r3294881308) flagged that the previous reject-path
handler treated every invalidate() rejection as "not_found", which is
misleading when the failure is transient (network blip, auth glitch,
5xx). The row may exist; we just couldn't fetch it.

Split the load-state machine: keep "not_found" for the successful
empty-cache path (single() returned no data) and add a new "error"
state for the reject path. Render a distinct user-actionable message
for "error" so the user knows to retry instead of believing the
request was deleted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jon-bell jon-bell merged commit 6605719 into staging May 24, 2026
12 checks passed
@jon-bell jon-bell deleted the stabilize-visual-regression-round3-iter branch May 24, 2026 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant