
fix: wait for async eval completion with polling #60

Merged: trieloff merged 3 commits into main from fix-eval-polling on Apr 10, 2026

@trieloff
Contributor

Problem

tessl eval run is asynchronous: it starts the eval server-side and returns immediately with a run ID. The workflow called tessl eval view --last one second later, before the run had even registered, got 'No eval runs found', and treated that as success.

Additionally, the workflow used echo -e with \\n string concatenation for summary rows, which produces literal backslash-n in the GitHub step summary (same bug fixed in tessl-review.yml via #59).

Fix

  1. Poll for completion: After tessl eval run, extract the run ID (UUID) from stdout, then poll tessl eval view <run-id> --json every 30 seconds, checking the status field until it's completed or failed (timeout after 15 minutes).

  2. Use specific run ID: Replace tessl eval view --last with tessl eval view $RUN_ID for reliable result retrieval.

  3. Fix summary formatting: Use temp files (mktemp) instead of echo -e / backslash-n string concatenation, matching the pattern from #59 (Fix step summary table formatting in tessl-review workflow).
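
The polling step in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from the workflow: the names poll_until_done, read_status, POLL_INTERVAL and POLL_TIMEOUT are hypothetical, and the JSON path data.attributes.status is assumed from the python one-liner quoted in the review thread.

```shell
# read_status: print the status field from JSON on stdin, or "unknown" if the
# input is not JSON or lacks the field.
# Assumption: the status lives at data.attributes.status.
read_status() {
  python3 -c "import sys, json; print(json.load(sys.stdin)['data']['attributes']['status'])" 2>/dev/null \
    || echo "unknown"
}

# poll_until_done CMD [ARGS...]: run CMD repeatedly until it reports a
# terminal status. Returns 0 on completed, 1 on failed, 2 on timeout.
poll_until_done() {
  local interval="${POLL_INTERVAL:-30}"   # seconds between polls
  local timeout="${POLL_TIMEOUT:-900}"    # give up after 15 minutes
  local elapsed=0
  local status
  while [ "$elapsed" -lt "$timeout" ]; do
    # Only stdout is parsed; stderr is discarded so it cannot corrupt the JSON.
    status=$("$@" 2>/dev/null | read_status)
    case "$status" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 2
}
```

In the workflow this would be invoked along the lines of poll_until_done tessl eval view "$RUN_ID" --json. Any status other than completed or failed, including unparseable output, is treated as "not finished yet", so a persistent error only surfaces when the timeout expires.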

Changes

  • .github/workflows/tessl-eval.yml: Replaced eval + view logic with polling pattern

- Extract run ID from tessl eval run output
- Poll tessl eval view <run-id> --json every 30s until completed/failed
- Timeout after 15 minutes
- Use temp files instead of echo -e for summary rows
- View results using specific run ID instead of --last
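
As a rough sketch of the run ID extraction listed above: the exact output format of tessl eval run is not shown in this PR, so the example only assumes that a standard 8-4-4-4-12 hexadecimal UUID appears somewhere in it. The function name extract_run_id is hypothetical.

```shell
# extract_run_id: print the first UUID found on stdin.
# Exits non-zero when no UUID is present, so the caller can fail fast
# instead of polling with an empty run ID.
extract_run_id() {
  grep -Eo '[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}' \
    | head -n 1 \
    | grep .   # grep . fails on empty input, which sets the pipeline's exit status
}
```

The function would be fed the captured stdout of tessl eval run. Checking its exit status before polling avoids the original failure mode where a missing run was silently treated as success.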

Signed-off-by: Lars Trieloff <lars@trieloff.net>
@trieloff requested a review from shsteimer as a code owner on April 10, 2026 11:33
Copilot AI review requested due to automatic review settings April 10, 2026 11:33

Copilot AI left a comment


Pull request overview

Updates the tessl-eval GitHub Actions workflow to reliably wait for asynchronous tessl eval run executions to finish before fetching results, and fixes step summary table formatting.

Changes:

  • Extracts the eval run ID from tessl eval run output and polls tessl eval view <run-id> --json until completion/failure (with timeout).
  • Switches result retrieval from tessl eval view --last to tessl eval view $RUN_ID.
  • Replaces echo -e/\n concatenation with temp files for GitHub Step Summary rows and error details.
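
The temp-file approach to summary rows might look something like the sketch below. The helper names add_row and write_summary and the variable SUMMARY_ROWS are hypothetical, and the table columns are made up for illustration. GITHUB_STEP_SUMMARY is the file path that GitHub Actions provides for step summaries.

```shell
# Rows are appended to a temp file, one per line, instead of being
# concatenated into a string with escaped newlines.
SUMMARY_ROWS=$(mktemp)

# add_row NAME RESULT: append one Markdown table row.
add_row() {
  # The format string supplies a real newline; the data is passed via %s,
  # so backslashes inside it are written through unchanged.
  printf '| %s | %s |\n' "$1" "$2" >> "$SUMMARY_ROWS"
}

# write_summary: emit the table header followed by the collected rows.
# Falls back to stdout when not running under GitHub Actions.
write_summary() {
  local target="${GITHUB_STEP_SUMMARY:-/dev/stdout}"
  {
    printf '| Tile | Result |\n'
    printf '|------|--------|\n'
    cat "$SUMMARY_ROWS"
  } >> "$target"
}
```

Because no escape sequence ever has to be interpreted at output time, the result does not depend on how echo handles backslashes in a given shell.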


Comment thread .github/workflows/tessl-eval.yml Outdated
Comment on lines +173 to +175
STATUS_OUTPUT=$(tessl eval view "$RUN_ID" --json 2>&1) || true
EVAL_STATUS=$(echo "$STATUS_OUTPUT" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['attributes']['status'])" 2>/dev/null) || EVAL_STATUS="unknown"


Copilot AI Apr 10, 2026


STATUS_OUTPUT=$(tessl eval view "$RUN_ID" --json 2>&1) mixes stderr into stdout, which can easily break JSON parsing (and currently causes EVAL_STATUS to stay unknown until timeout). Capture JSON from stdout only, and handle non-zero exit codes / non-JSON responses explicitly (e.g., check exit status; treat auth/invalid run-id as immediate failure, and only retry on transient “not found yet” cases).

Suggested change
- STATUS_OUTPUT=$(tessl eval view "$RUN_ID" --json 2>&1) || true
- EVAL_STATUS=$(echo "$STATUS_OUTPUT" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['attributes']['status'])" 2>/dev/null) || EVAL_STATUS="unknown"
+ STATUS_STDOUT_FILE=$(mktemp)
+ STATUS_STDERR_FILE=$(mktemp)
+ VIEW_EXIT_CODE=0
+ tessl eval view "$RUN_ID" --json >"$STATUS_STDOUT_FILE" 2>"$STATUS_STDERR_FILE" || VIEW_EXIT_CODE=$?
+ if [ "$VIEW_EXIT_CODE" -eq 0 ]; then
+   EVAL_STATUS=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1]))['data']['attributes']['status'])" "$STATUS_STDOUT_FILE" 2>/dev/null) || EVAL_STATUS="unknown"
+   if [ "$EVAL_STATUS" = "unknown" ]; then
+     echo " [$TILE_NAME] Poll at ${ELAPSED}s: received non-JSON or unexpected JSON from tessl eval view"
+   fi
+ else
+   STATUS_ERROR=$(cat "$STATUS_STDERR_FILE")
+   echo " [$TILE_NAME] Poll at ${ELAPSED}s: tessl eval view failed (exit code $VIEW_EXIT_CODE): $STATUS_ERROR"
+   if echo "$STATUS_ERROR" | grep -Eqi 'not[[:space:]-]*found|no such run|run.*not.*found'; then
+     EVAL_STATUS="unknown"
+   else
+     EVAL_STATUS="failed"
+   fi
+ fi
+ rm -f "$STATUS_STDOUT_FILE" "$STATUS_STDERR_FILE"

Agent-Id: agent-bd1dac50-8d8b-4d9a-8b1c-759c699a5f04
Linked-Note-Id: e750c364-0856-4558-bf2a-a1a9b8a59695
Agent-Id: agent-9880f236-a0b7-429f-8ca4-2336bf96bef0
Linked-Note-Id: f3d54d02-2e83-4bad-bb72-86e06cc75d07
@trieloff merged commit fccf6a7 into main on Apr 10, 2026
8 checks passed