feat: add waa entry (0.1.1)#70

Merged
recursix merged 4 commits into main from add/waa-entry
Jun 19, 2026

Conversation

@recursix

Collaborator

Adds waa-cube 0.1.1 (WindowsAgentArena, 152 tasks) to the catalog. Supersedes fork PR #65: fork PRs can't receive the commit-back of CI-derived fields, so this PR uses a main-repo branch instead. waa-cube 0.1.1 is now published via OIDC.

🤖 Generated with Claude Code

Cube Harness and others added 4 commits June 19, 2026 12:47
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>
@recursix recursix merged commit 5791411 into main Jun 19, 2026
1 check failed
@recursix recursix deleted the add/waa-entry branch June 19, 2026 17:01
recursix added a commit that referenced this pull request Jun 19, 2026
The entry-review job crashed with `TypeError: string indices must be
integers` when the model returned `checks` as a string instead of the
documented object (Anthropic tool-use does not hard-validate tool inputs
against the schema). The crash failed the workflow step, which skipped
both auto-merge and request-review, leaving the PR with a red check and
no feedback — observed on PR #70 (waa entry).

Add normalize_verdict() at the call_claude boundary so it always returns
a well-formed {verdict, checks, notes} dict: invalid/missing check values
default to "unverified", and an unparseable structure downgrades the
verdict to UNKNOWN (route to manual review, never auto-merge a result we
couldn't read). Mirrors the existing UNKNOWN fallback for API errors.

Signed-off-by: Alexandre Lacoste <alex.lacoste.shmu@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
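
The normalization described in the commit message above could look roughly like the sketch below. It is not the code from the commit. The verdict names other than `UNKNOWN`, the allowed check values other than `"unverified"`, and the `EXPECTED_CHECKS` list are all hypothetical, as is the choice to treat a non-dict `checks` value as unparseable rather than trying to decode it.

```python
# Hypothetical sketch of normalize_verdict(); names marked "assumed" are not from the PR.
EXPECTED_CHECKS = ("license", "metadata", "version")         # assumed check names
VALID_CHECK_VALUES = {"pass", "fail", "unverified"}          # only "unverified" is from the commit
VALID_VERDICTS = {"APPROVE", "REQUEST_CHANGES", "UNKNOWN"}   # only "UNKNOWN" is from the commit


def normalize_verdict(raw):
    """Return a well-formed {verdict, checks, notes} dict for any model output."""
    all_unverified = {name: "unverified" for name in EXPECTED_CHECKS}

    # The whole payload is unreadable: route to manual review.
    if not isinstance(raw, dict):
        return {"verdict": "UNKNOWN", "checks": all_unverified,
                "notes": "model output was not an object"}

    notes = raw.get("notes")
    notes = notes if isinstance(notes, str) else ""

    checks = raw.get("checks")
    # This is the case that crashed the job: `checks` arrived as a string,
    # so indexing it with a check name raised TypeError.
    if not isinstance(checks, dict):
        return {"verdict": "UNKNOWN", "checks": all_unverified,
                "notes": notes or "checks field was not an object"}

    # Invalid or missing check values default to "unverified".
    clean = {}
    for name in EXPECTED_CHECKS:
        value = checks.get(name)
        is_valid = isinstance(value, str) and value in VALID_CHECK_VALUES
        clean[name] = value if is_valid else "unverified"

    verdict = raw.get("verdict")
    if not isinstance(verdict, str) or verdict not in VALID_VERDICTS:
        verdict = "UNKNOWN"

    return {"verdict": verdict, "checks": clean, "notes": notes}
```

Under these assumptions, a payload such as `{"verdict": "APPROVE", "checks": "all good"}` comes back with verdict `UNKNOWN`, so the workflow can request manual review instead of crashing or auto-merging.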