chore(webarena-verified): bump registry entry to 1.1.0 #64
Sync the registry entry to the latest published webarena-verified-cube release (1.1.0 on PyPI). The version field must match the published PyPI version.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>
This is a packaging/compat fix in cube-browser-tool ↔ cube-standard + a cube republish, not a registry change. Holding this PR until a fresh
…he default-config import fix)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>

…a 1.1.0) Sync displayed version to the published cubes. (webarena handled separately in #64.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>
Entry review —

| Check | Result |
|---|---|
| description_matches_package | ✅ pass |
| authors_consistent_with_git | |
| no_id_squat_vs_existing | ✅ pass |
| no_brand_impersonation | ✅ pass |
| wrapper_license_plausible | ✅ pass |

Notes:
Entry is for an update/existing id webarena-verified (already in registry IDs list), so not a new-id squat. README confirms cube-harness wraps WebArena among supported benchmarks ("MiniWob, WebArena, OSWorld") — description of 812 tasks / 6 web platforms is consistent with the published WebArena benchmark (arxiv 2406.11955 is the linked paper). No brand impersonation: WebArena Verified is a faithful-port style name; benchmark_license source_url points to ServiceNow/webarena-verified, plausible upstream.
Could not verify: PyPI page empty (package may be unpublished — no PyPI long description or license field to cross-check). Author git history for the cubes/webarena-verified subdirectory not directly readable here; handles (NicolasAG, younik, recursix, manuel-delverme) are plausible ServiceNow/cube-harness contributors but unverified against commit history. known-authors.yaml has no entry for this id.
wrapper_license MIT is a valid SPDX id; benchmark_license reported Apache-2.0 with verified_by_original_authors:false — could not fetch the linked LICENSE file but the claim is internally consistent. No fail-level evidence; unverified items do not block judging the core claims (this is an existing-id update with matching scope).
Syncs the registry entry version to the latest published `webarena-verified-cube` release (`1.1.0` on PyPI; the entry was at `1.0.0`). The schema requires `version` to match the published PyPI version, and quick-check installs `package==version`. Validated against `registry-schema.json` locally.
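
The version-sync rule described above could be checked with something like the following. This is a minimal sketch, not the registry's actual tooling: the entry is assumed to be a dict with `package` and `version` keys (the `package` key name is an assumption), and the helper names `latest_pypi_version`, `version_in_sync`, and `pin_spec` are hypothetical. The only external interface used is PyPI's public JSON endpoint, `https://pypi.org/pypi/<name>/json`.

```python
import json
from urllib.request import urlopen


def latest_pypi_version(package: str) -> str:
    """Return the latest published version reported by PyPI's JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["info"]["version"]


def version_in_sync(entry: dict, published: str) -> bool:
    """True when the entry's version equals the published PyPI version."""
    return entry.get("version") == published


def pin_spec(entry: dict) -> str:
    """Build the exact pin that a `package==version` install would use."""
    # Assumption: the entry stores the PyPI distribution name under "package".
    return f"{entry['package']}=={entry['version']}"


if __name__ == "__main__":
    # Hypothetical entry mirroring the values mentioned in this PR.
    entry = {"package": "webarena-verified-cube", "version": "1.1.0"}
    published = latest_pypi_version(entry["package"])  # requires network access
    status = "in sync" if version_in_sync(entry, published) else "out of sync"
    print(f"{pin_spec(entry)} vs PyPI {published}: {status}")
```

Keeping the comparison separate from the network call means the check can be exercised offline with a known version string, while the fetch is only needed when validating against the live index.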