feat: add swegym entry (full SWE-Gym 2438 + official lite subset) #63

Merged: recursix merged 4 commits into main from add/swegym-entry on Jun 18, 2026

@recursix

Collaborator

Adds the swegym registry entry: the full SWE-Gym benchmark (2438 tasks), with the 230-task lite split as an official named subset (cube-harness#504, package swegym-cube).

Submitted as a same-repo branch (I have admin) rather than from a fork, so that quick-compliance can push the CI-derived fields back. The fork variant (#62) passes compliance but fails the write-back step, because GITHUB_TOKEN can't push to a fork branch. #62 will be closed as superseded.
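
For illustration, a minimal sketch of how a workflow step could tell a fork PR from a same-repo PR before attempting a write-back. It reads the `pull_request.head.repo.full_name` and `pull_request.base.repo.full_name` fields of the GitHub `pull_request` event payload. Whether quick-compliance actually performs such a check is an assumption, and the function name and repository names below are hypothetical.

```python
def is_fork_pr(event: dict) -> bool:
    """Return True when the PR head branch lives in a different repo than the base."""
    pr = event["pull_request"]
    head_repo = pr["head"]["repo"]
    # head.repo is null in the payload when the source fork has been deleted;
    # treat that case as a fork, since there is no same-repo branch to push to.
    if head_repo is None:
        return True
    return head_repo["full_name"] != pr["base"]["repo"]["full_name"]


# Hypothetical payloads, trimmed to the fields the function reads.
same_repo = {"pull_request": {
    "head": {"repo": {"full_name": "example-org/registry"}},
    "base": {"repo": {"full_name": "example-org/registry"}},
}}
from_fork = {"pull_request": {
    "head": {"repo": {"full_name": "some-user/registry"}},
    "base": {"repo": {"full_name": "example-org/registry"}},
}}
```

A write-back step could be skipped, or reported as informational, when `is_fork_pr` returns True.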

Validated locally + in CI on #62: ✅ Schema valid, ✅ swegym-cube==0.1.0 installed, ✅ SWEGymBenchmarkConfig, task_count 2438, has_debug_task True (2), ✅ Quick check PASSED.

Note: dev_install_url pins @feat/swegym-lite-cube until cube-harness#504 merges; drop the ref (→ default branch) after merge. The cube depends on cube-standard>=0.1.0rc9 (now on PyPI), which is what unblocked this.
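
As a sketch of the "drop the ref" step, the helper below removes an `@ref` suffix from a pip-style `git+` URL while keeping the `#subdirectory=` fragment. The organization name in the example URL is a placeholder and `drop_git_ref` is a hypothetical helper, not part of cube-harness or the registry tooling.

```python
from urllib.parse import urlsplit, urlunsplit


def drop_git_ref(url: str) -> str:
    """Strip an '@ref' suffix from the path of a pip VCS URL, keeping the fragment."""
    parts = urlsplit(url)
    # Only the path is inspected, so a 'user@host' in the netloc is left alone.
    path, sep, _ref = parts.path.rpartition("@")
    if not sep:
        return url  # no ref pinned; already resolves to the default branch
    return urlunsplit(parts._replace(path=path))


# Placeholder organization; the branch and subdirectory are the ones named in this PR.
pinned = (
    "git+https://github.com/example-org/cube-harness.git"
    "@feat/swegym-lite-cube#subdirectory=cubes/swegym-cube"
)
unpinned = drop_git_ref(pinned)
```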

🤖 Generated with Claude Code

Cube Harness and others added 2 commits June 18, 2026 15:08
Same-repo PR (not a fork) so quick-compliance can write CI-derived fields back —
the fork-PR variant (#62) passes the compliance check but fails the write-back
step (GITHUB_TOKEN can't push to a fork branch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>
@recursix recursix mentioned this pull request Jun 18, 2026
@github-actions

Entry review — swegym

Verdict: CONCERN

| Check | Result |
| --- | --- |
| description_matches_package | ⚠️ unverified |
| authors_consistent_with_git | ⚠️ unverified |
| no_id_squat_vs_existing | ✅ pass |
| no_brand_impersonation | ✅ pass |
| wrapper_license_plausible | ⚠️ unverified |

Notes:

Core claims could not be verified. PyPI package swegym-cube appears unpublished: empty summary, empty license field, no long description. The dev_install_url points at a feature branch (cube-harness@feat/swegym-lite-cube#subdirectory=cubes/swegym-cube) and the entry's own NOTE says it is blocked until both a cube-harness PR merges and cube-standard rc9 lands on PyPI — so the package state is explicitly pre-merge/incomplete. No public README reachable for the cube wrapper repo, so description-vs-package consistency cannot be confirmed.

authors: only author is recursix (Alexandre Lacoste). The cube subdirectory (cubes/swegym-cube) is on a private/unreachable feature branch; could not confirm this handle appears in commit history for the wrapper. recursix is not in known-authors.yaml for this id. Unverified.

wrapper_license: Apache-2.0 is a valid SPDX id, but the cube-harness repo LICENSE was not reachable to confirm. Upstream SWE-Gym benchmark_license reported Apache-2.0 with source_url to SWE-Gym/SWE-Gym/blob/main/LICENSE — plausible but not fetched/confirmed here.

id swegym already exists in the registry; this PR modifies an existing entry (own-entry ownership presumed verified by CI), so not an id-squat. name/description faithfully describe the real SWE-Gym (arXiv 2412.21139), no brand impersonation.

Verdict CONCERN driven by the pattern of unverifiable core claims: unpublished package + unreachable repo block confirmation of description, authorship, and license. Recommend re-review once swegym-cube is published to PyPI and the cube-harness wrapper lands on a public default branch.
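
One way to read the table above is as a simple aggregation: any unverified check yields CONCERN. The sketch below encodes that reading. It is an assumption about how the verdict relates to the per-check results, not the reviewer's actual logic (the review is LLM-generated), and the `Result` enum, the `FAIL` verdict, and `overall_verdict` are hypothetical names.

```python
from enum import Enum


class Result(Enum):
    PASS = "pass"
    UNVERIFIED = "unverified"
    FAIL = "fail"


def overall_verdict(checks: dict[str, Result]) -> str:
    """Collapse per-check results into one verdict (assumed rule, see lead-in)."""
    results = list(checks.values())
    if any(r is Result.FAIL for r in results):
        return "FAIL"
    if any(r is Result.UNVERIFIED for r in results):
        return "CONCERN"
    return "PASS"


# The check results reported in the review above.
first_review = {
    "description_matches_package": Result.UNVERIFIED,
    "authors_consistent_with_git": Result.UNVERIFIED,
    "no_id_squat_vs_existing": Result.PASS,
    "no_brand_impersonation": Result.PASS,
    "wrapper_license_plausible": Result.UNVERIFIED,
}
verdict = overall_verdict(first_review)
```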

@github-actions

✅ ownership-check + quick-compliance passed (slow-compliance is informational).

Auto-merge skipped because:

  • LLM review flagged a CONCERN (see review comment above)

A maintainer will review and merge.
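
The gating described in this comment can be sketched as follows: auto-merge requires ownership-check and quick-compliance to pass and the LLM review to return a clean verdict, while slow-compliance is informational and does not take part. The function name, its parameters, and the `"PASS"` verdict string are assumptions for illustration; the real workflow may differ.

```python
def should_auto_merge(
    ownership_ok: bool,
    quick_compliance_ok: bool,
    llm_verdict: str,
) -> tuple[bool, list[str]]:
    """Return (merge?, reasons for skipping). slow-compliance is deliberately ignored."""
    reasons: list[str] = []
    if not ownership_ok:
        reasons.append("ownership-check failed")
    if not quick_compliance_ok:
        reasons.append("quick-compliance failed")
    if llm_verdict != "PASS":
        reasons.append(f"LLM review flagged a {llm_verdict} (see review comment above)")
    return (not reasons, reasons)


# The situation in this PR: both checks pass, but the review verdict is CONCERN.
merge, skipped_because = should_auto_merge(True, True, "CONCERN")
```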

…ged)

The cube now lives on cube-harness dev (default branch); drop the dead
@feat/swegym-lite-cube ref. cube-standard rc9/rc10 are on PyPI so the pip-based
compliance check resolves the dependency cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Cube Harness <cube-harness@example.com>
@recursix recursix merged commit eb50a37 into main Jun 18, 2026
8 of 10 checks passed
@recursix recursix deleted the add/swegym-entry branch June 18, 2026 19:32
@github-actions

Entry review — swegym

Verdict: CONCERN

| Check | Result |
| --- | --- |
| description_matches_package | ✅ pass |
| authors_consistent_with_git | ⚠️ unverified |
| no_id_squat_vs_existing | ✅ pass |
| no_brand_impersonation | ✅ pass |
| wrapper_license_plausible | ✅ pass |

Notes:

Description: consistent with SWE-Gym (arXiv 2412.21139), a SWE-bench-methodology dataset of 2438 real-world Python SWE tasks across 11 repos with a 230-instance lite split. task_count 2438 matches. Docker-based fail_to_pass/pass_to_pass framing is correct for this benchmark family. The description is plausible: the swegym-cube subdir README could not be cross-checked directly, but the description is consistent with the cube-harness repo structure.

ID squat: existing ids include swebench-live and swebench-verified; 'swegym' is a distinct benchmark (different dataset/authors than SWE-bench), not a confusable near-duplicate. Pass.

Brand impersonation: 'SWE-Gym' is the actual upstream name (SWE-Gym/SWE-Gym), faithful port — pass.

License: wrapper Apache-2.0 is a valid SPDX id; benchmark_license.reported Apache-2.0 with source_url to SWE-Gym/SWE-Gym/blob/main/LICENSE — plausible, verified_by_original_authors:false noted. Could not fetch the actual upstream LICENSE to confirm Apache-2.0, but claim is internally consistent. Pass on available evidence.

CONCERN drivers (not fails, but blocking):

  1. PyPI metadata for swegym-cube is EMPTY / package appears unpublished (dev_install_url is a git+ subdirectory URL, which may be acceptable, but no PyPI long description to cross-check description_matches_package against).
  2. authors_consistent_with_git UNVERIFIED: sole author 'recursix' (Alexandre Lacoste) is an AI Alliance / cube-harness maintainer, not a known SWE-Gym original author (known-authors.yaml lists none for this id). The linked repo README was readable but I could not inspect commit history for cubes/swegym-cube to confirm this handle contributed to that specific wrapper subdir. recursix is plausible as a cube-harness contributor (matches the harness authorship), so no impersonation evidence — but the original-author attribution for the SWE-Gym dataset itself is not represented.

Net: core claims look legitimate (faithful SWE-Gym port by a cube-harness maintainer), but empty PyPI page + inability to verify the wrapper author against the cube subdir commit history prevent a clean PASS. Recommend maintainer confirm (a) swegym-cube is actually published/installable and (b) recursix committed to cubes/swegym-cube.

@github-actions

✅ ownership-check + quick-compliance passed (slow-compliance is informational).

Auto-merge skipped because:

  • LLM review flagged a CONCERN (see review comment above)

A maintainer will review and merge.
