[evals] Add package metadata long-tail PPL slices #5124

Open
dlwh wants to merge 1 commit into main from claude/issue-5061-20260422-2219
dlwh (Member) commented Apr 23, 2026

Expand the long-tail PPL registry with package registry, dependency, advisory, release, and lockfile metadata slices while leaving default eval sets unchanged. The entries are metadata-only and rely on caller-supplied raw_root paths, with deps.dev BigQuery ingestion deferred per issue guidance.

Fixes #5061

Cover the five surfaces called out in the issue DoD (registry JSON,
dependency-graph rows, advisory text, release metadata, lockfile-like
records) across PyPI / npm / crates.io / RubyGems / NuGet / Maven / Go
plus GHSA + OSV. deps.dev BigQuery ingest is deferred per reviewer
approval; entries stay metadata-only so no bulk HF mirroring or
cross-region transfers are triggered by this change.

Keeps these slices in the long-tail registry only; does not wire them
into default_raw_validation_sets.

Co-authored-by: David Hall <dlwh@users.noreply.github.com>
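
To illustrate what "metadata-only with a caller-supplied raw_root" could look like, here is a minimal sketch. The class `SliceSketch`, its `resolve` method, the slice name, and the example path are hypothetical stand-ins, not the definitions in experiments/evals/long_tail_ppl.py; only the field names `surface_form` and `raw_relative_path` and the `packages/` prefix come from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceSketch:
    """Hypothetical stand-in for a long-tail registry entry (not the real class)."""

    name: str
    surface_form: str
    raw_relative_path: str  # relative only; no bucket or region is baked in

    def resolve(self, raw_root: str) -> str:
        # The caller decides where the raw data lives; the entry only
        # contributes the relative suffix.
        return f"{raw_root.rstrip('/')}/{self.raw_relative_path}"


# Illustrative entry; the name and path are assumptions for this sketch.
pypi_registry = SliceSketch(
    name="pypi_registry_json",
    surface_form="registry_json",
    raw_relative_path="packages/pypi/registry.jsonl",
)
```

Because the entry stores only a relative path, constructing it reads or copies no data; a location is needed only when a caller invokes `resolve` with its own root.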
dlwh added the agent-generated label Apr 23, 2026
claude (Bot, Contributor) commented Apr 23, 2026

Claude finished @dlwh's task in 2m 3s.

Code review

No issues found. Checked for bugs and CLAUDE.md/AGENTS.md compliance.

Scope. Adds 11 new metadata-only PACKAGE_METADATA slices (registry JSON for PyPI/crates/rubygems/nuget, Maven POM, Go modules, GHSA/OSV advisories, npm + PyPI release metadata, The Stack v2 lockfiles, libraries.io dependencies), tightens two existing notes, and adds a DoD-surface coverage test.

What I checked.

  • All 16 PACKAGE_METADATA slice name= values are unique → registry_key collisions ruled out.
  • New entries all use PACKAGE_METADATA_ISSUE and raw_relative_path prefixed with packages/, matching the new test's assertions.
  • surface_form values cover the six DoD surfaces required by test_package_metadata_family_covers_dod_surfaces (registry_json, dependency_rows, dependency_edges, advisory_osv_json, release_metadata, lockfile).
  • Test imports (PACKAGE_METADATA_ISSUE, long_tail_ppl_slices) are exported from experiments/evals/long_tail_ppl.py.
  • No hardcoded gs:// paths — slices stay metadata-first with caller-supplied raw_root, aligned with experiments/AGENTS.md.
  • Default eval sets are untouched (no new wiring into any ferry/eval bundle); deps.dev BigQuery ingestion correctly deferred per issue guidance.
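
The first three checks above can be approximated with a short sketch. This is not the test added in the PR: the function `check_slices` and the dict-based slice shape are assumptions made for illustration, while the six surface names are taken from the list above.

```python
from collections import Counter

# Surface forms named in the review above.
REQUIRED_SURFACES = {
    "registry_json",
    "dependency_rows",
    "dependency_edges",
    "advisory_osv_json",
    "release_metadata",
    "lockfile",
}


def check_slices(slices):
    """Return a list of problems; an empty list means every check passed.

    Each slice is assumed to be a dict with "name", "surface_form",
    and "raw_relative_path" keys (a hypothetical shape for this sketch).
    """
    problems = []

    # 1. Names must be unique so that registry keys cannot collide.
    counts = Counter(s["name"] for s in slices)
    dupes = sorted(n for n, c in counts.items() if c > 1)
    if dupes:
        problems.append(f"duplicate names: {dupes}")

    # 2. Every raw_relative_path must sit under packages/.
    bad_prefix = sorted(
        {s["name"] for s in slices if not s["raw_relative_path"].startswith("packages/")}
    )
    if bad_prefix:
        problems.append(f"paths outside packages/: {bad_prefix}")

    # 3. The family as a whole must cover every required surface form.
    missing = sorted(REQUIRED_SURFACES - {s["surface_form"] for s in slices})
    if missing:
        problems.append(f"missing surfaces: {missing}")

    return problems
```

Returning a list of problems instead of raising on the first failure lets a reviewer see every violation in one pass.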

Development

Successfully merging this pull request may close these issues.

[evals] Add package and dependency metadata PPL slices

1 participant