pyrefly: cluster A cleanup (imports + attrs)#4810

Open
rjpower wants to merge 1 commit into pyrefly/pr2-cluster-d from pyrefly/pr3-cluster-a

@rjpower rjpower commented Apr 16, 2026

Summary

Cluster A of the pyrefly tightening plan (see .agents/projects/2026-04-15_pyrefly_tightening.md). Stacked on PR-2 (#4809).

search-path change

Added site-package-path = [".venv/lib/python3.11/site-packages"] to [tool.pyrefly]. Since the project already sets skip-interpreter-query = true, pyrefly previously had no way to see installed packages — every import numpy/requests/click/… was missing-import. This single change drops the unsuppressed missing-import count from 1,279 → 56.

jaxtyping stub

stubs/jaxtyping/__init__.pyi (plus "stubs" on search-path). Surface:

  • scalar aliases: Array, ArrayLike, DTypeLike, PRNGKeyArray, PyTree, Scalar
  • shape-generic classes (with __class_getitem__ -> Any): Float, Int, Bool, Shaped, Key, Complex, UInt, Num, Real, Inexact
  • concrete dtype aliases re-exported by haliax.haxtyping: Float16/32/64, BFloat16, Int4/8/16/32/64, UInt4/8/16/32/64, Complex64/128
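
A minimal sketch of the layout described above, for readers unfamiliar with this style of stub. It is not the contents of stubs/jaxtyping/__init__.pyi: the `_ShapeGeneric` base-class name is hypothetical, only a few names from each group are shown, the dtype aliases pointing at the shape-generic classes is an assumption, and the bodies are written so the snippet also runs as ordinary Python.

```python
from typing import Any

# Scalar aliases: each name is opaque (Any) to the type checker.
Array = Any
ArrayLike = Any
PyTree = Any


class _ShapeGeneric:
    """Hypothetical base class: subscripting yields Any."""

    def __class_getitem__(cls, item: Any) -> Any:
        # In a real .pyi file this body would just be `...`.
        return Any


class Float(_ShapeGeneric): ...
class Int(_ShapeGeneric): ...


# Assumption: concrete dtype aliases reuse the shape-generic classes.
Float32 = Float
Int32 = Int


def loss(
    logits: Float[Array, "batch vocab"],
    labels: Int[Array, "batch"],
) -> Float32[Array, ""]:
    ...
```

With a layout like this, any `Float[Array, "..."]` annotation resolves to `Any`, so the checker neither validates nor rejects the shape string.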

Boundary suppression policy

missing-import stays globally disabled. pyproject.toml now carries a named list of vetted untyped libraries (jax, equinox, optax, ray, flax, chex, jmp, tensorstore, vllm, lm_eval, connectrpc, draccus, cloudpickle, humanfriendly, tqdm_loggable, dupekit, harbor, trackio, mergedeep, plus partial-stub libs fsspec/transformers/google.*) with a rationale comment explicitly marking this as a boundary choice, not a "we gave up" disable.

Real bugs fixed

  • lib/marin/src/marin/markdown/markdown.py:12: from regex import regex (doesn't exist as an attribute on the regex package) → import regex. All callsites use regex.compile(...), which matches the fix.

Verified already-fixed by earlier PRs:

  • marin.infra.tpu_monitor dead imports (PR-0)
  • haliax.core.ensure_tuple typo (PR-0)
  • Stale # noqa: F821 on ControllerDB / EndpointRegistry (PR-0)
  • levanter/eval.py bare vocab/tag — these are jaxtyping shape strings (Int[Array, "vocab"]), not real bugs. They remain in the baseline.
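
The shape-string false positives are easier to see once you recall that Python's typing machinery treats a string inside a subscript as a forward reference to a name. The sketch below shows that mechanism using only the standard library; it does not use jaxtyping or pyrefly and is not a description of pyrefly's internals.

```python
import typing

# A string inside a typing subscript is wrapped as a forward reference,
# i.e. "a name to be resolved later".
alias = typing.List["vocab"]
ref = alias.__args__[0]

print(type(ref).__name__)   # ForwardRef
print(ref.__forward_arg__)  # vocab
```

A tool that tries to resolve `vocab` as a name finds nothing in scope, which is consistent with the unknown-name reports on `Int[Array, "vocab"]`.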

Optional-narrowing sweep

Deferred to per-package follow-ups. The ~120 NoneType-has-no-attribute sites across marin/rl, iris/providers, fray/clusters each require reading the call site and adding an assert x is not None / cast(T, x) / signature refactor. That is 4-6 engineer-days of work per the cluster-A memo (logs/pyrefly-tightening/stage3-cluster-a.md §9). They land in the baseline now so new code cannot regress, and a follow-up PR can pick them up per-package with real review bandwidth.
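
The three remediation options named above can be sketched as follows. `Worker`, `lookup`, and the helper functions are hypothetical and exist only for illustration; none of them come from marin/rl, iris/providers, or fray/clusters.

```python
from dataclasses import dataclass
from typing import Optional, cast


@dataclass
class Worker:  # hypothetical type, for illustration only
    name: str


def lookup(name: str) -> Optional[Worker]:
    """Return a Worker, or None when the name is empty."""
    return Worker(name) if name else None


# Option 1: assert narrows Optional[Worker] to Worker and fails loudly on None.
def name_via_assert(key: str) -> str:
    worker = lookup(key)
    assert worker is not None, f"no worker for {key!r}"
    return worker.name


# Option 2: cast satisfies the checker but performs no runtime check.
def name_via_cast(key: str) -> str:
    worker = cast(Worker, lookup(key))
    return worker.name


# Option 3: signature refactor so callers never hold an Optional.
def require(name: str) -> Worker:
    worker = lookup(name)
    if worker is None:
        raise KeyError(name)
    return worker
```

Which option fits depends on whether None is actually reachable at the call site, which is why each site needs to be read individually.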

Categories re-enabled in [tool.pyrefly.errors]

  • missing-attribute (was false) — residual 380 in baseline
  • unknown-name (was false) — residual 144 in baseline (mostly jaxtyping shape-string false positives)
  • missing-module-attribute — cleared to 0, category removed

missing-import stays disabled (documented boundary).

Baseline delta

| | PR-2 tip | PR-3 tip |
|---|---|---|
| diagnostics | 1 | 776 |
| file lines | 15 | 9327 |
| disabled categories | 8 | 6 |

The baseline grew because three previously silently-suppressed categories are now visible and file:line:col-scoped.

Test plan

  • ./infra/pre-commit.py --all-files passes
  • uvx pyrefly@0.61.0 check reports 0 errors (381 suppressed, 115 warnings not shown)
  • Freshly-regenerated baseline covers all residual diagnostics

🤖 Generated with Claude Code

- Add site-packages to pyrefly search path so numpy/pandas/requests/
  click/pydantic/transformers/… resolve from shipped py.typed metadata.
  Drops missing-import surface from 1,279 to 56.
- Ship project-local jaxtyping stub at stubs/jaxtyping/__init__.pyi
  covering Array/ArrayLike/DTypeLike/PRNGKeyArray/PyTree/Scalar plus
  shape-generic classes (Float/Int/Bool/Shaped/…) and the concrete
  dtype aliases re-exported by haliax.haxtyping (Float16/32/64,
  BFloat16, Int/UInt{4..64}, Complex64/128).
- Fix real bug: lib/marin/src/marin/markdown/markdown.py imported
  `regex` from itself instead of importing the module; all callsites
  use `regex.compile(...)` so `import regex` matches usage.
- Re-enable three pyrefly categories that were globally disabled:
  missing-attribute, unknown-name, missing-module-attribute. Residuals
  (primarily NoneType narrowing + Equinox Module field access + jaxtyping
  shape-string false positives) land in the baseline so new code can't
  regress.
- Keep missing-import disabled globally as a documented boundary choice:
  jax / equinox / optax / ray / flax / chex / jmp / vllm / lm_eval /
  connectrpc / draccus / cloudpickle / tensorstore / humanfriendly /
  tqdm_loggable / dupekit / harbor / trackio / mergedeep / fsspec /
  transformers / google.* ship without sufficient type info. Per-site
  ignores would be ~1.2k comments with zero additional signal.

Baseline: 15 → 9327 lines (1 → 776 diagnostics). The bulk comes from
un-suppressing missing-attribute (380) and unknown-name (144), both of
which were silently hidden before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rjpower rjpower added the agent-generated label Apr 16, 2026
claude bot commented Apr 16, 2026

Claude finished @rjpower's task in 2m 22s


Code Review

  • PR is open, non-draft, needs review
  • Scope: pyproject.toml pyrefly config + stubs/jaxtyping/__init__.pyi + 1-line markdown.py import fix + regenerated baseline
  • Verified the real bug fix in lib/marin/src/marin/markdown/markdown.py:12: import regex matches all call sites (regex.compile(...) at lines 41, 63, 76, 80)
  • Verified stubs/jaxtyping/__init__.pyi covers every name actually imported (Array, ArrayLike, Float, Int, Bool, Shaped, PRNGKeyArray, PyTree, Scalar, DTypeLike) — grep across lib/ shows no import not covered
  • Confirmed PyTree[X] usage (13 occurrences across haliax/levanter) still type-checks cleanly — no PyTree entries in the regenerated baseline
  • Sanity-checked the re-enabled error categories are documented in the comment block and that missing-import disable has a clear boundary-suppression rationale

Findings

No issues found. Checked for bugs and CLAUDE.md / AGENTS.md compliance.

Notes (not review findings, just observations for the author):

  • site-package-path = [".venv/lib/python3.11/site-packages"] hard-codes the minor version. Since pyproject.toml requires Python >=3.11, a dev on 3.12 would silently lose site-package resolution. Given missing-import is globally disabled, no new diagnostics would surface, so this is a soft issue — worth keeping in mind for the next pyrefly revisit.
  • The stub's PyTree = Any (rather than a _Shaped subclass) works because pyrefly accepts Any[X] as Any; confirmed via the regenerated baseline containing zero PyTree references.
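
One way to avoid the hard-coded minor version noted above is to derive the directory from the interpreter version. The helper below is a hypothetical sketch: the function name and the `.venv` default are assumptions, and it assumes a POSIX-style venv layout. Whether pyrefly's configuration can consume a computed value is not addressed here.

```python
import sys
from pathlib import Path


def venv_site_packages(
    venv: str = ".venv",
    version: tuple[int, int] = sys.version_info[:2],
) -> Path:
    """Build <venv>/lib/pythonX.Y/site-packages for the given version."""
    major, minor = version
    return Path(venv) / "lib" / f"python{major}.{minor}" / "site-packages"


print(venv_site_packages(version=(3, 11)).as_posix())
# .venv/lib/python3.11/site-packages
```

A script like this could regenerate the configured path, or simply check in CI that the configured directory exists.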
