Skip to content

[iris] Consolidate scheduling context and fix SQL cache misses#5776

Merged
rjpower merged 3 commits into
mainfrom
scheduler-context
May 15, 2026
Merged

[iris] Consolidate scheduling context and fix SQL cache misses#5776
rjpower merged 3 commits into
mainfrom
scheduler-context

Conversation

@rjpower
Copy link
Copy Markdown
Collaborator

@rjpower rjpower commented May 15, 2026

Profile-driven cleanup of the iris-controller scheduling loop. Merges SchedulingTickContext into SchedulingContext and deletes the create_scheduling_context wrapper layers (Controller and Scheduler methods, from_workers classmethod); Controller.last_scheduling_context now backs the dashboard diagnostics RPC. Converts 19 SQL call sites to expanding bindparam to reuse SQLAlchemy compiled statements. Drops zstd encoder level to -1. Consolidates _building_counts into reads.py with shared snapshots.

@rjpower rjpower requested a review from yonromai May 15, 2026 19:43
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 15, 2026

Claude finished @rjpower's task in 11m 13s —— View job


Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

Review process
  • Read skill instructions
  • Checked PR status (open, not draft)
  • Listed relevant CLAUDE.md/AGENTS.md files (root CLAUDE.md/AGENTS.md + lib/iris/AGENTS.md)
  • Summarized PR changes
  • Ran 4 parallel review agents (2x CLAUDE.md compliance, 2x bug scan)
  • Validated flagged issues — none survived validation
  • Posted summary

Both bug-scan agents found no bugs. The CLAUDE.md agents raised three candidates (in-place mutation of ctx in _run_scheduler_pass, a string forward reference list["PendingTask"], and UserBudgetDefaults | None typing). Validation found:

  • The ctx mutation is an intentional index-reuse optimization on a mutable per-tick scratchpad; consistent with existing _preference_pass mutations.
  • No CLAUDE.md/AGENTS.md rule prohibits string forward annotations.
  • The X | str rule targets isinstance-discriminated unions, not Optional. Production code path always populates the field before any non-None use.
· [Branch](https://github.com/marin-community/marin/tree/scheduler-context)

Copy link
Copy Markdown
Contributor

@yonromai yonromai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Approved. I did not find merge-blocking issues in the shared scheduler context cleanup.

Validation:

  • git diff --check origin/main...HEAD
  • uv run --directory lib/iris --group dev python -m pytest --tb=short tests/cluster/controller/test_preemption.py tests/cluster/controller/test_reservation.py tests/cluster/controller/test_scheduling_fairness.py tests/test_budget.py (141 passed)
  • uv run --directory lib/iris --group dev python -m pytest --tb=short tests/cluster/controller/test_scheduler.py -k 'soft_constraint or max_building or building_limit or diagnostics' (5 passed)
  • uv run --with pyrefly pyrefly check lib/iris/src/iris/cluster/controller/controller.py lib/iris/src/iris/cluster/controller/scheduler.py lib/iris/src/iris/cluster/controller/reads.py lib/iris/src/iris/cluster/controller/service.py lib/iris/src/iris/cluster/controller/transitions.py lib/iris/src/iris/cluster/types.py (0 errors)

Generated with Codex.

@rjpower rjpower force-pushed the scheduler-context branch from 86df161 to 5ceef81 Compare May 15, 2026 19:59
@rjpower rjpower changed the title Use a shared scheduler context. [iris] Consolidate scheduling context and fix SQL cache misses May 15, 2026
@rjpower rjpower added the agent-generated Created by automation/agent label May 15, 2026
@rjpower
Copy link
Copy Markdown
Collaborator Author

rjpower commented May 15, 2026

🤖 Specification (PR is over 500 LOC).

Problem and findings (py-spy of the running iris-controller, 10s):

  • _run_scheduling_loop 22.9% inclusive, mostly SQL. Per-tick: compute_user_spend ~0.32s, _get_running_tasks_with_band_and_value ~0.27s, resource_usage_by_worker ~0.21s (called 3x across state read / autoscaler demand / preemption paths), _pending_tasks_with_jobs ~0.16s, healthy_active_workers_with_attributes ~0.14s.
  • _run_polling_loop 16.8% inclusive. gen_cache_key 5.6% self from dynamic .in(list) clauses recompiling per call length.
  • controller-server thread 34.9% self in selector_events.write and 4.8% in zstd.compress.

Approach:

  1. Merge SchedulingTickContext + SchedulingContext into one type. Built once per tick by build_scheduling_context (free function, controller.py) inside one read_snapshot that fans out to all six heavy reads plus reserved-job ids. apply_scheduling_gates and compute_scheduling_order are pure free functions consuming the context; no DB.

  2. Delete the wrapper layers: Controller.create_scheduling_context, Scheduler.create_scheduling_context, SchedulingContext.from_workers, and the ControllerProtocol declaration. Every SchedulingContext field is now required (no Optional defaults, no field(default_factory=...) shims). Construction logic lives in post_init. frozen=True dropped — the dataclass was never actually frozen (assignment_counts, capacities, _soft_score_cache mutate during placement).

  3. Controller.last_scheduling_context property holds the post-tick snapshot. Dashboard diagnostics RPC reads from it instead of constructing a context on demand. Set at the end of _run_scheduling including both NO_PENDING_TASKS early-exit branches.

  4. SchedulingContext.evolve_with_workers(workers, jobs, building_counts, max_building_tasks) for the reservation-taint rebuild in _run_scheduler_pass. Reuses raw-read fields; only rebuilds the parts that depend on worker attributes (capacities, ConstraintIndex).

  5. SQLAlchemy compiled-statement cache: 19 call sites converted from .in_(list) to .in_(bindparam("name", expanding=True)). Statements that were rebuilt per call are hoisted to module-level constants (_TASK_SUMMARIES_FOR_JOBS_STMT, _PRIORITY_BANDS_STMT, _RUNNING_TASKS_BY_WORKER_STMT, BULK_GET_ATTEMPTS_STMT, BUILDING_COUNTS_STMT). For chunked composite-key queries like bulk_get_attempts, the statement is compiled once outside the chunk loop with tuple(...).in(bindparam("keys", expanding=True)).

  6. _building_counts and _building_counts_tx (duplicate variants on controller.py) collapsed into one reads.building_counts(tx, worker_ids). Two of the three callers previously opened separate snapshots for _building_counts and resource_usage_by_worker; they now share one snapshot.

  7. PendingTask and UserBudgetDefaults moved from controller.py / budget.py to iris.cluster.types so scheduler.py can type its fields without pulling DB.

  8. RPC compression: ZstdCompression(level=-1) ("fast" mode) — same wire format, lower encoder CPU.

Reservation gate cleanup that fell out of (1)-(2):

  • Pre-fetched reservation_entry_counts from job_config into the context so the gate decides reservation satisfaction without per-task DB lookups.
  • Deleted Controller._is_reservation_satisfied, _count_reservation_claims, _reservation_entry_count (only test callers; rewrote the three reservation tests to drive through apply_scheduling_gates).

Verification:

  • 1802 passed, 7 skipped under uv run pytest lib/iris/tests/cluster/ lib/iris/tests/test_budget.py -m 'not slow'.
  • ./infra/pre-commit.py --all-files passes clean.
  • grep -rn "create_scheduling_context|.from_workers" lib/iris/src lib/iris/tests returns zero matches in production code.

@rjpower rjpower merged commit 1bfeb6b into main May 15, 2026
14 checks passed
@rjpower rjpower deleted the scheduler-context branch May 15, 2026 20:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-generated Created by automation/agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants