feat(spec): SpecInput data contract + host/device boundary annotations #1064

Merged: pengchengneo merged 1 commit into sgl-project:main from zorrofox:feat/p1-5a-spec-data-contract on May 14, 2026

@zorrofox
Contributor

Part of #1053 (P1-5a). Draft for early interface review — P1-2 (BaseSpecWorker/BaseDraftWorker class refactor) builds on this contract, so opening early per discussion.

Note

Depends on #1063 (P1-0). This branch is stacked on it; the actual P1-5a diff is the last commit only (4 files, +201/−26). Will rebase onto main once #1063 merges.

What

Defines the spec-decode data contract from the #1053 RFC tables so the P1-2 abstraction layer has a fixed interface to target:

  • spec_info.SpecInput, a @runtime_checkable Protocol exposing:
    • is_draft_input() / is_verify_input()
    • get_logical_token_num() / get_allocated_token_num() / get_verify_token_num() — the three token counts the RFC requires to be separated (scheduler advance / KV trimming / verify-metadata+DP-accounting respectively)
    • get_spec_adjust_token_coefficient() / filter_batch() / merge_batch()
    • Docstring fixes the host/device boundary rules and DP-padded layout (Route 1: target+draft both DP) semantics
  • EagleDraftInput — implements SpecInput; per-field docstrings annotate device vs host, shape, JIT-cache-key participation, and the allocate_lens vs accept_length vs new_seq_lens distinction. Multi-layer MTP cross-round hidden_states semantics noted.
  • EagleVerifyInput — implements SpecInput; per-field docstrings annotate device vs host and static-metadata fields; filter_batch/merge_batch raise (verify input is single-round).
  • test_spec_info.py — Protocol conformance + three-token-count distinction.
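
To make the shape of the contract concrete, here is a minimal sketch of a `@runtime_checkable` Protocol that keeps the three token counts separate, plus a toy class that conforms to it. This is not the code from the PR: `SpecInputSketch`, `ToyVerifyInput`, the `filter_batch`/`merge_batch` signatures, and all numeric values are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class SpecInputSketch(Protocol):
    """Hypothetical, trimmed-down stand-in for spec_info.SpecInput."""

    def is_draft_input(self) -> bool: ...
    def is_verify_input(self) -> bool: ...

    # Three separate counts, as the RFC requires.
    def get_logical_token_num(self) -> int: ...    # scheduler advance
    def get_allocated_token_num(self) -> int: ...  # KV trimming
    def get_verify_token_num(self) -> int: ...     # verify metadata + DP accounting

    # Signatures below are assumptions, not taken from the PR.
    def filter_batch(self, keep_indices) -> None: ...
    def merge_batch(self, other) -> None: ...


@dataclass
class ToyVerifyInput:
    """Toy verify-side input; the stored counts are made-up illustration data."""

    logical: int
    allocated: int
    verify: int

    def is_draft_input(self) -> bool:
        return False

    def is_verify_input(self) -> bool:
        return True

    def get_logical_token_num(self) -> int:
        return self.logical

    def get_allocated_token_num(self) -> int:
        return self.allocated

    def get_verify_token_num(self) -> int:
        return self.verify

    def filter_batch(self, keep_indices) -> None:
        # Mirrors the described behavior: verify input is single-round.
        raise NotImplementedError("verify input is single-round")

    def merge_batch(self, other) -> None:
        raise NotImplementedError("verify input is single-round")


toy = ToyVerifyInput(logical=4, allocated=8, verify=6)
conforms = isinstance(toy, SpecInputSketch)
```

Note that `isinstance` against a runtime-checkable Protocol only checks that the methods exist; it does not validate their signatures or return types, so a conformance test like this catches missing methods but not mismatched ones.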

Non-changes

  • No behavior change to the existing dp=1 path (pure additions: docstrings, Protocol, new methods).
  • GenerationBatchResult already has all RFC-required fields except optional new_seq_lens (derivable from seq_lens + accept_lens); not adding it here.
  • Type annotations on device fields changed np.ndarray → jax.Array | None to reflect actual runtime types after the reshard changes in #1063 (fix(eagle): restore dp=1 dense EAGLE3 path on main + add E2E to CI); annotations only, no code path change.
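
As a small illustration of why `new_seq_lens` can be left out of `GenerationBatchResult`, the sketch below derives it from the two fields named above. It assumes both are per-request integer arrays of the same length and that the derivation is an element-wise sum; the values are made up.

```python
import numpy as np

# Made-up per-request values for a batch of three requests.
seq_lens = np.array([10, 7, 3], dtype=np.int32)    # lengths before this round
accept_lens = np.array([2, 1, 3], dtype=np.int32)  # tokens accepted this round

# Assumed derivation: element-wise sum, one entry per request.
new_seq_lens = seq_lens + accept_lens
```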

Open questions for P1-2 author

  1. Should SpecInput be a Protocol (current, duck-typed) or an ABC base class? Protocol avoids inheritance coupling but ABC lets BaseSpecWorker type-hint more strictly.
  2. get_logical_token_num currently returns a scalar int (sum across batch). Should it return per-request np.ndarray instead for DP per-rank accounting?
  3. EagleDraftInput.merge_batch currently uses np.concatenate (host-side). RFC says these are device arrays — should merge happen on device (jnp.concatenate)?
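
Questions 2 and 3 can be made concrete with a toy class. The sketch below is hypothetical throughout (`ToyDraftInput`, `token_counts`, and `get_logical_token_num_per_request` are invented names, not part of the PR): it shows a scalar count defined as the sum of a per-request array, and a host-side merge using `np.concatenate`. A device-side variant would presumably swap in `jnp.concatenate` on `jax.Array` fields, which is not shown here.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyDraftInput:
    """Hypothetical draft-side input holding one count per request."""

    token_counts: np.ndarray  # host array, shape [batch]

    def get_logical_token_num(self) -> int:
        # Scalar form: sum across the batch.
        return int(self.token_counts.sum())

    def get_logical_token_num_per_request(self) -> np.ndarray:
        # Per-request form, which per-rank DP accounting could slice.
        return self.token_counts

    def merge_batch(self, other: "ToyDraftInput") -> None:
        # Host-side merge: append the other batch's requests to this one.
        self.token_counts = np.concatenate([self.token_counts, other.token_counts])


a = ToyDraftInput(np.array([3, 1], dtype=np.int32))
b = ToyDraftInput(np.array([2], dtype=np.int32))
a.merge_batch(b)
total = a.get_logical_token_num()
per_request = a.get_logical_token_num_per_request().tolist()
```

Under these assumptions the scalar is always recoverable from the per-request array, but not the reverse, which is one argument for exposing the per-request form if DP accounting needs it.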

Test plan

@zorrofox force-pushed the feat/p1-5a-spec-data-contract branch from d27b25a to 7a1e0ee on May 12, 2026 12:45
@zorrofox zorrofox mentioned this pull request May 12, 2026
11 tasks
Comment thread python/sgl_jax/srt/speculative/spec_info.py
Comment thread python/sgl_jax/srt/speculative/eagle_util.py
Comment thread python/sgl_jax/srt/speculative/spec_info.py Outdated
@zorrofox force-pushed the feat/p1-5a-spec-data-contract branch from 7a1e0ee to 69113de on May 13, 2026 05:51
Part of sgl-project#1053 (P1-5a). Defines the spec-decode data contract that sgl-project#1053
P1-2 (BaseSpecWorker/BaseDraftWorker class refactor) will build on, so the
abstraction layer has a fixed interface to target.

- spec_info.SpecInput: runtime_checkable Protocol exposing the three
  token counts the RFC requires to be separated (logical / allocated /
  verify) plus is_draft_input/is_verify_input/get_spec_adjust_token_coefficient
  /filter_batch/merge_batch. Docstring fixes the host/device boundary and
  DP-padded layout (Route 1) semantics.
- EagleDraftInput: implement SpecInput; per-field docstrings annotate
  device vs host, shape, JIT-cache-key participation, and the
  allocate_lens vs accept_length vs new_seq_lens distinction.
- EagleVerifyInput: implement SpecInput; per-field docstrings annotate
  device vs host and static-metadata fields; filter/merge raise (verify
  input is single-round).
- test_spec_info.py: Protocol conformance + three-token-count distinction.

No behavior change to the existing dp=1 path (pure additions: docstrings,
Protocol, new methods). Type annotations on device fields changed
np.ndarray → jax.Array | None to reflect actual runtime types after sgl-project#1063.

Depends on sgl-project#1063 (P1-0) — diff is on top of that branch.
@zorrofox zorrofox marked this pull request as ready for review May 14, 2026 06:46
@zorrofox force-pushed the feat/p1-5a-spec-data-contract branch from 69113de to ba650a3 on May 14, 2026 06:46
@pengchengneo
Collaborator

LGTM

@pengchengneo pengchengneo merged commit f4d39f2 into sgl-project:main May 14, 2026
18 checks passed