[jit_kernel] Add JIT eagle_utils kernel #19083
Johnsonms wants to merge 7 commits into sgl-project:main
…+ verify_tree_greedy)
Port sgl-kernel/csrc/speculative/eagle_utils.cu to the JIT kernel framework:
- csrc/speculative/eagle_utils.cuh: CUDA kernels using TVM FFI TensorView
- eagle_utils.py: Python wrappers with register_custom_op
- tests/test_eagle_utils.py: smoke, boundary, and JIT vs AOT cross-validation tests
- benchmark/bench_eagle_utils.py: perf benchmarks for both kernels
Code Review
The pull request successfully ports the eagle_utils kernels to the JIT framework, which is a valuable addition for portability and development speed. The implementation correctly adapts the existing AOT kernels. However, there are critical gaps in input validation within the C++ host functions and missing metadata in the Python custom op registration. Specifically, the tree_mask mutation is not declared in the Python wrapper, and the build_tree_kernel_efficient host function lacks the robust type, shape, and device checks present in its counterpart verify_tree_greedy.
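To make the validation gap concrete, here is a minimal, framework-free sketch of the kind of dtype, rank, and device checks the review asks for. It is written in plain Python rather than the PR's C++ host code, and every name in it (`TensorMeta`, `check_tensor`, the example shape and device strings) is hypothetical and chosen for illustration only; it is not the RuntimeCheck API and makes no claim about the actual argument layout of build_tree_kernel_efficient.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TensorMeta:
    """Hypothetical stand-in for the metadata a host function can inspect."""
    name: str
    dtype: str
    shape: Tuple[int, ...]
    device: str


def check_tensor(t: TensorMeta, *, dtype: str, ndim: int, device: str) -> None:
    """Reject an argument before launch if dtype, rank, or device is wrong."""
    if t.dtype != dtype:
        raise TypeError(f"{t.name}: expected dtype {dtype}, got {t.dtype}")
    if len(t.shape) != ndim:
        raise ValueError(f"{t.name}: expected {ndim} dims, got {len(t.shape)}")
    if t.device != device:
        raise ValueError(f"{t.name}: expected device {device}, got {t.device}")


# Illustrative values only: a 2-D boolean mask on one GPU.
tree_mask = TensorMeta("tree_mask", "bool", (4, 16), "cuda:0")
check_tensor(tree_mask, dtype="bool", ndim=2, device="cuda:0")  # passes silently
```

Running the same checks on every argument of both host functions keeps a mismatched input from reaching the kernel launch, which is the parity with verify_tree_greedy that the review requests.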
Motivation
#17865
The eagle_utils kernels (build_tree_kernel_efficient and
verify_tree_greedy) are currently only available as AOT-compiled functions
in sgl-kernel. Porting them to the JIT kernel framework allows SGLang to
build and use these kernels without requiring a pre-compiled sgl-kernel
package, improving portability and enabling faster iteration during
development.
Modifications
- csrc/speculative/eagle_utils.cuh: port of
  sgl-kernel/csrc/speculative/eagle_utils.cu. Replaces PyTorch ATen tensor
  types with tvm::ffi::TensorView, at::cuda::getCurrentCUDAStream() with
  LaunchKernel::resolve_device(), and CHECK_* macros with RuntimeCheck. The
  three CUDA device kernels (build_tree_efficient,
  build_tree_efficient_partial_packed, VerifyTreeGreedy) are unchanged.
- eagle_utils.py: Python wrappers for build_tree_kernel_efficient and
  verify_tree_greedy using @register_custom_op, following the same pattern
  as speculative_sampling.py.
- tests/test_eagle_utils.py: smoke runs across batch sizes and tree
  configurations, accept-all / accept-none boundary cases, and bitwise JIT
  vs AOT cross-validation for both kernels (19 tests, all passing).
- benchmark/bench_eagle_utils.py: triton.testing.perf_report benchmarks for
  both kernels comparing JIT vs AOT across typical EAGLE tree configurations
  and batch sizes.
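For readers unfamiliar with what the accept-all / accept-none boundary cases exercise, the following is a pure-Python sketch of greedy tree verification for a single sequence. It reflects a general understanding of the technique, not the PR's kernel: the function name, argument names, and the first-child / next-sibling tree encoding are assumptions made for illustration, and the real kernel operates on batched GPU tensors.

```python
from typing import List, Tuple


def verify_tree_greedy_ref(
    candidates: List[int],      # draft token at each tree node; node 0 is the root
    retrieve_index: List[int],  # flat position of each node in target_predict
    next_token: List[int],      # first child of each node, or -1
    next_sibling: List[int],    # next sibling of each node, or -1
    target_predict: List[int],  # target model's greedy token after each position
    max_depth: int,             # maximum number of draft tokens to accept
) -> Tuple[List[int], int, List[int]]:
    """Walk the draft tree, accepting the child that matches the target token."""
    predicts = [-1] * len(target_predict)
    last = retrieve_index[0]
    accept_index = [last]
    node = 0
    for _ in range(max_depth):
        node = next_token[node]  # descend to the first child
        while node != -1:
            if candidates[node] == target_predict[last]:
                predicts[last] = target_predict[last]
                last = retrieve_index[node]
                accept_index.append(last)
                break
            node = next_sibling[node]  # try the next sibling
        if node == -1:  # no child matched: stop at the last accepted node
            break
    # The target's own token after the last accepted node is always emitted.
    predicts[last] = target_predict[last]
    return accept_index, len(accept_index) - 1, predicts


# Chain tree 0 -> 1 -> 2 whose draft tokens all match the target (accept-all).
chain = dict(
    candidates=[10, 11, 12],
    retrieve_index=[0, 1, 2],
    next_token=[1, 2, -1],
    next_sibling=[-1, -1, -1],
    max_depth=2,
)
accept_all = verify_tree_greedy_ref(target_predict=[11, 12, 13], **chain)
accept_none = verify_tree_greedy_ref(target_predict=[99, 12, 13], **chain)
```

In the accept-all case every draft token is accepted and one bonus token from the target follows; in the accept-none case only the target's first token is emitted. A reference of this shape is also a convenient oracle when cross-checking a JIT build against an AOT build on small inputs.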
Accuracy Tests
python -m pytest python/sglang/jit_kernel/tests/test_eagle_utils.py -v
Correctness

Benchmarking and Profiling
Performance Benchmark
