refactor: refactoring cuda code to cute-dsl (part 1) (flashinfer-ai#2428)
<!-- .github/pull_request_template.md -->
## 📌 Description
We prioritize CuTe-DSL over CUDA for kernel development because it
JIT-compiles faster.
This PR is the first in a series that refactors the simple normalization
kernels to CuTe-DSL.
The CUDA code can be removed once end-to-end testing is finished.
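
For readers unfamiliar with the kernels being ported, here is a minimal pure-Python sketch of what RMSNorm and a fused add+RMSNorm compute for a single row. It follows the textbook definition and is not the code in this PR; the function names `rmsnorm_ref` and `fused_add_rmsnorm_ref`, the default `eps`, and the choice to return the summed residual are assumptions made for illustration.

```python
import math


def rmsnorm_ref(x, weight, eps=1e-6):
    """Reference RMSNorm for one row: x / sqrt(mean(x^2) + eps) * weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def fused_add_rmsnorm_ref(x, residual, weight, eps=1e-6):
    """Add the residual first, then normalize the sum.

    Returns (normalized, summed) so the caller can carry the updated
    residual forward. This return convention is an assumption of the sketch.
    """
    summed = [a + b for a, b in zip(x, residual)]
    return rmsnorm_ref(summed, weight, eps), summed
```

A reference like this is mainly useful as a correctness oracle when comparing a new kernel's output against the existing one on small inputs.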
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * CuTe-DSL-accelerated normalization: RMSNorm (2D/3D), LayerNorm, fused
    add+RMSNorm, and FP8-quantized variants exposed for runtime use.
  * Shared norm utilities and JIT warmup to improve kernel readiness.
* **Chores**
  * Runtime selection and fallback for CuTe-DSL/CUDA normalization with a
    visibility check.
* **Bug Fixes**
  * Safer optional-dependency handling to avoid hard failures when
    CUDA/CuTe-DSL is unavailable.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
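
The runtime selection and optional-dependency handling summarized above could look roughly like the sketch below. It is an illustration only, not the logic added in this PR: the module name `cutlass`, the environment variable `EXAMPLE_NORM_BACKEND`, and both function names are hypothetical. The idea is to probe for the optional package without importing it, and to fall back to the CUDA path when it is missing.

```python
import importlib.util
import os


def backend_available(module_name):
    """Return True if module_name is importable, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package or a module without a spec
        # both count as "not available".
        return False


def select_norm_backend(dsl_module="cutlass", env_var="EXAMPLE_NORM_BACKEND"):
    """Pick "cute-dsl" when the DSL package is present, else fall back to "cuda".

    Both default arguments are hypothetical names used for illustration.
    """
    requested = os.environ.get(env_var, "auto").lower()
    if requested == "cuda":
        return "cuda"  # explicit opt-out of the DSL path
    if backend_available(dsl_module):
        return "cute-dsl"
    return "cuda"  # optional dependency missing: degrade instead of failing
```

Probing with `importlib.util.find_spec` keeps the check cheap and avoids a hard `ImportError` at import time on machines where the optional package is not installed.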
---------
Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
Co-authored-by: Brian Ryu <bryu@nvidia.com>
Signed-off-by: Amey Naik <212485788+ameynaik-hub@users.noreply.github.com>