Update tilelang by SigureMo · Pull Request #2 · PFCCLab/tilelang-paddle

SigureMo · 2025-12-05T04:57:29Z

No description provided.

* [Bugfix] Fix the jit_kernel issue
* Update README.md

Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>

…ethod (#1359)

This commit refines the Fragment creation process in the InferLayout method of ParallelOpNode. It removes the unnecessary forward_index array and utilizes default fragment indexing for consistency with other operations. Additionally, it binds the thread range to enhance comparability across different operations.

* [Analysis] Enhance NestedLoopChecker with tile op cases
* fix tileop issue

* [misc] add a cpp side wrapper for gemm_sp_py
* [misc] typing
* [IR] bind GemmSPWarpPolicy
* [chore] add wrapper code
* [IR] fix GemmSPWarpPolicy
* [codegen] apply ptxas instructions
* [intrinsic] add typical (unused) mma layout
* [template] add uint16 debug func
* [intrinsic] add b matrix layout
* [gemm_sp] enable fp16/bf16 on sm8x
* [layout] refactor fp16/bf16 layout
* [gemm_sp] enable int8
* [chore] update test case dtype
* [gemm_sp] enable fp32
* [layout] refactor layouts
* [intrinsic] enable ldmatrix for mat A
* [layout] enable ldsm for matrix b
* [layout] add ldmatrix for fp32 and fp8
* [chore] refine
* [chore] refactor
* [chore] add fp8 efactor
* [chore] refactor
* [chore] add remove negative zero util
* [example] add a custom compress kernel
* [chore] minor update
* [test] refactor gemm_sp test
* [refactor] make metadata layout func
* [example] add option for using cutlass layout
* [doc] add a gemm_sp doc
* [doc] minor polish
* [chore] remove unused
* [bugfix] fix non replicate b case
* [test] refactor
* [chore] add a check
* [bugfix] fix util bug
* [wip] init a new test case for v2
* [chore] minor refactor
* [chore] minor update
* [bugfix] enable 16bit rs
* [language] enable rs
* [language] enable gemm_sp_sr
* [language] enable gemm_sp_rr
* [test] enable more tests
* [tvm] update ffi binding
* [chore] remove print
* [chore] fix benchmark script
* [lint] precommit lint
* [chore] apply feedback
* [test] use arch 8.0
* [chore] rollback ::ordered_metadata for backward compatibility
* [bugfix] fix captialized
* [example] keep gemm_sp on hopper
* [test] fix no fp8 normal kernel
* [test] reduce matmul size to satisfy accum error
* [test] use cal_diff for assertion
* [bugfix] expand float8 type
* [lib] add make_int4 for short type
* [language] add transpose E
* [bugfix] fix wrong var
* [format] format
* [chore] refactor binding
* [chore] fix wrong passing var


…#1360)

* [Enhancement] Implement dynamic unroll factor in CUDA code generation

  This commit introduces support for specifying a dynamic unroll factor in the CUDA code generation. The `unroll_factor` map is added to store unroll factors for loop variables, allowing for more flexible and optimized loop unrolling. Additionally, the `unroll` function is integrated into the loop language, enabling users to define unroll factors directly in their code. This enhancement improves performance by allowing tailored unrolling strategies based on specific loop characteristics.

* lint fix
* [Bugfix] Correct initialization of non-zero counters in custom compress kernel and update TIR registration for gemm_sp_py to use the correct tile operation
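As a rough illustration of the idea in the commit message above (a per-loop-variable `unroll_factor` map consulted when the loop is printed), here is a minimal Python sketch. It is not the C++ code in `src/target/codegen_cuda.cc`; `emit_for_loop`, the dict-based map, and the fallback to a bare `#pragma unroll` are all assumptions made for illustration. Only the `#pragma unroll N` syntax itself is standard CUDA.

```python
def emit_for_loop(var, extent, body, unroll_factor=None):
    """Return CUDA source lines for a simple counted loop.

    `unroll_factor` is a hypothetical dict mapping loop-variable names to an
    integer factor. A loop without an entry gets a bare `#pragma unroll`
    (an assumption of this sketch, not confirmed tilelang behavior).
    """
    unroll_factor = unroll_factor or {}
    factor = unroll_factor.get(var)
    if factor is not None and factor < 1:
        raise ValueError(f"unroll factor for {var!r} must be >= 1, got {factor}")
    pragma = "#pragma unroll" if factor is None else f"#pragma unroll {factor}"
    lines = [pragma, f"for (int {var} = 0; {var} < {extent}; ++{var}) {{"]
    lines += [f"  {stmt}" for stmt in body]  # indent the loop body
    lines.append("}")
    return lines


if __name__ == "__main__":
    print("\n".join(emit_for_loop("i", 8, ["acc += a[i];"], {"i": 4})))
```

Keeping the factor in a side map rather than on the loop node means loops without an entry are printed exactly as before, which is presumably why the commit describes it as a map keyed by loop variable.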

updates:
- [github.com/pre-commit/mirrors-clang-format: v21.1.2 → v21.1.6](pre-commit/mirrors-clang-format@v21.1.2...v21.1.6)
- [github.com/astral-sh/ruff-pre-commit: v0.14.3 → v0.14.7](astral-sh/ruff-pre-commit@v0.14.3...v0.14.7)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>



* [Refactor] Update condition for benchmarking in example_gemv.py and simplify cached library path handling in sparse.py
* [Enhancement] Extend support for float8 data types in GEMM operations
  - Updated GEMM operations to recognize additional float8 data types: `float8_e4m3fn` and `float8_e5m2fnuz`.
  - Refactored condition checks in `checkWgmma` methods to simplify float8 type handling.
  - Adjusted test cases to ensure compatibility with the new float8 types in tile language examples.
* lint fix
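One way to picture the "simplified float8 type handling" is a single membership test in place of a chain of per-type comparisons. The sketch below is only an illustration in Python. The actual change is in the C++ `checkWgmma` methods, and `FLOAT8_DTYPES` and `is_float8` are hypothetical names. The commit names only `float8_e4m3fn` and `float8_e5m2fnuz`; the other two entries are the remaining PyTorch-style float8 spellings and are included here as an assumption.

```python
# Hypothetical table of float8 dtype names. Only `float8_e4m3fn` and
# `float8_e5m2fnuz` are named in the commit; the rest are assumed.
FLOAT8_DTYPES = frozenset({
    "float8_e4m3fn",
    "float8_e4m3fnuz",
    "float8_e5m2",
    "float8_e5m2fnuz",
})


def is_float8(dtype_name: str) -> bool:
    """Return True if `dtype_name` is one of the known float8 spellings."""
    return dtype_name in FLOAT8_DTYPES
```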

Copilot

Pull request overview

This PR updates the tilelang library with significant enhancements to sparse tensor operations, loop control features, and infrastructure improvements. Key changes include adding support for sparse GEMM operations (gemm_sp_v2), introducing unroll loop functionality, renaming tile operation namespaces for consistency, and improving cache control mechanisms.

Added comprehensive sparse tensor core support with gemm_sp_v2 API and custom compression utilities
Introduced T.unroll loop functionality with step and unroll_factor parameters
Renamed tile operation namespace from "tl." to "tl.tileop." for better organization
Enhanced cache control with separate environment variables for global and autotuning cache management
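The last item, separate switches for the global cache and the autotuning cache, could look roughly like the sketch below. The variable names `EXAMPLE_DISABLE_CACHE` and `EXAMPLE_AUTOTUNE_DISABLE_CACHE` are placeholders, not the names defined in `tilelang/env.py`. Treating the two flags as fully independent is also an assumption of this sketch.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def _flag(name, environ):
    """Interpret an environment variable as a boolean switch."""
    return environ.get(name, "0").strip().lower() in _TRUTHY


def cache_settings(environ=None):
    """Report which caches are enabled.

    Both variable names are placeholders for illustration; the real names
    live in tilelang/env.py and are not reproduced here.
    """
    environ = os.environ if environ is None else environ
    return {
        "kernel_cache": not _flag("EXAMPLE_DISABLE_CACHE", environ),
        "autotune_cache": not _flag("EXAMPLE_AUTOTUNE_DISABLE_CACHE", environ),
    }
```

Passing the environment in as a mapping keeps the function easy to test without mutating `os.environ`.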

Reviewed changes

Copilot reviewed 70 out of 71 changed files in this pull request and generated no comments.


File	Description
tilelang/utils/tensor.py	Added float8 dtype detection and negative zero removal utilities
tilelang/utils/sparse.py	Enhanced compression support and added randint_semi_sparse generator
tilelang/tileop/gemm_sp/*.py	New sparse GEMM implementation with MMA support
tilelang/language/loop.py	Added unroll loop with step and factor support
tilelang/language/experimental/gemm_sp.py	Added gemm_sp_v2 API for sparse matrix multiplication
tilelang/layout/gemm_sp.py	Refactored metadata layout functions with clearer naming
tilelang/env.py	Enhanced cache control with separate disable flags
src/op/*.cc	Updated tile operation registration to use "tl.tileop" namespace
src/target/codegen_cuda.cc	Added pragma unroll factor code generation support
testing/python/*/.py	Added comprehensive test coverage for new features
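Two of the utilities in the table, a semi-sparse random generator and negative-zero removal, are easy to picture with a small standalone sketch. The code below uses only the Python standard library and does not reproduce the signatures in `tilelang/utils/sparse.py` or `tilelang/utils/tensor.py`. `randint_semi_sparse_rows` and `remove_negative_zero` are hypothetical names. The 2:4 pattern (at most two non-zeros in every group of four) is assumed here because it is the pattern NVIDIA sparse tensor cores expect; the table itself does not state it.

```python
import random


def randint_semi_sparse_rows(rows, cols, low=1, high=4, seed=0):
    """Build a rows x cols integer matrix in which every group of 4
    consecutive elements of a row holds exactly 2 non-zeros (2:4 pattern).

    `low` should be >= 1 so that kept positions are never zero.
    """
    if cols % 4 != 0:
        raise ValueError("cols must be a multiple of 4 for a 2:4 pattern")
    rng = random.Random(seed)
    matrix = []
    for _ in range(rows):
        row = []
        for _ in range(cols // 4):
            keep = set(rng.sample(range(4), 2))  # positions that stay non-zero
            row += [rng.randint(low, high) if i in keep else 0 for i in range(4)]
        matrix.append(row)
    return matrix


def remove_negative_zero(values):
    """Replace -0.0 (and 0) with +0.0; all other values pass through."""
    return [0.0 if v == 0 else v for v in values]
```

Normalising `-0.0` matters for any check that inspects raw bits rather than comparing values, since `-0.0 == 0.0` is true while the two encodings differ in the sign bit.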

Comments suppressed due to low confidence (3)

tilelang/language/loop.py:1

The error message contradicts the check condition. If unroll_factor is specified, pragma_unroll_explicit should be False (implicit unroll with factor), but the check raises an error when it's True. The message states "must be True" but should state "must be False".

tilelang/layout/gemm_sp.py:1

Variable name inconsistency: the code uses lowercase block_k but the function parameter is uppercase BlockK. This will cause a NameError at runtime.

tilelang/layout/gemm_sp.py:1

The function signature indicates it should return int, but line 132 returns a tuple (offset // k, offset % k). The return type should be tuple[int, int] to match the actual implementation.
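For the last two suppressed comments, the shape of the fix can be sketched in isolation. The function below is not the code in `tilelang/layout/gemm_sp.py`. `split_offset` is a hypothetical name, and the only thing taken from the review is the returned expression `(offset // k, offset % k)`. It shows a return annotation that matches the tuple and a parameter name that is spelled the same way in the signature and the body.

```python
def split_offset(offset: int, k: int) -> tuple[int, int]:
    """Map a flat offset to (row, col) for a row-major layout with k columns.

    The annotation is tuple[int, int] because two values are returned; the
    parameter `k` is used under the same spelling in the body, avoiding the
    kind of NameError the review comment describes.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return offset // k, offset % k
```

For non-negative offsets this is equivalent to `divmod(offset, k)`.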


gfvvz and others added 13 commits November 30, 2025 15:56

[Bugfix] Fix the jit_kernel issue (#1357)

c6a19fb


[Analysis] Enhance NestedLoopChecker with tile op cases (#1358)

b10ef75


[Bugfix] Update TIR registration for GemmSPPy to use tile operation (#1361)

e547d24

[Bugfix] Remove debug print in PyStmtFunctionVisitor (#1363)

f951b92

[Debug] Always include line info in NVCC command for improved profiling and mapping (#1364)

d88594a

[Refactor] Update condition for benchmarking in example_gemv.py and simplify cached library path handling in sparse.py (#1365)

6501bd0

[Enhancement] Add DISABLE_CACHE environment variables (#1368)

422fb12

[Refactor]: Remove useless include in atomicadd_vectorize.h (#1371)

1da3deb

Copilot AI review requested due to automatic review settings December 5, 2025 04:57

Copilot AI reviewed Dec 5, 2025


SigureMo merged commit 2a4bc34 into PFCCLab:paddle Dec 5, 2025
14 of 15 checks passed

SigureMo merged 13 commits into PFCCLab:paddle from tile-ai:main


7 participants
