other(test): add unit/e2e tests for attention and triton_kernels #522
Draft
rebel-jinhwan wants to merge 1 commit into dev
370652b to 3136223
…rage >98%)

- Add unit tests for triton_kernels: registration, fake ops, wrappers (prefill + decode for all 5 kernel types)
- Add unit tests for v1/attention/backends/flash_attention: custom op impls, backend class, metadata builder, forward dispatch routing, sinks routing
- Add host-reference comparison tests parametrized by TP head configs (kv_heads=1/2/4, groups=4/2/1, head_dim=64/128) to validate attention correctness across different tensor parallel configurations
- Add e2e compile tests (SDPA, masked, GQA, causal) that run on RBLN NPU via torch.compile(backend="rbln") and compare with host reference
- Add edge case tests: multi-batch flash_causal decode, causal prefill mask skip, noncausal decode per-batch mask, missing batch_pad assertion, sliding_window batch_attn_opt int32 cast
- Fix UnboundLocalError in forward() for causal+normal and non-causal+normal paths when VLLM_RBLN_COMPILE_MODEL=False (missing else branches)
- Add pragma:no-cover to triton.jit kernels and warmup() (NPU-only code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
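For reviewers unfamiliar with the failure mode behind the `UnboundLocalError` fix, here is a minimal sketch of how a missing `else` branch produces it. This is not the actual `forward()` code: `select_kernel_buggy`, `select_kernel_fixed`, the `compile_model` flag (standing in for `VLLM_RBLN_COMPILE_MODEL`), and the returned kernel names are all hypothetical and only illustrate the pattern.

```python
def select_kernel_buggy(is_causal: bool, compile_model: bool) -> str:
    """Hypothetical dispatch with the bug: no branch for compile_model=False."""
    if compile_model:
        kernel = "compiled_causal" if is_causal else "compiled_noncausal"
    # Missing else: when compile_model is False, `kernel` is never assigned,
    # so the return below raises UnboundLocalError.
    return kernel


def select_kernel_fixed(is_causal: bool, compile_model: bool) -> str:
    """Same dispatch with the else branch added, so every path binds `kernel`."""
    if compile_model:
        kernel = "compiled_causal" if is_causal else "compiled_noncausal"
    else:
        kernel = "eager_causal" if is_causal else "eager_noncausal"
    return kernel


def raises_unbound(fn, *args) -> bool:
    """Return True if calling fn(*args) raises UnboundLocalError."""
    try:
        fn(*args)
    except UnboundLocalError:
        return True
    return False
```

A regression test in this style would call the dispatch with the flag disabled for both the causal and non-causal cases and assert that a result comes back instead of an exception.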
3136223 to bc7cf09
🚀 Summary of Changes
📌 Related Issues / Tickets
✅ Type of Change
- release
- feature
- model
- core
- fix
- perf
- refactor
- docs
- other: please describe

🧪 How to Test

.........

📸 Screenshots / Logs (if applicable)
📋 Checklist
💬 Notes