Conversation

@avicizhu

Summary:
Add TritonBatchedFusedEmbeddingBag, a Triton-based alternative to BatchedFusedEmbeddingBag that uses TritonTableBatchedEmbeddingBags instead of SplitTableBatchedEmbeddingBagsCodegen.

Key additions:

  • TritonBatchedFusedEmbeddingBag class implementing BaseBatchedEmbeddingBag with Triton TBE backend
  • TritonEmbeddingFusedOptimizer with shard-aware parameter keys to support column-wise sharding
  • Helper methods (split_embedding_weights, flush, reset_cache_states) added to Triton TBE for TorchRec compatibility
  • FUSED_TRITON compute kernel enum for planner integration
  • Comprehensive unit tests for the new classes
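The shard-aware parameter keys mentioned above can be illustrated with a minimal sketch. With column-wise sharding, several shards of the same logical table can land on one rank, so the optimizer-state key must encode the shard's column offset to stay unique. The helper name `make_shard_param_key` and the key format below are hypothetical, not TorchRec's actual implementation:

```python
# Hypothetical sketch: shard-aware optimizer-state keys for column-wise
# sharding. The helper name and key format are illustrative only.

def make_shard_param_key(table_name: str, col_offset: int, num_col_shards: int) -> str:
    """Append the column-shard offset so two shards of the same table
    on one rank do not collide in the optimizer state dict."""
    if num_col_shards == 1:
        return table_name  # unsharded tables keep their plain name
    return f"{table_name}.col_{col_offset}"

# A table column-wise sharded into two 128-wide shards yields distinct keys:
keys = [make_shard_param_key("t1", off, 2) for off in (0, 128)]
```

Without the offset suffix, both shards would map to the same `"t1"` key and one shard's optimizer state would silently overwrite the other's.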
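The `split_embedding_weights` helper added for TorchRec compatibility conceptually returns per-table chunks of the fused weight buffer. A minimal sketch of that semantics, assuming the fused layout is a plain concatenation of tables (the real Triton TBE stores torch tensors and its layout may differ):

```python
# Hypothetical sketch of split_embedding_weights semantics: slice a fused,
# concatenated weight buffer back into per-table chunks. Plain Python lists
# stand in for torch tensors; the layout assumed here is illustrative.

def split_embedding_weights(fused, rows_per_table, dims_per_table):
    """Return one flat chunk per table, assuming tables are concatenated
    in order inside `fused`."""
    out, start = [], 0
    for rows, dim in zip(rows_per_table, dims_per_table):
        size = rows * dim
        out.append(fused[start:start + size])
        start += size
    return out

fused = list(range(10))  # two tables: 2 rows x dim 2, then 3 rows x dim 2
tables = split_embedding_weights(fused, [2, 3], [2, 2])
```

TorchRec uses this kind of per-table view when checkpointing or inspecting fused embedding kernels, which is why the helper is needed for compatibility.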

Differential Revision: D90527207

meta-cla bot added the CLA Signed label on Jan 31, 2026
meta-codesync bot commented Jan 31, 2026

@avicizhu has exported this pull request. If you are a Meta employee, you can view the originating Diff in D90527207.


Labels: CLA Signed, fb-exported, meta-exported
