fix(te-plugin): handle TE 2.15+ tuple return from _Linear / _GroupedLinear (#1481)
Conversation
TE 2.15+ changed `_Linear.forward` and `_GroupedLinear.forward` to return `(out, new_workspace)` tuples instead of a single tensor. ModelOpt's patched `te_quantized_linear_fn` / `te_grouped_quantized_linear_fn` still passed the whole tuple into `self.output_quantizer`, crashing inside `TensorQuantizer.forward` on `tuple.numel()`:

`AttributeError: 'tuple' object has no attribute 'numel'`

Mirror the existing pattern from `_QuantTELayerNormLinear.forward`: quantize only `output[0]` (the activation) and pass auxiliary workspace metadata through verbatim. TE <= 2.14 returns a single tensor and falls through the `isinstance` branch unchanged.

Already landed on `release/0.44.0` (`c897fbe`); this brings `main` in sync.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
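The failure mode is easy to reproduce with plain-Python stand-ins (no TE or ModelOpt required; `FakeTensor` is an illustrative class, not the real `torch.Tensor`):

```python
class FakeTensor:
    """Stand-in for a torch.Tensor: exposes numel(), which the quantizer calls."""

    def numel(self):
        return 1


# TE <= 2.14 convention: forward returns the activation tensor directly,
# so the quantizer's numel() call succeeds.
old_style = FakeTensor()
assert old_style.numel() == 1

# TE 2.15+ convention: forward returns (activation, workspace). Calling
# numel() on the whole tuple is exactly the reported crash.
new_style = (FakeTensor(), "new_workspace")
try:
    new_style.numel()
except AttributeError as e:
    print(e)  # 'tuple' object has no attribute 'numel'
```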
Codecov Report

```diff
@@            Coverage Diff             @@
##             main    #1481      +/-   ##
==========================================
+ Coverage   76.78%   76.86%   +0.08%
==========================================
  Files         473      473
  Lines       51413    51417       +4
==========================================
+ Hits        39476    39524      +48
+ Misses      11937    11893      -44
```
What does this PR do?
Type of change: Bug fix
TE 2.15+ changed `_Linear.forward` and `_GroupedLinear.forward` to return `(out, new_workspace)` tuples instead of a single tensor. ModelOpt's patched `te_quantized_linear_fn` / `te_grouped_quantized_linear_fn` still piped the whole tuple into `self.output_quantizer`, crashing inside `TensorQuantizer.forward` on `tuple.numel()`:

`AttributeError: 'tuple' object has no attribute 'numel'`

Mirror the existing pattern from `_QuantTELayerNormLinear.forward`: when the underlying TE call returns a tuple, quantize only `output[0]` (the activation tensor) and pass auxiliary workspace metadata through unchanged. TE <= 2.14 returns a single tensor and falls through the `isinstance` branch identically to before this change.

Already landed on `release/0.44.0` as commit `c897fbeaaf`; this brings `main` in sync. Follow-up to #1473 (signature introspection + `_forward` cache lookup), which fixed an earlier symptom of the same TE 2.15 signature change but not this tuple-return path.

Usage
No public API change. PTQ continues to work transparently across TE 2.x:
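The guard that makes this transparent can be sketched with plain-Python stand-ins (a sketch of the pattern only, not ModelOpt's actual code; `quantize_te_output` and the doubling quantizer are hypothetical names):

```python
def quantize_te_output(output, output_quantizer):
    """Quantize a TE forward result regardless of TE version."""
    if isinstance(output, tuple):
        # TE 2.15+: (activation, workspace, ...) -- quantize only the
        # activation and pass auxiliary entries through verbatim.
        return (output_quantizer(output[0]), *output[1:])
    # TE <= 2.14: a bare tensor -- quantize it directly, as before.
    return output_quantizer(output)


fake_quantizer = lambda x: x * 2  # stand-in for self.output_quantizer

# TE <= 2.14 style: plain value in, quantized value out.
assert quantize_te_output(3.0, fake_quantizer) == 6.0
# TE 2.15+ style: only the activation is quantized, workspace untouched.
assert quantize_te_output((3.0, "new_workspace"), fake_quantizer) == (6.0, "new_workspace")
```

The design simply mirrors what `_QuantTELayerNormLinear.forward` already did for tuple-returning TE calls, so both code paths stay symmetric.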
Testing
Verified locally against both TE 2.12 and TE 2.15.0 using:
Without this fix on TE 2.15, the same test fails immediately with `AttributeError: 'tuple' object has no attribute 'numel'`. With this fix, both versions exercise the same code paths and pass: TE <= 2.14 skips the `isinstance(output, tuple)` branch and behaves identically to before.

Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

CONTRIBUTING.md: N/A

Additional Information
Triggered by Megatron-Bridge failing tests after their TE 2.15 bump. The `release/0.44.0` cherry-pick was pushed directly (commit `c897fbeaaf`) so Bridge could unblock; this PR carries the same fix forward to `main`.