Onnx rebase from main - Update SimOp Structure, Implement Missing ONNX Ops, and Fix Check-in Tests#245
Pull request overview
This PR rebases the ONNX operator implementation work onto the latest main branch and implements a comprehensive set of missing ONNX operators required for training and inference graphs. The changes update the SimOp infrastructure to support a new factory pattern, add over 15 new operator implementations, and resolve all functional and static analysis failures in the check-in test suite.
Key Changes
- SimOp Factory Refactor: Restructured SimOpFactory to explicitly map operator types to their classes and moved factory definition to resolve forward reference issues
- Missing Operators: Implemented 15+ operators including FastGelu, SoftPlus, HardSwish, Gemm, FusedMatMul, SoftmaxCrossEntropyLoss, and various reduction/gradient operators
- Test Infrastructure: Updated ~40 test files to use SimOpFactory instead of base class instantiation, added comprehensive test coverage for new operators
- Static Analysis Fixes: Enabled PEP 695 type alias support in MyPy configuration and fixed type inference issues across the codebase
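
The factory refactor in the first bullet can be illustrated with a minimal sketch. Everything in it is hypothetical: the operator classes, the `_OP_REGISTRY` dictionary, and the `SimOpFactory` signature are assumptions made for illustration, not the actual code in ttsim/ops/op.py. It only shows why building the type-to-class mapping after the class definitions avoids a forward-reference NameError.

```python
from typing import Dict, Type

# Hypothetical sketch: names and signatures are illustrative assumptions,
# not the real ttsim API.


class SimOp:
    """Base class that records the operator type and an instance name."""

    def __init__(self, optype: str, name: str) -> None:
        self.optype = optype
        self.name = name


class ReluOp(SimOp):
    pass


class SoftplusOp(SimOp):
    pass


# The mapping is built only after every class body above has executed.
# Placing this dictionary above the class definitions would raise NameError
# at import time, because module-level names are resolved when the line runs.
_OP_REGISTRY: Dict[str, Type[SimOp]] = {
    "Relu": ReluOp,
    "Softplus": SoftplusOp,
}


def SimOpFactory(optype: str) -> Type[SimOp]:
    """Return the operator class registered for optype."""
    try:
        return _OP_REGISTRY[optype]
    except KeyError:
        raise NotImplementedError(f"operator {optype!r} is not registered") from None


# Tests would instantiate the concrete class via the factory, not the base class.
op = SimOpFactory("Softplus")("Softplus", "softplus_0")
print(type(op).__name__)
```

An explicit dictionary keeps the set of supported operators visible in one place, at the cost of having to add an entry for every new operator class.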
Reviewed changes
Copilot reviewed 54 out of 62 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/test_ops/test_space_to_depth.py | New comprehensive test suite for SpaceToDepth operation |
| tests/test_ops/test_softsign.py | New comprehensive test suite for SoftSign activation function |
| tests/test_ops/test_softplus.py | New comprehensive test suite for SoftPlus activation function |
| tests/test_ops/test_skip_layer_normalization.py | New test suite for SkipLayerNormalization transformer operation |
| tests/test_ops/test_simplified_layer_normalization.py | New test suite for SimplifiedLayerNormalization operation |
| tests/test_ops/test_shrink.py | New test suite for Shrink activation function |
| tests/test_ops/test_selu.py | New test suite for SELU activation function |
| tests/test_ops/test_scatter_nd.py | New test suite for ScatterND operation with reduction modes |
| tests/test_ops/test_scatter_elements.py | New test suite for ScatterElements operation |
| tests/test_ops/test_rotary_position_embedding.py | New test suite for RoPE used in modern transformer architectures |
| tests/test_ops/test_rms_normalization.py | New test suite for RMSNormalization operation |
| tests/test_ops/test_reshape_ext.py | New test suite for extended reshape functionality |
| tests/test_ops/test_reductions.py | Updated to use SimOpFactory instead of base SimOp class |
| tests/test_ops/test_reduce_*.py | New test suites for ReduceProd, ReduceMin, ReduceMean operations |
| tests/test_ops/test_quantize_linear.py | New test suite for quantization operation |
| tests/test_ops/test_qlinear_matmul.py | New test suite for quantized matrix multiplication |
| tests/test_ops/test_qattention.py | New test suite for quantized attention operation |
| tests/test_ops/test_pad.py | New test suite for Pad operation |
| pyproject.toml | Updated MyPy configuration to support PEP 695 and reduce false positives |
| data/metal/inf/15oct25/*.yaml | Added new performance metrics for vision and NLP models |
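
The activation test suites in the table are not shown on this page. As a hedged illustration, the sketch below gives scalar reference formulas that such tests could compare against, following the ONNX operator definitions and their default attribute values. The function names are hypothetical and are not taken from the test files.

```python
import math

# Default attribute values as given in the ONNX Selu operator definition.
SELU_ALPHA = 1.67326319217681884765625
SELU_GAMMA = 1.05070102214813232421875


def softplus(x: float) -> float:
    """ln(1 + e^x), written in a form that does not overflow for large x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softsign(x: float) -> float:
    """x / (1 + |x|)."""
    return x / (1.0 + abs(x))


def selu(x: float, alpha: float = SELU_ALPHA, gamma: float = SELU_GAMMA) -> float:
    """gamma * x for x > 0, gamma * (alpha * e^x - alpha) otherwise."""
    if x > 0.0:
        return gamma * x
    return gamma * (alpha * math.exp(x) - alpha)


def shrink(x: float, lambd: float = 0.5, bias: float = 0.0) -> float:
    """x + bias below -lambd, x - bias above lambd, and 0 in between."""
    if x < -lambd:
        return x + bias
    if x > lambd:
        return x - bias
    return 0.0


def hardswish(x: float) -> float:
    """x * max(0, min(1, x / 6 + 0.5))."""
    return x * max(0.0, min(1.0, x / 6.0 + 0.5))


for fn in (softplus, softsign, selu, shrink, hardswish):
    print(fn.__name__, fn(1.0))
```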
ttsim/ops/op.py
clone = itensor
return clone

def build_tmp_data_tensor(data, name):
Why do we redefine it? It is already defined in ops/desc/helpers.py
@smistTT The operators in ops/op.py were significantly refactored a month ago. When I rebased your earlier branch on main, that was the big change I incorporated. Your current op.py still follows the old structure and needs to be revised significantly. I have marked this PR as draft.
Is this PR still valid, or should we close it? What is its value? Thanks
Shalva needs this PR for BevFormer |
I suggest that a BevFormer PR be raised and rebased on main. The operators needed for BevFormer can be added to that PR.
I have updated the repo to leverage the new SimOp. |
Yes, whichever new ops you need, please add them. Good that you have updated the PR to use the new SimOp structure. I will review it today. Can you squash to 1 commit? Thanks
Rebase to main. Redo this PR since it is stale (defines |
d46fa00 to a0b8cc2
ssapreTT left a comment
- There are conflicts. Please rebase and resolve the conflicts.
- Squash these into preferably a single commit; Or maybe 2 or 3.
a0b8cc2 to e1a0aac
This commit squashes 21 commits to implement ~40 ONNX operators, add comprehensive tests, and refactor operation descriptors. Contributors: Samvit Kaul, Shreeniwas N Sapre, smistTT
e1a0aac to a7353d7
@smistTT I compared the outputs of existing workloads before and after this change and noticed that the cycle counts change for the Add, Hardswish and Matmul operators. Given the changes you have made, are these differences expected?
Hi @ssapreTT, I ran a few models comparing the main branch and this PR. Here are the results -
The main reason for the difference is -
Did you observe the same differences in perf projections?
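
A before/after comparison of per-operator cycles like the one discussed in this exchange can be sketched as follows. This is only an illustration: the `diff_cycles` helper and the dictionary layout are hypothetical, and the numbers are placeholders rather than measurements from this PR or from main.

```python
from typing import Dict, List, Tuple


def diff_cycles(
    before: Dict[str, int], after: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (optype, before, after) for every operator whose cycles changed."""
    changed = []
    for optype in sorted(set(before) | set(after)):
        b = before.get(optype, 0)  # operator missing on one side counts as 0
        a = after.get(optype, 0)
        if a != b:
            changed.append((optype, b, a))
    return changed


# Placeholder values purely for illustration; not results from this PR.
main_cycles = {"Add": 100, "Hardswish": 50, "Matmul": 400, "Relu": 20}
pr_cycles = {"Add": 110, "Hardswish": 45, "Matmul": 420, "Relu": 20}

for optype, b, a in diff_cycles(main_cycles, pr_cycles):
    print(f"{optype}: {b} -> {a} ({a - b:+d})")
```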
Description
This PR updates the SimOp infrastructure to support the new factory pattern, implements a comprehensive set of missing ONNX operators required for training and inference graphs, and resolves all functional and static analysis (MyPy) failures in the check-in test suite. The branch has been rebased onto the latest main.
Key Changes
- SimOpFactory Refactor: Restructured SimOpFactory to explicitly map operator types to their classes. Moved the factory definition to the end of the file to resolve forward reference NameError issues.
- Helper Functions: Restored and integrated critical helper functions (check_io_counts, update_output_tensor, pooling_shape_inference, get_tensor_broadcast_shape).
- Missing Operators Implemented: Added SimOp implementations for over 15 previously missing or commented-out operators, including:
  - Math/Activation: FastGelu, Softplus, HardSwish, Gemm, FusedMatMul.
  - Training/Loss: SoftmaxCrossEntropyLoss, InPlaceAccumulatorV2, ConcatTraining.
  - Reductions: ReduceSum, ReduceMean, ReduceProd, ReduceMin.
  - Gradients: DropoutGrad, FastGeluGrad, SoftmaxGrad_13, GatherGrad, SoftmaxCrossEntropyLossGrad.
- Cleanup: Removed duplicate class definitions (SoftPlusOp, ReciprocalOp) and established proper aliases (e.g., GeluOp = ONNXGeluOp).
- Import Fixes: Updated ~40 test files in tests/test_ops/ to correctly import SimOpFactory and operator classes.
- Test Logic:
  - Refactored tests/test_ops/test_reductions.py to instantiate operators via SimOpFactory instead of the base class.
  - Corrected assertion messages in tests/test_ops/test_softplus.py.
- Naming Conflicts: Renamed workloads/ttnn/llama3/test_attention.py → test_llama3_attention.py to fix pytest collection collisions.
- MyPy Configuration: Updated pyproject.toml to:
  - Set enable_incomplete_feature = ["NewGenericSyntax"] to support PEP 695 type aliases (fixing ~40 static analysis errors).
  - Disable warn_unreachable to reduce false positives in op.py.
- Strict Typing Fixes:
  - tools/ttsi_corr/chart_builder.py: Added assertions to handle Optional types (chart.x_axis, etc.).
  - workloads/LeViT/LeViT.py: Suppressed invalid type inference on callable objects.
  - ttsim/ops/__init__.py: Exported SimOpFactory to the package level.
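
The assertion-based handling of Optional types mentioned for chart_builder.py can be sketched as below. The `Axis` and `Chart` classes and the `set_x_title` function are hypothetical stand-ins, not the actual charting types used by tools/ttsi_corr/chart_builder.py; only the attribute name `x_axis` comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Axis:
    """Hypothetical stand-in for a chart axis object."""
    title: str


@dataclass
class Chart:
    """Hypothetical stand-in for a chart whose axis may be unset."""
    x_axis: Optional[Axis] = None


def set_x_title(chart: Chart, title: str) -> None:
    # Without the assertion, MyPy reports that chart.x_axis may be None and
    # so has no attribute "title". The assert narrows Optional[Axis] to Axis
    # for the rest of the function.
    assert chart.x_axis is not None, "chart.x_axis must be set before use"
    chart.x_axis.title = title


chart = Chart(x_axis=Axis(title=""))
set_x_title(chart, "cycles")
print(chart.x_axis)
```

An assertion fails loudly at runtime if the assumption is wrong, whereas a `# type: ignore` comment would silence MyPy and let a `None` slip through until the attribute access.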
Validation
- Functional Tests: checkin_tests.py all passed (including pytest suite and all simulation studies).
- Static Analysis: checkin_tests.py static passed (MyPy).
- Rebase: Verified clean rebase onto origin/main (commit 6212425).