Fix TreeEnsemble target id validation#29293
Open
edgchen1 wants to merge 4 commits into
Open
Conversation
Validate v5 leaf_targetids while converting to the internal v3 attribute representation so malformed models are rejected during initialization. Add full-range aggregator target index checks for defense in depth and cover invalid v5 leaf targets with provider tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Share target id range validation between v3 attributes and v5 conversion. Also store the checked target index once in the N-output aggregators before indexing predictions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a brief comment summarizing the target count and target id range invariants checked by ValidateTargetIds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move target/class id range validation into the shared TreeEnsembleCommon::Init path so all normalized entry points are covered. Use generic target/class id error messages and keep the v3 constructor's early positive-count check for base-value validation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a validation gap in the CPU TreeEnsemble implementation where opset-5 attribute normalization could bypass the v3 attribute constructor’s target-id checks, allowing invalid leaf_targetids to reach N-output aggregators and trigger out-of-bounds indexing. The fix centralizes target/class id validation in TreeEnsembleCommon::Init() and adds defense-in-depth bounds checks in the Sum/Min/Max N-output aggregators, along with updated and new negative tests.
Changes:
- Centralized validation of target/class counts and ids in
TreeEnsembleCommon::Init()to cover all attribute normalization paths (including opset 5). - Added explicit non-negative + in-range checks before indexing
predictionsin Sum/Min/Max N-output aggregators. - Added opset-5 negative tests for invalid
leaf_targetidsand updated expected error substrings in existing regressor negative tests.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| onnxruntime/test/providers/cpu/ml/treeregressor_test.cc | Updates expected failure message substrings to match new centralized validation errors. |
| onnxruntime/test/providers/cpu/ml/tree_ensembler_test.cc | Adds new opset-5 negative tests covering invalid leaf_targetids and zero target count. |
| onnxruntime/core/providers/cpu/ml/tree_ensemble_common.h | Centralizes target/class count + id validation in TreeEnsembleCommon::Init() so normalized paths are validated. |
| onnxruntime/core/providers/cpu/ml/tree_ensemble_attribute.h | Removes target/class id validation from the v3 attribute constructor (now handled in Init()); adjusts include ordering. |
| onnxruntime/core/providers/cpu/ml/tree_ensemble_aggregator.h | Adds bounds checks (including negative-id protection) before indexing predictions in Sum/Min/Max N-output aggregators. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix TreeEnsemble target id validation
Problem
TreeEnsembleopset 5 normalizesleaf_targetidsinto the internal v3-shaped attributes without going through the v3 attribute constructor. Invalid target ids could therefore reach the N-output aggregators, where Min/Max indexed thepredictionsvector without a bounds check.Fix
TreeEnsembleCommon::Init()so all normalized entry paths are covered.predictions.Testing
lintrunneron changed filesonnxruntime_provider_testMLOpTest.TreeEnsembleFloatMLOpTest.TreeEnsembleDoubleMLOpTest.TreeEnsembleSetMembershipMLOpTest.TreeEnsembleLeafOnlyMLOpTest.TreeEnsembleMinLeafTargetIdsOutsideBoundaryMLOpTest.TreeEnsembleMaxLeafTargetIdsOutsideBoundaryMLOpTest.TreeEnsembleNegativeLeafTargetIdsMLOpTest.TreeEnsembleZeroTargetsMLOpTest.TreeEnsembleLeafLikeMLOpTest.TreeEnsembleBigSetMLOpTest.TreeEnsembleIssue25400MLOpTest.TreeRegressorNegativeTargetIdsMLOpTest.TreeRegressorOutsideBoundaryTargetIdsMLOpTest.TreeEnsembleRegressorTargetIdsOutsideBoundary