
Fix TreeEnsemble target id validation #29293

Open
edgchen1 wants to merge 4 commits into main from edgchen1/tree_ensemble_checks

@edgchen1

Fix TreeEnsemble target id validation

Problem

TreeEnsemble opset 5 normalizes leaf_targetids into the internal v3-shaped attributes without going through the v3 attribute constructor. Invalid target ids could therefore reach the N-output aggregators, where Min/Max indexed the predictions vector without a bounds check.

Fix

  • Centralize target/class id validation in TreeEnsembleCommon::Init() so all normalized entry paths are covered.
  • Reject non-positive target/class counts, negative target/class ids, and ids greater than or equal to the target/class count.
  • Add defense-in-depth target index checks in Sum/Min/Max N-output aggregators before indexing predictions.
  • Add v5 negative tests and update affected v3 regressor negative test expectations.
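The validation rules listed above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the name `ValidateTargetOrClassIds`, its signature, and the error strings are hypothetical and are not taken from the onnxruntime sources, which report errors through their own status and enforcement mechanisms.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not the actual onnxruntime code).
// Returns an error message when the count or any id is invalid,
// and std::nullopt when everything is in range.
std::optional<std::string> ValidateTargetOrClassIds(
    int64_t n_targets_or_classes, const std::vector<int64_t>& ids) {
  // Rule 1: the target/class count must be positive.
  if (n_targets_or_classes <= 0) {
    return std::string("target/class count must be positive");
  }
  for (int64_t id : ids) {
    // Rule 2: ids must not be negative.
    if (id < 0) {
      return std::string("target/class id must be non-negative, got ") +
             std::to_string(id);
    }
    // Rule 3: ids must be strictly less than the count.
    if (id >= n_targets_or_classes) {
      return std::string("target/class id ") + std::to_string(id) +
             " must be less than count " +
             std::to_string(n_targets_or_classes);
    }
  }
  return std::nullopt;
}
```

Running a check of this shape once in a shared initialization path means every attribute normalization route sees the same rules, which is the motivation given for moving the validation into `TreeEnsembleCommon::Init()`.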

Testing

  • lintrunner on changed files
  • rebuilt onnxruntime_provider_test
  • targeted provider tests:
    • MLOpTest.TreeEnsembleFloat
    • MLOpTest.TreeEnsembleDouble
    • MLOpTest.TreeEnsembleSetMembership
    • MLOpTest.TreeEnsembleLeafOnly
    • MLOpTest.TreeEnsembleMinLeafTargetIdsOutsideBoundary
    • MLOpTest.TreeEnsembleMaxLeafTargetIdsOutsideBoundary
    • MLOpTest.TreeEnsembleNegativeLeafTargetIds
    • MLOpTest.TreeEnsembleZeroTargets
    • MLOpTest.TreeEnsembleLeafLike
    • MLOpTest.TreeEnsembleBigSet
    • MLOpTest.TreeEnsembleIssue25400
    • MLOpTest.TreeRegressorNegativeTargetIds
    • MLOpTest.TreeRegressorOutsideBoundaryTargetIds
    • MLOpTest.TreeEnsembleRegressorTargetIdsOutsideBoundary

edgchen1 and others added 4 commits June 26, 2026 12:01

  • Validate v5 leaf_targetids while converting to the internal v3 attribute representation so malformed models are rejected during initialization.

    Add full-range aggregator target index checks for defense in depth and cover invalid v5 leaf targets with provider tests.

    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

  • Share target id range validation between v3 attributes and v5 conversion.

    Also store the checked target index once in the N-output aggregators before indexing predictions.

    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

  • Add a brief comment summarizing the target count and target id range invariants checked by ValidateTargetIds.

    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

  • Move target/class id range validation into the shared TreeEnsembleCommon::Init path so all normalized entry points are covered.

    Use generic target/class id error messages and keep the v3 constructor's early positive-count check for base-value validation.

    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR fixes a validation gap in the CPU TreeEnsemble implementation where opset-5 attribute normalization could bypass the v3 attribute constructor’s target-id checks, allowing invalid leaf_targetids to reach N-output aggregators and trigger out-of-bounds indexing. The fix centralizes target/class id validation in TreeEnsembleCommon::Init() and adds defense-in-depth bounds checks in the Sum/Min/Max N-output aggregators, along with updated and new negative tests.

Changes:

  • Centralized validation of target/class counts and ids in TreeEnsembleCommon::Init() to cover all attribute normalization paths (including opset 5).
  • Added explicit non-negative and in-range checks before indexing predictions in Sum/Min/Max N-output aggregators.
  • Added opset-5 negative tests for invalid leaf_targetids and updated expected error substrings in existing regressor negative tests.
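As a rough illustration of the defense-in-depth check in the aggregators, the sketch below guards a Min-style merge with a range test before it indexes the predictions vector. The `ScoreSlot` type, the `MergeMinChecked` function, and the choice to return `false` on a bad index are all assumptions made for this example; the actual aggregators in `tree_ensemble_aggregator.h` have their own types and error handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-target slot: a running score plus a flag that records
// whether any leaf has contributed to this target yet.
struct ScoreSlot {
  float score = 0.0f;
  bool has_score = false;
};

// Hypothetical Min merge with a bounds check (not the onnxruntime code).
// Returns false and leaves predictions untouched when target_id is
// negative or not less than predictions.size().
bool MergeMinChecked(std::vector<ScoreSlot>& predictions,
                     int64_t target_id, float value) {
  if (target_id < 0 ||
      static_cast<uint64_t>(target_id) >= predictions.size()) {
    return false;
  }
  // Convert the checked id once, then index with it.
  const size_t index = static_cast<size_t>(target_id);
  ScoreSlot& slot = predictions[index];
  slot.score = (slot.has_score && slot.score < value) ? slot.score : value;
  slot.has_score = true;
  return true;
}
```

Checking for a negative id before the unsigned comparison matters here: casting a negative `int64_t` straight to an unsigned type yields a very large value, so the explicit `target_id < 0` test keeps the intent clear rather than relying on that wraparound.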

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| onnxruntime/test/providers/cpu/ml/treeregressor_test.cc | Updates expected failure message substrings to match new centralized validation errors. |
| onnxruntime/test/providers/cpu/ml/tree_ensembler_test.cc | Adds new opset-5 negative tests covering invalid leaf_targetids and zero target count. |
| onnxruntime/core/providers/cpu/ml/tree_ensemble_common.h | Centralizes target/class count and id validation in TreeEnsembleCommon::Init() so normalized paths are validated. |
| onnxruntime/core/providers/cpu/ml/tree_ensemble_attribute.h | Removes target/class id validation from the v3 attribute constructor (now handled in Init()); adjusts include ordering. |
| onnxruntime/core/providers/cpu/ml/tree_ensemble_aggregator.h | Adds bounds checks (including negative-id protection) before indexing predictions in Sum/Min/Max N-output aggregators. |
