Fix ch_axis bounds check in fused_moving_avg_obs_fake_quant_xpu by Copilot · Pull Request #3964 · intel/torch-xpu-ops

Copilot · 2026-06-11T01:59:14Z

fused_moving_avg_obs_fake_quant_xpu only validated the upper bound of ch_axis (ch_axis < x.dim()), so any negative value passed the check and was used as a raw index into a DimVector, causing an out-of-bounds access and segfault reachable from public Python APIs.

Added full range validation: ch_axis >= -x_dim && ch_axis < x_dim
Canonicalized negative values via dimension wrapping (e.g. -1 → last dim), matching the standard PyTorch convention used throughout the codebase
Updated all downstream uses of ch_axis in the function to use the wrapped value

const auto x_dim = x.dim();
TORCH_CHECK(
    ch_axis >= -x_dim && ch_axis < x_dim,
    "Error in fused_moving_avg_obs_fq_helper: ch_axis ",
    ch_axis,
    " is out of range for tensor with ",
    x_dim,
    " dimensions");
const auto wrapped_ch_axis =
    ch_axis < 0 ? ch_axis + x_dim : ch_axis;

Note: the upstream CUDA implementation (aten/src/ATen/native/quantized/cuda/FusedObsFakeQuant.cu) has the same bug.
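
The gap between the old and new checks can be modeled without PyTorch. What follows is a minimal sketch, not the code in FusedObsFakeQuant.cpp: old_check_passes, wrap_ch_axis, rejects, and check_examples are hypothetical names, and std::out_of_range stands in for the error that TORCH_CHECK raises.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Pre-fix check: upper bound only, so every negative ch_axis passes.
inline bool old_check_passes(std::int64_t ch_axis, std::int64_t x_dim) {
  return ch_axis < x_dim;
}

// Post-fix logic: validate the full range [-x_dim, x_dim), then wrap.
// std::out_of_range is a stand-in for the TORCH_CHECK failure.
inline std::int64_t wrap_ch_axis(std::int64_t ch_axis, std::int64_t x_dim) {
  if (ch_axis < -x_dim || ch_axis >= x_dim) {
    throw std::out_of_range("ch_axis is out of range");
  }
  return ch_axis < 0 ? ch_axis + x_dim : ch_axis;
}

// True when wrap_ch_axis rejects the input.
inline bool rejects(std::int64_t ch_axis, std::int64_t x_dim) {
  try {
    wrap_ch_axis(ch_axis, x_dim);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

// Example values for a 3-dim tensor.
inline void check_examples() {
  assert(old_check_passes(-1, 3));   // the bug: -1 slipped through unwrapped
  assert(wrap_ch_axis(-1, 3) == 2);  // the fix: -1 maps to the last dim
  assert(rejects(-4, 3));            // below -x_dim
  assert(rejects(3, 3));             // at or above x_dim
}
```

Under the old check, -1 would have been used directly as an index; under the new logic it is either mapped into [0, x_dim) or rejected before any indexing happens.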

chuanqi129 · 2026-06-11T02:08:12Z

@copilot please use pr-create skill to update the PR

chuanqi129 · 2026-06-11T02:11:34Z

@copilot The Python lint (flake8/ruff) and Clang format check failed. Please fix the lint errors in this PR.

Important: Do NOT update the plan in the PR description directly. Use reply comments to update the status.

Lint job log: https://github.com/intel/torch-xpu-ops/actions/runs/27319132554

Instructions:

Review the lint failure log via the link above
Fix all lint errors in the relevant source files
Ensure both Python lint and Clang format checks pass
Use reply comments to report progress - do NOT modify the PR description plan directly

Copilot · 2026-06-11T02:12:54Z

Updated the PR following the pr-creation skill:

Added reproducer test at test/repro/test_fused_obs_fake_quant_ch_axis_bounds.py covering out-of-range negative, valid negative (-1 wrapping), and out-of-range positive ch_axis values
Updated PR body with the required format including issue link, CUDA comparison notes, and Test: line

Commit: see latest push.

Copilot · 2026-06-11T02:15:23Z

Fixed all lint errors:

CLANGFORMAT: Collapsed wrapped_ch_axis assignment to single line in FusedObsFakeQuant.cpp
TESTOWNERS: Added # Owner(s): ["module: intel"] header to test file
TEST_HAS_MAIN: Converted from bare pytest functions to TestCase class with run_tests() main block
PYFMT: Removed extra blank line between imports

See commit Fix lint: clang-format, test ownership header, and main block.

chuanqi129 · 2026-06-11T08:55:52Z

@copilot please review this PR in depth

Copilot · 2026-06-11T08:59:53Z

PR Review: Fix ch_axis bounds check in fused_moving_avg_obs_fake_quant_xpu

Summary

Adds lower-bound validation and negative-index wrapping for ch_axis in fused_moving_avg_obs_fake_quant_xpu, preventing a segfault from out-of-bounds DimVector access reachable via public Python APIs. Overall this is a correct, well-scoped fix.

Correctness

maybe_wrap_dim vs manual wrapping — The standard PyTorch pattern for dimension canonicalization is at::maybe_wrap_dim(ch_axis, x.dim()), which handles range validation and wrapping in one call and is used extensively throughout the codebase (e.g., Sorting.cpp, Indexing.cpp, TensorTopK.cpp, SoftMaxKernels.cpp all in this repo). The manual TORCH_CHECK + ternary is functionally equivalent but diverges from the established convention. Consider replacing lines 42-50 with:

const auto wrapped_ch_axis = at::maybe_wrap_dim(ch_axis, x.dim());

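For tensors with at least one dimension, this produces the same range check and wrapping behavior, provides a consistent error message format, and reduces the code delta. One difference to watch: maybe_wrap_dim takes an optional wrap_scalar argument that defaults to true, which treats a 0-dim tensor as 1-dim and accepts -1 and 0. Passing wrap_scalar=false is needed to reject every axis for a 0-dim tensor, as the manual check does. The current manual approach is not wrong, so this is a "should fix" for consistency, not a correctness bug.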

Upstream CUDA parity — Verified. The upstream CUDA implementation (aten/src/ATen/native/quantized/cuda/FusedObsFakeQuant.cu) has the identical incomplete check (ch_axis < x.dim() only, no lower bound, no wrapping). This PR intentionally improves on the upstream, which is appropriate for a security/crash fix. The PR description correctly notes this.

0-dim tensor edge case — The manual check creates an empty valid range [0, 0) for 0-dim tensors, correctly rejecting all ch_axis values. This matches maybe_wrap_dim(wrap_scalar=false) semantics. No issue here.
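
The 0-dim case can be illustrated with a simplified model of dimension wrapping that takes an explicit wrap_scalar flag. This is a sketch under assumptions, not the c10 implementation: model_wrap_dim and model_rejects are hypothetical names, and the flag is assumed to mean "treat a 0-dim tensor as 1-dim when true".

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Simplified model of dimension wrapping with a wrap_scalar switch.
// Assumption: wrap_scalar=true treats a 0-dim tensor as if it had 1 dim.
inline std::int64_t model_wrap_dim(std::int64_t dim,
                                   std::int64_t ndim,
                                   bool wrap_scalar) {
  if (ndim == 0) {
    if (!wrap_scalar) {
      throw std::out_of_range("tensor has no dimensions");
    }
    ndim = 1;  // valid range becomes [-1, 0]
  }
  if (dim < -ndim || dim >= ndim) {
    throw std::out_of_range("dim is out of range");
  }
  return dim < 0 ? dim + ndim : dim;
}

// True when the model rejects the given dim.
inline bool model_rejects(std::int64_t dim,
                          std::int64_t ndim,
                          bool wrap_scalar) {
  try {
    model_wrap_dim(dim, ndim, wrap_scalar);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

With wrap_scalar=false the model rejects every axis for ndim == 0, which is what the manual range [0, 0) does. With wrap_scalar=true it accepts -1 and 0, so the flag matters if the manual check is ever swapped for a helper.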

Testing

Missing negative wrapping equivalence assertion — test_negative_one_ch_axis_wraps only checks that ch_axis=-1 doesn't raise, but doesn't verify it produces the same result as ch_axis=2 (the last dim for a shape [2,3,4] tensor). Adding an assertEqual between the two outputs would confirm the wrapping logic is correct end-to-end, not just that it doesn't crash.

Missing boundary value -x.dim() test — The test covers -1 (valid) and -1250999896764 (invalid), but doesn't test ch_axis=-3 which is the minimum valid value for a 3-dim tensor. This boundary is where off-by-one errors in the range check would surface.
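
The two gaps above amount to a small table of boundary cases for a 3-dim tensor. The sketch below checks only the index arithmetic in standalone C++; the real tests live in the Python test file and call the operator itself. AxisCase, wrap_axis_or_throw, boundary_cases_3d, and all_cases_hold are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// One boundary case: the input, whether it is valid, and the wrapped value.
struct AxisCase {
  std::int64_t ch_axis;
  bool valid;
  std::int64_t wrapped;  // only meaningful when valid is true
};

// Same range check and wrapping as the PR, with an exception as a stand-in
// for the TORCH_CHECK failure.
inline std::int64_t wrap_axis_or_throw(std::int64_t ch_axis,
                                       std::int64_t x_dim) {
  if (ch_axis < -x_dim || ch_axis >= x_dim) {
    throw std::out_of_range("ch_axis is out of range");
  }
  return ch_axis < 0 ? ch_axis + x_dim : ch_axis;
}

// Boundary values for a 3-dim tensor such as shape [2, 3, 4].
inline std::vector<AxisCase> boundary_cases_3d() {
  return {
      {-1250999896764LL, false, 0},  // far out of range
      {-4, false, 0},                // one below the minimum valid value
      {-3, true, 0},                 // minimum valid value, -x_dim
      {-1, true, 2},                 // wraps to the last dim
      {0, true, 0},
      {2, true, 2},                  // maximum valid value, x_dim - 1
      {3, false, 0},                 // one above the maximum
  };
}

// Returns true when every case behaves as the table says.
inline bool all_cases_hold(std::int64_t x_dim) {
  for (const AxisCase& c : boundary_cases_3d()) {
    bool threw = false;
    std::int64_t got = 0;
    try {
      got = wrap_axis_or_throw(c.ch_axis, x_dim);
    } catch (const std::out_of_range&) {
      threw = true;
    }
    if (threw == c.valid) return false;
    if (c.valid && got != c.wrapped) return false;
  }
  return true;
}
```

The -3 and -4 rows are the ones that would catch an off-by-one in the lower bound, and the -1 and 2 rows correspond to the suggested equivalence assertion.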

Backward Compatibility

The original code only handled 0 <= ch_axis < x.dim() correctly; negative values passed the check and then crashed. This PR also accepts negative values in [-x.dim(), -1] via wrapping. The change is strictly additive: previously valid inputs behave the same, and previously crashing inputs now either work (valid negative) or raise a clear error (invalid negative). No BC concern.

Recommendation

Approve — The fix is correct, well-scoped, and addresses a real crash. The suggestion to use maybe_wrap_dim and the test coverage gaps are improvements but don't block merging.

chuanqi129 · 2026-06-11T12:07:47Z

@copilot New UT failures detected in basic tests (0 total). Please check these new failures and analyze whether they are caused by the PR changes.

Important: Do NOT update the plan in the PR description directly. Use reply comments to update the status.

Job log: https://github.com/intel/torch-xpu-ops/actions/runs/27335414788
Full new failure report: https://github.com/intel/torch-xpu-ops/actions/runs/27335414788/artifacts/7563662388

New failures:

Instructions:

Download the full new failure report artifact for complete error messages
Analyze whether the failures are caused by PR changes
If caused by PR changes, fix those failures on this PR branch directly
Add failing test cases with pytest commands in the PR description for the reproducer job to use
Use reply comments to report analysis status - do NOT modify the PR description plan directly

Copilot · 2026-06-11T12:09:17Z

Analyzed the CI failures:

Windows build — Build infrastructure failure (dependency resolution), not related to PR changes.
test_compare_cpu_where_xpu_bfloat16 — This tests the where operator with bfloat16 dtype. This PR only modifies src/ATen/native/quantized/FusedObsFakeQuant.cpp (bounds check for ch_axis) and adds a new test file. The failing test is completely unrelated to quantization or ch_axis. This is a pre-existing/flaky failure, not caused by this PR.

chuanqi129 · 2026-06-11T15:19:37Z

@copilot New UT failures detected in op_ut tests (2 total). Please check these new failures and analyze whether they are caused by the PR changes.

Important: Do NOT update the plan in the PR description directly. Use reply comments to update the status.

Job log: https://github.com/intel/torch-xpu-ops/actions/runs/27335414788
Full new failure report: https://github.com/intel/torch-xpu-ops/actions/runs/27335414788/artifacts/7568712799

New failures:

Category	Class name	Test name	Status	Message
op_ut	third_party.torch-xpu-ops.test.xpu.export.test_passes_xpu.TestPasses	test_views_op_having_view_copy	failed	AttributeError: '_OpNamespace' 'aten' object has no attribute 'cholesky'
op_ut	third_party.torch-xpu-ops.test.xpu.test_decomp_xpu.HasDecompTest	test_has_decomposition	failed	AttributeError: '_OpNamespace' 'aten' object has no attribute 'cholesky'

Instructions:

Download the full new failure report artifact for complete error messages
Analyze whether the failures are caused by PR changes
If caused by PR changes, fix those failures on this PR branch directly
Add failing test cases with pytest commands in the PR description for the reproducer job to use
Use reply comments to report analysis status - do NOT modify the PR description plan directly

Copilot · 2026-06-11T15:24:51Z

Both failures error with AttributeError: '_OpNamespace' 'aten' object has no attribute 'cholesky' — the cholesky op is missing from the ATen namespace. This PR only modifies src/ATen/native/quantized/FusedObsFakeQuant.cpp and adds a test file. These failures are unrelated to quantization or ch_axis bounds checking and are not caused by this PR.

chuanqi129 · 2026-06-12T05:00:42Z

@copilot New UT failures detected in basic tests (0 total). Please check these new failures and analyze whether they are caused by the PR changes.

Important: Do NOT update the plan in the PR description directly. Use reply comments to update the status.

Job log: https://github.com/intel/torch-xpu-ops/actions/runs/27392036929
Full new failure report: https://github.com/intel/torch-xpu-ops/actions/runs/27392036929/artifacts/7583477959

New failures:

Instructions:

Download the full new failure report artifact for complete error messages
Analyze whether the failures are caused by PR changes
If caused by PR changes, fix those failures on this PR branch directly
Add failing test cases with pytest commands in the PR description for the reproducer job to use
Use reply comments to report analysis status - do NOT modify the PR description plan directly

Copilot · 2026-06-12T05:01:48Z

The only failure is test_compare_cpu_where_xpu_bfloat16 (same as the previous run) — a pre-existing where operator test unrelated to this PR's quantization changes. No new failures caused by this PR.

chuanqi129 · 2026-06-12T11:34:47Z

@copilot New UT failures detected in op_ut tests (3 total). Please check these new failures and analyze whether they are caused by the PR changes.

Important: Do NOT update the plan in the PR description directly. Use reply comments to update the status.

Job log: https://github.com/intel/torch-xpu-ops/actions/runs/27392036929
Full new failure report: https://github.com/intel/torch-xpu-ops/actions/runs/27392036929/artifacts/7590411935

New failures:

Category	Class name	Test name	Status	Message
op_ut	third_party.torch-xpu-ops.test.xpu.dynamo.test_wrap_inductor_compiled_regions_xpu.TestWrapInductorCompiledRegions	test_wrap_with_cache	failed	OSError: [Errno 39] Directory not empty: '/tmp/torchinductor_root/fxgraph'
op_ut	third_party.torch-xpu-ops.test.xpu.dynamo.test_wrap_inductor_compiled_regions_xpu.TestWrapInductorCompiledRegions	test_wrap_config_affects_cache_key	failed	AssertionError: Scalars are not equal!
op_ut	third_party.torch-xpu-ops.test.xpu.export.test_retraceability_xpu.RetraceExportTestExport	test_dynamic_lstm_retraceability_strict	failed	torch._dynamo.exc.UserError: Constraints violated (L['x'].size()[0])! For more information, run with TORCH_LOGS="+dynamic".

Instructions:

Download the full new failure report artifact for complete error messages
Analyze whether the failures are caused by PR changes
If caused by PR changes, fix those failures on this PR branch directly
Add failing test cases with pytest commands in the PR description for the reproducer job to use
Use reply comments to report analysis status - do NOT modify the PR description plan directly

Copilot · 2026-06-12T12:15:53Z

All three failures are in dynamo/inductor/export tests unrelated to this PR:

test_wrap_with_cache — torchinductor cache directory cleanup issue (OSError: Directory not empty)
test_wrap_config_affects_cache_key — inductor cache key assertion failure
test_dynamic_lstm_retraceability_strict — dynamo constraint violation in export retracing

This PR only modifies src/ATen/native/quantized/FusedObsFakeQuant.cpp (bounds check) and adds a test file. None of these failures are caused by this PR.

Copilot AI assigned Copilot and chuanqi129 Jun 11, 2026

Copilot started work on behalf of chuanqi129 June 11, 2026 01:59 View session

Copilot AI linked an issue Jun 11, 2026 that may be closed by this pull request

Bound check in fused_moving_avg_obs_fake_quant_xpu #3963

Open

Copilot AI changed the title ~~[WIP] Fix bound check in fused_moving_avg_obs_fake_quant_xpu~~ Fix ch_axis bounds check in fused_moving_avg_obs_fake_quant_xpu Jun 11, 2026

Copilot AI requested a review from chuanqi129 June 11, 2026 02:03

Copilot finished work on behalf of chuanqi129 June 11, 2026 02:03

Copilot started work on behalf of chuanqi129 June 11, 2026 02:08 View session

chuanqi129 added the ai_generated label Jun 11, 2026

Copilot finished work on behalf of chuanqi129 June 11, 2026 02:13

Copilot started work on behalf of chuanqi129 June 11, 2026 02:13 View session

github-actions Bot added the disable_e2e (Disable all e2e test jobs for the PR) and disable_distributed (Disable distributed UT test jobs for the PR) labels Jun 11, 2026

Copilot finished work on behalf of chuanqi129 June 11, 2026 02:15

chuanqi129 marked this pull request as ready for review June 11, 2026 08:52

Copilot started work on behalf of chuanqi129 June 11, 2026 08:56 View session

Copilot finished work on behalf of chuanqi129 June 11, 2026 09:00

Copilot started work on behalf of chuanqi129 June 11, 2026 12:07 View session

Copilot finished work on behalf of chuanqi129 June 11, 2026 12:09

Copilot started work on behalf of chuanqi129 June 11, 2026 15:19 View session

Copilot finished work on behalf of chuanqi129 June 11, 2026 15:25

Copilot AI added 3 commits June 12, 2026 11:12

Fix ch_axis bounds check in fused_moving_avg_obs_fake_quant_xpu

ba468b8

Add reproducer test for ch_axis bounds check fix

9ea3522

Fix lint: clang-format, test ownership header, and main block

f9b8e71

chuanqi129 force-pushed the copilot/bound-check-fused-moving-avg branch from 3747880 to f9b8e71 Compare June 12, 2026 03:12

Copilot started work on behalf of chuanqi129 June 12, 2026 05:01 View session

Copilot finished work on behalf of chuanqi129 June 12, 2026 05:02

Copilot started work on behalf of chuanqi129 June 12, 2026 12:15 View session

Copilot finished work on behalf of chuanqi129 June 12, 2026 12:16
