[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices #23828

Open
@fdwr

Issue description

Passing input and indices tensors whose batch dimensions do not match (2 vs 3 in this example) should return a failure status rather than crash the process.

gatherNdCrash.zip

{
    "op_type": "GatherND",
    "version": 12,
    "batch_dims": 1,
    "data": [[0,1,2],[10,11,12],[20,21,22]],
    "indices": [[1],[2]],
    "output": [1,7],
    "T": "float32"
}

Expected: Status failure
Actual: Fatal division by zero.

Note that an input batch dimension of 2 works (since both tensors then match), and an input batch dimension of 1 works too (ORT appears to either broadcast or clamp the input).
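
One possible explanation for why input batch sizes 1 and 2 survive while 3 crashes, sketched in Python. This assumes the kernel derives a slices-per-batch count by integer division and then divides each slice index by that count. I have not confirmed this against the source at line 85, and the function and variable names below are made up for illustration:

```python
from math import prod

def slices_per_batch(data_shape, indices_shape, batch_dims):
    # Hypothetical model of the arithmetic, not ORT's actual code.
    num_batches = prod(data_shape[:batch_dims])  # batch count taken from the input
    num_slices = prod(indices_shape[:-1])        # one slice per index tuple
    return num_slices // num_batches             # integer division, may round down to 0

def batch_of_slice(slice_idx, per_batch):
    # Dividing by per_batch == 0 is the suspected failure point.
    return slice_idx // per_batch

indices_shape = (2, 1)
for batch in (1, 2, 3):
    per_batch = slices_per_batch((batch, 3), indices_shape, batch_dims=1)
    try:
        print(batch, per_batch, batch_of_slice(1, per_batch))
    except ZeroDivisionError:
        print(batch, per_batch, "division by zero")
```

Under that assumption, only the input batch size of 3 gives zero slices per batch (2 // 3), which would match the behavior described above.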

Stack:

>	onnxruntime.dll!onnxruntime::GatherNDBase::PrepareForCompute::__l2::<lambda>(__int64 slice_idx) Line 85	C++
 	onnxruntime.dll!onnxruntime::GatherNDBase::PrepareForCompute::__l2::<lambda>(__int64 first, __int64 last) Line 111	C++
 	onnxruntime.dll!std::invoke<void <lambda>(__int64, __int64) &,__int64,__int64>(onnxruntime::GatherNDBase::PrepareForCompute::__l2::void <lambda>(__int64, __int64) & _Obj, __int64 && _Arg1, __int64 && <_Args2_0>) Line 1601	C++
 	onnxruntime.dll!std::_Invoker_ret<void>::_Call<void <lambda>(__int64, __int64) &,__int64,__int64>(onnxruntime::GatherNDBase::PrepareForCompute::__l2::void <lambda>(__int64, __int64) & _Func, __int64 && <_Vals_0>, __int64 && <_Vals_1>) Line 661	C++
 	onnxruntime.dll!std::_Func_impl_no_alloc<void <lambda>(__int64, __int64),void,__int64,__int64>::_Do_call(__int64 && <_Args_0>, __int64 && <_Args_1>) Line 821	C++
 	onnxruntime.dll!std::_Func_class<void,__int64,__int64>::operator()(__int64 <_Args_0>, __int64 <_Args_1>) Line 862	C++
 	onnxruntime.dll!onnxruntime::concurrency::ThreadPool::ParallelFor(__int64 n, const onnxruntime::TensorOpCost & c, const std::function<void __cdecl(__int64,__int64)> & f) Line 622	C++
 	onnxruntime.dll!onnxruntime::concurrency::ThreadPool::TryParallelFor(onnxruntime::concurrency::ThreadPool * tp, __int64 total, const onnxruntime::TensorOpCost & cost_per_unit, const std::function<void __cdecl(__int64,__int64)> & fn) Line 704	C++
 	onnxruntime.dll!onnxruntime::concurrency::ThreadPool::TryParallelFor(onnxruntime::concurrency::ThreadPool * tp, __int64 total, double cost_per_unit, const std::function<void __cdecl(__int64,__int64)> & fn) Line 252	C++
 	onnxruntime.dll!onnxruntime::GatherNDBase::PrepareForCompute<__int64>(const onnxruntime::TensorShape & input_shape, const onnxruntime::Tensor * indices_tensor, const __int64 bytes_per_value, onnxruntime::GatherNDBase::Prepare & p, onnxruntime::concurrency::ThreadPool * tp) Line 106	C++
 	onnxruntime.dll!onnxruntime::GatherND::Compute(onnxruntime::OpKernelContext * context) Line 171	C++

To reproduce

onnxruntime_perf_test.exe -I -r 1 -e cpu gatherNdCrash.onnx

Urgency

Not blocking, but this should be added to the ORT fuzzing test cases, since an ONNX model embedded in another document could crash the user process. As a workaround, callers can validate untrusted input before passing it to the ORT backend.
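
A minimal sketch of such a caller-side check, written in Python for brevity. The rules follow my reading of the ONNX GatherND description (leading batch dimensions must be equal, batch_dims < min(rank(data), rank(indices)), and the last indices dimension must lie between 1 and rank(data) - batch_dims). The function name check_gather_nd_shapes is hypothetical and not part of ORT:

```python
def check_gather_nd_shapes(data_shape, indices_shape, batch_dims=0):
    """Return an error message, or None if the shapes look compatible.

    Hypothetical helper for pre-validating untrusted models; not an ORT API.
    """
    r, q = len(data_shape), len(indices_shape)
    if r < 1 or q < 1:
        return "data and indices must each have rank >= 1"
    if batch_dims < 0 or batch_dims >= min(r, q):
        return "batch_dims must satisfy 0 <= batch_dims < min(rank(data), rank(indices))"
    data_batch = tuple(data_shape[:batch_dims])
    indices_batch = tuple(indices_shape[:batch_dims])
    if data_batch != indices_batch:
        return f"batch dimensions differ: data {data_batch} vs indices {indices_batch}"
    last = indices_shape[-1]
    if last < 1 or last > r - batch_dims:
        return "indices_shape[-1] must be between 1 and rank(data) - batch_dims"
    return None

# Shapes from this issue: data is 3x3, indices is 2x1, batch_dims is 1.
print(check_gather_nd_shapes((3, 3), (2, 1), batch_dims=1))
```

This strict check would also reject the input batch size of 1 that ORT currently accepts, so relax the equality test if that leniency needs to be preserved.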

Platform

Windows

OS Version

Windows 11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

e76bd2f

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response
