[Bug] Loop constant-folding crashes with "Evaluation failed on GatherND" when merged input is a Constant #35464

@goyaladitya05

OpenVINO Version

2026.1.0

Operating System

Windows System

Device used for inference

CPU

Framework

None

Model used

No response

Issue description

When a Loop op has a merged input fed from a Constant node, compile_model() throws during the constant-folding pass if the loop body contains a GatherND op:

Exception from src/core/src/model.cpp:549:
Evaluation failed on opset8::GatherND GatherND_122
  (opset1::Parameter Parameter_70[0]:f32[2,3,3], opset1::Concat Concat_119[0]:i32[2,3]) -> (f32[2])

The same graph compiles and runs correctly when the merged input comes from a Parameter instead of a Constant. With a Constant merged input, the constant-folding pass calls Loop::evaluate(), which evaluates the loop body via Model::evaluate(). GatherND has no evaluate() override in the core op (only in the template backend plugin), so the default Node::evaluate() returns false, and Model::evaluate() treats that return value as fatal.
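The failure mode described above can be illustrated with a toy model in plain Python. This is only a sketch of the control flow as I understand it, not OpenVINO code: the classes `Node`, `Add`, `GatherNDLike` and the function `evaluate_body` are hypothetical stand-ins invented for illustration.

```python
class Node:
    """Hypothetical stand-in for a graph op (not an OpenVINO class)."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, inputs):
        # Default behaviour: no reference implementation is available,
        # which is signalled by returning False.
        return False, None


class Add(Node):
    """An op that does provide an evaluate() override."""

    def evaluate(self, inputs):
        return True, inputs[0] + inputs[1]


class GatherNDLike(Node):
    """An op with no evaluate() override, so it inherits the default."""


def evaluate_body(nodes, value):
    """Evaluate ops in order; a False result from any op is treated as fatal."""
    for node in nodes:
        ok, result = node.evaluate([value, 1])
        if not ok:
            raise RuntimeError(f"Evaluation failed on {node.name}")
        value = result
    return value


print(evaluate_body([Add("Add_1")], 1))
try:
    evaluate_body([Add("Add_1"), GatherNDLike("GatherND_122")], 1)
except RuntimeError as err:
    print(err)
```

In this sketch a single op without an override is enough to abort evaluation of the whole body, which matches the behaviour reported here.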

Step-by-step reproduction

import numpy as np
import openvino.opset15 as ov_opset
from openvino import Type, Model, compile_model

x = np.random.rand(2, 3, 3).astype(np.float32)
x = x @ x.transpose(0, 2, 1)

a_const = ov_opset.constant(x)          # <-- Constant triggers constant-fold
a_ov = a_const.output(0)

zero      = ov_opset.constant(0,  Type.i32).output(0)
one       = ov_opset.constant(1,  Type.i32).output(0)
minus_one = ov_opset.constant(-1, Type.i32).output(0)

a_shape    = ov_opset.shape_of(a_ov, Type.i32).output(0)
n          = ov_opset.gather(a_shape, minus_one, zero).output(0)
batch_prod = ov_opset.reduce_prod(
    ov_opset.slice(a_shape,
        ov_opset.constant([0], Type.i32), ov_opset.constant([-2], Type.i32),
        ov_opset.constant([1], Type.i32), ov_opset.constant([0], Type.i32)).output(0),
    zero, False).output(0)
flat_shape = ov_opset.concat([
    ov_opset.unsqueeze(batch_prod, zero).output(0),
    ov_opset.unsqueeze(n, zero).output(0),
    ov_opset.unsqueeze(n, zero).output(0),
], axis=0).output(0)
A_flat = ov_opset.reshape(a_ov, flat_shape, False).output(0)

eye_n  = ov_opset.one_hot(
    ov_opset.range(zero, n, one, output_type=Type.i32).output(0),
    n, ov_opset.constant(1.0, Type.f32), ov_opset.constant(0.0, Type.f32), axis=-1).output(0)
V_flat = ov_opset.broadcast(eye_n, flat_shape).output(0)

loop = ov_opset.loop(ov_opset.constant(1, Type.i32).output(0),
                     ov_opset.constant(True, Type.boolean).output(0))

A_param = ov_opset.parameter([2, 3, 3], Type.f32)
V_param = ov_opset.parameter([2, 3, 3], Type.f32)
A_curr  = A_param.output(0)

A_shape  = ov_opset.shape_of(A_curr, Type.i32).output(0)
l_batch  = ov_opset.gather(A_shape, zero, zero).output(0)
l_n      = ov_opset.gather(A_shape, minus_one, zero).output(0)

eye_i   = ov_opset.one_hot(
    ov_opset.range(zero, l_n, one, output_type=Type.i32).output(0),
    l_n, ov_opset.constant(1.0, Type.f32), ov_opset.constant(0.0, Type.f32), axis=-1).output(0)
mask_b  = ov_opset.broadcast(
    ov_opset.subtract(ov_opset.constant(1.0, Type.f32), eye_i).output(0), A_shape).output(0)
A_off_flat = ov_opset.reshape(
    ov_opset.abs(ov_opset.multiply(A_curr, mask_b)).output(0),
    ov_opset.concat([
        ov_opset.unsqueeze(l_batch, zero).output(0),
        ov_opset.unsqueeze(ov_opset.multiply(l_n, l_n), zero).output(0),
    ], axis=0).output(0), False).output(0)

topk   = ov_opset.topk(A_off_flat, ov_opset.constant(1, Type.i32), 1, 'max', 'value')
argmax = ov_opset.squeeze(topk.output(1), one).output(0)
p      = ov_opset.divide(argmax, l_n).output(0)
q      = ov_opset.mod(argmax, l_n).output(0)

b_u = ov_opset.unsqueeze(
    ov_opset.range(zero, l_batch, one, output_type=Type.i32).output(0), one).output(0)
p_u = ov_opset.unsqueeze(p, one).output(0)
q_u = ov_opset.unsqueeze(q, one).output(0)

pp_idx = ov_opset.concat([b_u, p_u, p_u], axis=1).output(0)
App    = ov_opset.gather_nd(A_curr, pp_idx).output(0)         # <-- fails here

A_next = ov_opset.add(A_curr, ov_opset.constant(0.0, Type.f32)).output(0)

body = Model([ov_opset.constant(True, Type.boolean).output(0), A_next, V_param.output(0)],
             [A_param, V_param], 'body')
loop.set_function(body)
loop.set_special_body_ports([-1, 0])
loop.set_merged_input(A_param, A_flat, A_next)
loop.set_merged_input(V_param, V_flat, V_param.output(0))
A_out = loop.get_iter_value(A_next)

model    = Model([A_out], [], 'test')
compiled = compile_model(model, 'CPU')   # <-- RuntimeError thrown here

Relevant log output

RuntimeError: Exception from src/core/src/model.cpp:549:
Evaluation failed on opset8::GatherND GatherND_122
  (opset1::Parameter Parameter_70[0]:f32[2,3,3], opset1::Concat Concat_119[0]:i32[2,3]) -> (f32[2])
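For reference, the shapes in the log (f32[2,3,3] data, i32[2,3] indices, f32[2] output) are consistent with GatherND picking one scalar per index row, assuming batch_dims is 0. The following pure-Python sketch shows that behaviour on nested lists; `gather_nd` here is a hypothetical helper written for illustration, not the OpenVINO implementation, and it only covers the case where each index row is a full coordinate.

```python
def gather_nd(data, indices):
    """Pick one element of `data` per row of `indices`.

    Assumes batch_dims == 0 and that every row holds a full coordinate,
    so the result has one scalar per row.
    """
    out = []
    for coord in indices:
        item = data
        for i in coord:
            item = item[i]
        out.append(item)
    return out


# data[b][r][c] = b * 9 + r * 3 + c, shape [2, 3, 3]
data = [[[b * 9 + r * 3 + c for c in range(3)] for r in range(3)]
        for b in range(2)]

# Rows are [batch, p, p], mirroring how pp_idx is built in the reproducer.
indices = [[0, 1, 1], [1, 2, 2]]

result = gather_nd(data, indices)
print(result)  # [4, 17]
```

Each row of the indices selects the diagonal element A[b, p, p] for one batch entry, so the output has one value per batch.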

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
