[Bug]: rearrange_cache silently skips FP32 KV cache copy due to duplicate size == 2 condition #34668

@KarSri7694

Description

OpenVINO Version

2026.0.0

Operating System

Windows 11

Device used for inference

GPU

Framework

None

Model used

No response

Issue description

In multi_tensor_variable_state.cpp in the intel_gpu plugin, the rearrange_cache function repeats the size() == 2 check in its else-if branch. The branch intended for 4-byte elements is therefore unreachable, so FP32 KV cache data is never copied and the output is silently garbage.

Buggy code:

if (ov::element::Type(kv_layout.data_type).size() == 2)
    copy_element<uint16_t>(...);
else if (ov::element::Type(kv_layout.data_type).size() == 2)  // ← duplicate, never true
    copy_element<uint32_t>(...);

Expected: The second condition should be == 4 so that 4-byte data such as fp32 or int32 is handled.

Impact: When running beam search on GPU with FP32 precision and calling get_state(), the output buffer is never written and contains uninitialized memory. This causes silently corrupted results with no error or warning.
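
The effect can be illustrated outside the plugin with a minimal, self-contained sketch. Everything here is a stand-in, not the plugin's code: `copy_element` is a hypothetical memcpy-based helper, `dispatch_buggy` mirrors only the duplicated condition, and a sentinel value of -1.0f plays the role of the uninitialized output buffer so the result is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the plugin's copy_element: copies `count`
// elements of width sizeof(T) from src to dst.
template <typename T>
void copy_element(const void* src, void* dst, std::size_t count) {
    std::memcpy(dst, src, count * sizeof(T));
}

// Mirrors the duplicated condition from the report.
// Returns true if any copy was performed.
bool dispatch_buggy(std::size_t elem_size, const void* src, void* dst, std::size_t count) {
    if (elem_size == 2) {
        copy_element<std::uint16_t>(src, dst, count);
        return true;
    } else if (elem_size == 2) {  // duplicate of the check above, unreachable
        copy_element<std::uint32_t>(src, dst, count);
        return true;
    }
    return false;  // 4-byte element types end up here and nothing is copied
}

// Fills the destination with a sentinel, runs the dispatch for float,
// and reports whether the destination was left untouched.
bool fp32_destination_untouched() {
    const std::vector<float> src = {1.0f, 2.0f, 3.0f};
    std::vector<float> dst(src.size(), -1.0f);
    assert(dst.size() == src.size());
    const bool copied = dispatch_buggy(sizeof(float), src.data(), dst.data(), src.size());
    return !copied && dst == std::vector<float>(src.size(), -1.0f);
}
```

In this sketch `fp32_destination_untouched()` returns true: no branch matches a 4-byte element, so the destination keeps whatever it held before the call, and no error is raised.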

Fix:

if (ov::element::Type(kv_layout.data_type).size() == 2)
    copy_element<uint16_t>(...);
else if (ov::element::Type(kv_layout.data_type).size() == 4)
    copy_element<uint32_t>(...);
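
As a possible hardening on top of the one-character fix, the dispatch could also reject element sizes it does not handle instead of falling through without copying. The sketch below is only an assumption about how that might look: `copy_element` is again a hypothetical memcpy-based stand-in, `dispatch_fixed` is an illustrative name, and throwing `std::invalid_argument` is one option among several (the plugin may prefer its own assertion macros).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the plugin's copy_element: copies `count`
// elements of width sizeof(T) from src to dst.
template <typename T>
void copy_element(const void* src, void* dst, std::size_t count) {
    assert(src != nullptr && dst != nullptr);
    std::memcpy(dst, src, count * sizeof(T));
}

// Dispatches on element width; unsupported widths raise an error
// rather than leaving the destination buffer unwritten.
void dispatch_fixed(std::size_t elem_size, const void* src, void* dst, std::size_t count) {
    switch (elem_size) {
    case 2:  // e.g. fp16
        copy_element<std::uint16_t>(src, dst, count);
        break;
    case 4:  // e.g. fp32 or int32
        copy_element<std::uint32_t>(src, dst, count);
        break;
    default:
        throw std::invalid_argument("unsupported KV cache element size: " +
                                    std::to_string(elem_size));
    }
}
```

With a `switch`, a repeated `case` label is a compile-time error, so this particular kind of duplication could not recur unnoticed.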

If requested, I can open a related PR.

Step-by-step reproduction

No response

Relevant log output

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
