Heap Out-of-Bounds Read in IR String Constant Deserialization #34489

Open
bumbosiepsak wants to merge 4 commits into openvinotoolkit:master from bumbosiepsak:CVS-182172-heap-out-of-bounds-read-in-ir-string-constant

Conversation

@bumbosiepsak
Contributor

@bumbosiepsak bumbosiepsak commented Mar 4, 2026

  • Vulnerability fixed by validation of memory access in aux_unpack_string_tensor()

Tickets:

@bumbosiepsak bumbosiepsak requested a review from a team as a code owner March 4, 2026 12:05
@bumbosiepsak bumbosiepsak requested review from Copilot and removed request for a team March 4, 2026 12:05
@github-actions github-actions bot added the category: Core OpenVINO Core (aka ngraph) label Mar 4, 2026
Contributor

Copilot AI left a comment

Pull request overview

Fixes a heap out-of-bounds read vulnerability in OpenVINO IR packed string-constant deserialization by adding stronger validation of the packed buffer header and per-string offsets before constructing std::string instances.

Changes:

  • Validate num_strings is non-negative and ensure the computed header fits within the provided buffer.
  • Validate per-string begin/end offsets (non-negative, monotonic, and within the data region) before reading.
  • Use validated size_t offsets when constructing unpacked std::string values.
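
The checks listed above can be sketched in a standalone form. This is a minimal illustration, not the code from this PR: `unpack_strings_checked` is a hypothetical name, exceptions stand in for `OPENVINO_ASSERT`, and the layout (`[num_strings][begin0][end0 ... endN-1]` as `int32_t`, followed by the raw string bytes) is assumed from the review thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the unpacking routine; not OpenVINO's implementation.
std::vector<std::string> unpack_strings_checked(const std::uint8_t* data, std::size_t size) {
    using header_element_t = std::int32_t;
    constexpr std::size_t elem = sizeof(header_element_t);

    // Reads the i-th header element without assuming the buffer is aligned.
    auto read = [&](std::size_t i) {
        header_element_t v;
        std::memcpy(&v, data + i * elem, elem);
        return v;
    };

    if (size < elem) throw std::runtime_error("missing strings count");
    const header_element_t num_signed = read(0);
    if (num_signed < 0) throw std::runtime_error("negative number of strings");
    const auto num_strings = static_cast<std::size_t>(num_signed);

    // The header needs 2 + num_strings elements. Comparing element counts
    // (instead of multiplying first) avoids overflow on a 32-bit size_t.
    const std::size_t available = size / elem;
    if (available < 2 || num_strings > available - 2)
        throw std::runtime_error("header does not fit in the buffer");

    const std::size_t header_size = elem * (2 + num_strings);
    const std::size_t payload_size = size - header_size;
    const char* payload = reinterpret_cast<const char*>(data + header_size);

    std::vector<std::string> result;
    result.reserve(num_strings);
    header_element_t begin_signed = read(1);
    for (std::size_t i = 0; i < num_strings; ++i) {
        const header_element_t end_signed = read(2 + i);
        if (begin_signed < 0 || end_signed < 0) throw std::runtime_error("negative string offset");
        if (begin_signed > end_signed) throw std::runtime_error("begin offset greater than end offset");
        const auto begin = static_cast<std::size_t>(begin_signed);
        const auto end = static_cast<std::size_t>(end_signed);
        if (end > payload_size) throw std::runtime_error("string offset exceeds buffer bounds");
        // Only validated size_t offsets reach the std::string constructor.
        result.emplace_back(payload + begin, end - begin);
        begin_signed = end_signed;  // each string starts where the previous one ended
    }
    return result;
}
```

In this sketch a buffer with zero strings still has to carry the count and the first begin offset, mirroring the original `4 + 4 + 4 * num_strings` size check quoted later in the thread.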

@praasz
Contributor

praasz commented Mar 4, 2026

@bumbosiepsak
Please provide a unit test.

@praasz praasz self-assigned this Mar 4, 2026
@praasz praasz added the pr: needs tests PR needs tests updating label Mar 4, 2026
@bumbosiepsak bumbosiepsak force-pushed the CVS-182172-heap-out-of-bounds-read-in-ir-string-constant branch from 01760f4 to 8c65373 Compare March 9, 2026 12:11
// Calculate header size: [num_strings][0][offset1...offsetN]
const size_t num_elements = static_cast<size_t>(num_strings);
// Check addition overflow for header element count (2 + num_elements)
OPENVINO_ASSERT(num_elements <= std::numeric_limits<size_t>::max() - 2,
Contributor

Is this overflow even possible? A non-negative i32 converted to size_t is at most 1/2 or 1/4 of the size_t maximum.

Contributor Author

@bumbosiepsak bumbosiepsak Mar 12, 2026

I'm not sure if I understand 😊 Could you elaborate a little, please?

// first run over all elements: calculate total memory required to hold all strings
header_size = sizeof(int32_t) * (1 + 1 + num_elements);
// Check addition overflow for header element count (1 + 1 + num_elements)
OPENVINO_ASSERT(num_elements <= std::numeric_limits<size_t>::max() - 2,
Contributor

Add a test for this assertion.

Contributor Author

I removed this check in favour of checking that size_t is at least as big as int32_t. (I hope we don't need to support 16-bit platforms...)

The positive range of size_t is then always sufficient (twice as big as that of int32_t).
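
The range argument can be checked at compile time. A minimal sketch, assuming the header layout from this thread (count, first begin offset, then one end offset per string); `max_header_elements` and `max_header_bytes_fit` are hypothetical names, not part of the PR:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

using header_element_t = std::int32_t;

// With size_t at least as wide as int32_t, every non-negative int32_t value
// converts to size_t unchanged, and adding 2 cannot wrap around.
static_assert(sizeof(header_element_t) <= sizeof(std::size_t),
              "size_t must be at least as wide as the header element type");

// Largest possible header element count: count + first begin + INT32_MAX end offsets.
constexpr std::size_t max_header_elements() {
    return static_cast<std::size_t>(std::numeric_limits<header_element_t>::max()) + 2;
}

// True when even the largest header size in bytes is representable in size_t.
constexpr bool max_header_bytes_fit() {
    return max_header_elements() <=
           std::numeric_limits<std::size_t>::max() / sizeof(header_element_t);
}
```

The element count always fits. The byte size `sizeof(int32_t) * (2 + n)` may not fit where size_t is 32 bits wide, which is one reason to compare element counts against the buffer size rather than multiplying first.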

Contributor

I think 16-bit platforms are not a use case for OV.

Contributor

The test can be put into the string_align_buffer_test.cpp file.

Contributor Author

Actually this test file grew a little. Is it OK if I keep it separate?

I'll move content to string_align_buffer_test.cpp if you say so ;)

Contributor

It is better to keep all of them in the same file.

OPENVINO_ASSERT(int32_t(size) >= 4 + 4 + 4 * num_strings,
"Incorrect packed string tensor format: the packed string tensor must contain first "
"string offset and end indices");
OPENVINO_ASSERT(num_strings >= 0, "Incorrect packed string tensor format: negative number of strings");
Contributor

Is this check required?
If the value is negative, it becomes a large number after conversion, so maybe the updated std::numeric_limits<size_t>::max() check can detect it as well?

Contributor Author

I stuck to the convention already established in this function that the header consists of signed integers.

Question: can we assume that they are unsigned (and switch to treating them as such everywhere)?
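
Whether the size_t overflow check alone would also catch negative counts can be tested directly. A small sketch with a hypothetical helper name, reproducing only the `num_elements <= max - 2` condition quoted earlier in the thread:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Returns true if the "num_elements <= SIZE_MAX - 2" check on its own
// would accept the given signed count after conversion to size_t.
bool overflow_check_accepts(std::int32_t num_strings) {
    // Signed-to-unsigned conversion is modular: -1 becomes SIZE_MAX, -2 becomes SIZE_MAX - 1, ...
    const auto num_elements = static_cast<std::size_t>(num_strings);
    return num_elements <= std::numeric_limits<std::size_t>::max() - 2;
}
```

Under this condition -1 and -2 are rejected, but -3 converts to exactly `SIZE_MAX - 2` and is accepted, so a separate `num_strings >= 0` check (or a comparison against the buffer size) is still needed to reject every negative count.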

- Additional overflow protection added
- Variable usage streamlined with type unification
@bumbosiepsak bumbosiepsak force-pushed the CVS-182172-heap-out-of-bounds-read-in-ir-string-constant branch 2 times, most recently from e36bd58 to adeba47 Compare March 12, 2026 17:55
header_element_t* header = reinterpret_cast<header_element_t*>(data.get());
header[0] = static_cast<header_element_t>(strings_count);

if (strings_count > 0) {
Contributor Author

Question: what is the desired buffer layout when zero strings are provided?

Do we still want to create and fill the tail (i.e. the begin and end offsets of a string that doesn't exist)?

Contributor

If a Constant has shape [2] and its string count is zero, the constant is not correct, because the shape and the number of data elements do not match.

It looks like some kind of error case

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

@bumbosiepsak bumbosiepsak force-pushed the CVS-182172-heap-out-of-bounds-read-in-ir-string-constant branch from adeba47 to 2f32e86 Compare March 12, 2026 18:35
@bumbosiepsak bumbosiepsak requested a review from Copilot March 12, 2026 18:37
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@bumbosiepsak bumbosiepsak enabled auto-merge March 13, 2026 15:54
data = std::shared_ptr<uint8_t>(new uint8_t[header_size], std::default_delete<uint8_t[]>());

header_element_t* header = reinterpret_cast<header_element_t*>(data.get());
header[0] = static_cast<header_element_t>(strings_count);
Contributor

Both strings_count (from get_num_elements()) and current_string_end are size_t values cast to int32_t via static_cast without range validation. If either value exceeds INT32_MAX, the cast does not preserve the value: the result is implementation-defined before C++20 and wraps modulo 2^32 from C++20 on, so the header could receive a negative count or offset. Since this is a security-focused PR, the packing side should also be hardened:

OPENVINO_ASSERT(strings_count <= static_cast<size_t>(std::numeric_limits<header_element_t>::max()),
"Too many strings to pack: strings_count exceeds int32_t range");
Similarly, current_string_end should be validated before each cast:

OPENVINO_ASSERT(current_string_end <= static_cast<size_t>(std::numeric_limits<header_element_t>::max()),
"Cumulative string size exceeds int32_t range");

- Reworked according to review remarks
- Naming cleaned
- Checks made tighter
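
The packing-side hardening raised in the review can be sketched as one checked narrowing helper that is used for both the string count and every cumulative end offset. This is an illustration only: `to_header_element` is a hypothetical name, and an exception stands in for `OPENVINO_ASSERT`.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

using header_element_t = std::int32_t;

// Converts a size_t to a header element, refusing values that int32_t cannot hold
// instead of silently narrowing them.
header_element_t to_header_element(std::size_t value) {
    if (value > static_cast<std::size_t>(std::numeric_limits<header_element_t>::max())) {
        throw std::out_of_range("value exceeds int32_t range");
    }
    return static_cast<header_element_t>(value);
}
```

Routing every header write through one helper keeps the range check next to the cast, so a later write cannot skip it by accident.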
@bumbosiepsak bumbosiepsak force-pushed the CVS-182172-heap-out-of-bounds-read-in-ir-string-constant branch from 2f32e86 to 52cd61e Compare March 13, 2026 16:39
@mryzhov mryzhov requested a review from Copilot March 16, 2026 08:57
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

using header_element_t = int32_t; // Type of a single element in the header (strings_count and offsets)

static_assert(sizeof(header_element_t) <= sizeof(size_t),
"Header element type must be able to represent offsets and number of strings as size_t");

/// @brief Test case for string offset exceeding buffer bounds in packed string tensor.
/// Expecting AssertFailure with message about string offset exceeds buffer bounds.
/// num_strings = 1, header: [1, 0, end0=10], but buffer too small
"Incorrect packed string tensor format: negative string offset in the packed string tensor");

OPENVINO_ASSERT(begin_signed <= end_signed,
"Incorrect packed string tensor format: begin offset greater than end offset");
Contributor

                        "Incorrect packed string tensor format: begin offset greater than end offset");

The text "begin offset greater than end offset" gives no extra information, because the condition itself is serialized into the assert message.
Optional: always keep the number of asserts and the assert messages as small as possible (while still explaining the error), so that nothing is added that is not usually used.


} // namespace

namespace ov {
Contributor

Suggested change
- namespace ov {
+ namespace ov::test {

Comment on lines +10 to +12
using header_element_t = int32_t; // Type of a single element in the header (strings_count and offsets)

namespace {
Contributor

Move everything inside ov::test.

/// Expecting AssertFailure with message about begin offset greater than end offset.
/// num_strings = 2, header: [2, 0, end0=-3, end1=5]
TEST(StringUnpackTensorTest, NegativeOffsetsFails) {
using testing::HasSubstr;
Contributor

This using-declaration can be declared once for the whole file.

Comment on lines +50 to +52
/// @brief Test case for missing number of strings in packed string tensor.
/// Expecting AssertFailure with message about missing strings count.
/// num_strings = <missing>
Contributor

These comments are not required.

@evkotov evkotov self-requested a review March 16, 2026 14:02
Contributor

@evkotov evkotov left a comment

LGTM

Labels

category: Core OpenVINO Core (aka ngraph) pr: needs tests PR needs tests updating

4 participants