Conversation

@om4rrr om4rrr commented Jan 4, 2026

Details (Fix + Tests)

Fix

  • Fixes a segfault in PyOpenVINO when constructing openvino.op.Constant from NumPy arrays with string-like dtypes (dtype.kind in {'U','S','O'}).
  • Adds a safe string-handling path in src/bindings/python/src/pyopenvino/graph/ops/constant.cpp:
    • detect array.dtype.kind in U/S/O
    • reject shared_memory=True (not supported for string constants)
    • preserve shape from the NumPy array
    • flatten values via ravel().tolist() and convert to std::vector<std::string>
    • create ov::op::v0::Constant(ov::element::string, shape, values)
  • Keeps the existing fast path unchanged for non-string dtypes.
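
The string path listed above can be modeled in plain Python to make the intended behavior concrete. This is only a sketch under stated assumptions: `infer_shape`, `flatten`, and `string_constant_inputs` are hypothetical names, nested lists stand in for the NumPy array, and the real binding does this work in C++ through pybind11.

```python
def infer_shape(data):
    """Return the shape of a rectangular nested list (scalars have shape ())."""
    if not isinstance(data, (list, tuple)):
        return ()
    if len(data) == 0:
        return (0,)
    inner = infer_shape(data[0])
    for row in data[1:]:
        if infer_shape(row) != inner:
            raise ValueError("ragged input: rows have different shapes")
    return (len(data),) + inner


def flatten(data):
    """Flatten in row-major order, the order ravel().tolist() uses by default."""
    if not isinstance(data, (list, tuple)):
        return [data]
    flat = []
    for row in data:
        flat.extend(flatten(row))
    return flat


def string_constant_inputs(data, shared_memory=False):
    """Hypothetical model of the string path: returns (shape, flat values)."""
    if shared_memory:
        # Mirrors the rejection described in the PR.
        raise RuntimeError("shared_memory=True is not supported for string constants")
    return infer_shape(data), flatten(data)


shape, values = string_constant_inputs([["a", "b", "c"], ["d", "e", "f"]])
print(shape, values)
```

The shape is computed separately from the flat value list, which is what allows the constant to be rebuilt with its original dimensions.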

Tests

  • Reproducer (before: segfault; after: works):
    import numpy as np
    from openvino.op import Constant
    
    Constant(np.array(["a", "b"], dtype=np.str_))   # unicode ('U')
    Constant(np.array([b"a", b"b"], dtype="S1"))    # bytes ('S')
    Constant(np.array(["a", None], dtype=object))   # object ('O')

Copilot AI review requested due to automatic review settings January 4, 2026 15:15
@om4rrr om4rrr requested a review from a team as a code owner January 4, 2026 15:15
@github-actions github-actions bot added the "category: Python API" label (OpenVINO Python bindings) Jan 4, 2026
@sys-openvino-ci sys-openvino-ci added the "ExternalPR" label (External contributor) Jan 4, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a segfault in PyOpenVINO when creating openvino.op.Constant from NumPy string arrays by adding a dedicated handling path for string-like dtypes ('U', 'S', 'O') in the Constant constructor.

Key Changes

  • Adds string dtype detection based on dtype.kind before attempting to create Constants
  • Implements safe conversion path using NumPy's ravel/tolist to extract string values
  • Rejects shared_memory=True for string constants (not supported)


std::vector<std::string> values;
values.reserve(static_cast<size_t>(py::len(list_values)));
for (py::handle item : list_values) {

Copilot AI Jan 4, 2026

Casting None values from object arrays (dtype='O') to std::string will likely fail or produce unexpected behavior. The PR description mentions handling np.array(["a", None], dtype=object), but the code doesn't explicitly handle None values. Consider adding a check that either rejects None values with a clear error message or converts them to empty strings.

Suggested change
- for (py::handle item : list_values) {
+ for (py::handle item : list_values) {
+     if (item.is_none()) {
+         throw std::runtime_error(
+             "None values are not supported in string constants. "
+             "Please replace None with an empty string or a valid string value.");
+     }

Copilot uses AI. Check for mistakes.
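
Until the binding settles on a policy for None, a caller-side pre-check along these lines could catch the problem before the array reaches Constant. This is a minimal sketch: `check_string_items` is a hypothetical helper, not part of OpenVINO, and it assumes the items are what `ravel().tolist()` would return, that is, plain Python objects.

```python
def check_string_items(items):
    """Return the items as a list, raising if any item is not str or bytes."""
    checked = []
    for index, item in enumerate(items):
        if item is None:
            # Report the position so the caller can locate the bad element.
            raise ValueError(
                f"item {index} is None; replace it with '' or a valid string"
            )
        if not isinstance(item, (str, bytes)):
            raise TypeError(
                f"item {index} has unsupported type {type(item).__name__}"
            )
        checked.append(item)
    return checked


print(check_string_items(["a", b"b"]))
try:
    check_string_items(["a", None])
except ValueError as exc:
    print(exc)
```

Object arrays can also hold values that are neither strings nor None (integers, for example), so the sketch rejects those separately with a TypeError.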
Comment on lines +66 to +68
// dtype.kind: 'U' unicode, 'S' bytes, 'O' object
const std::string kind = array.attr("dtype").attr("kind").cast<std::string>();
const bool is_string_like = (kind == "U" || kind == "S" || kind == "O");

Copilot AI Jan 4, 2026

The dtype kind check only covers 'U', 'S', and 'O', but NumPy also has 'a' for byte strings (np.bytes_). Consider including 'a' in the string-like check to ensure comprehensive coverage of all string-related dtype kinds.

Suggested change
- // dtype.kind: 'U' unicode, 'S' bytes, 'O' object
- const std::string kind = array.attr("dtype").attr("kind").cast<std::string>();
- const bool is_string_like = (kind == "U" || kind == "S" || kind == "O");
+ // dtype.kind: 'U' unicode, 'S'/'a' bytes, 'O' object
+ const std::string kind = array.attr("dtype").attr("kind").cast<std::string>();
+ const bool is_string_like = (kind == "U" || kind == "S" || kind == "a" || kind == "O");
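
For context, the NumPy documentation lists 'S' as the dtype.kind code for byte strings and 'U' for Unicode. As far as that documentation describes, 'a' is only a legacy type-code alias accepted when constructing a dtype and is not a value that dtype.kind reports, so the extra comparison looks harmless but probably unreachable. The sketch below expresses the check as a lookup table; `STRING_LIKE_KINDS` and `is_string_like` are illustrative names, not identifiers from the PR.

```python
# Kind codes taken from the PR: 'U' unicode, 'S' bytes, 'O' object.
STRING_LIKE_KINDS = {"U": "unicode", "S": "bytes", "O": "object"}


def is_string_like(kind):
    """Return True for the dtype.kind values the string path handles."""
    return kind in STRING_LIKE_KINDS


# With NumPy installed, the kinds could be read from real dtypes, e.g.
#   np.dtype("S1").kind, np.dtype(np.str_).kind, np.dtype(object).kind
for kind in ("U", "S", "O", "i", "f"):
    print(kind, is_string_like(kind))
```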

}

// Convert to flat std::vector<std::string>
py::object np = py::module_::import("numpy");

Copilot AI Jan 4, 2026

Importing numpy repeatedly inside the constructor can impact performance when creating many Constants. Consider caching the numpy module import at file/function scope or using py::module::import_static for better performance.

Suggested change
- py::object np = py::module_::import("numpy");
+ static py::object np = py::module_::import("numpy");

if (is_string_like) {
if (shared_memory) {
throw std::runtime_error("shared_memory=True is not supported for string constants");

Copilot AI Jan 4, 2026

The error message could be more informative by explaining why shared_memory is not supported for string constants and what the user should do. Consider updating to: "shared_memory=True is not supported for string constants (unicode, bytes, or object dtypes). String data must be copied. Please use shared_memory=False or omit the parameter."

Suggested change
- throw std::runtime_error("shared_memory=True is not supported for string constants");
+ throw std::runtime_error(
+     "shared_memory=True is not supported for string constants (unicode, bytes, or object "
+     "dtypes). String data must be copied. Please use shared_memory=False or omit the "
+     "parameter.");

py::buffer_protocol());
constant.doc() = "openvino.op.Constant wraps ov::op::v0::Constant";
// Numpy-based constructor
// Numpy-based constructor (handles string dtype safely)

Copilot AI Jan 4, 2026

This comment update is inconsistent with the actual implementation. The comment says "handles string dtype safely" but it's a constructor for all dtypes. Consider using a more accurate comment like "Numpy-based constructor with string dtype support" or moving the string-specific comment inside the string handling block.

Suggested change
- // Numpy-based constructor (handles string dtype safely)
+ // Numpy-based constructor with string dtype support

Comment on lines +97 to +99
// Non-string path: existing behavior
return std::make_shared<ov::op::v0::Constant>(
Common::object_from_data<ov::op::v0::Constant>(array, shared_memory));

Copilot AI Jan 4, 2026

The wrapping of object_from_data result in std::make_shared is inefficient. Common::object_from_data returns a Constant by value, which is then copy/move constructed into a shared_ptr. Consider directly returning the result without the extra std::make_shared wrapper, or better yet, wrap the return value in std::make_shared by moving: return std::make_shared<ov::op::v0::Constant>(std::move(Common::object_from_data<ov::op::v0::Constant>(array, shared_memory))). However, the simplest approach is to return it directly as was done originally.

Suggested change
- // Non-string path: existing behavior
- return std::make_shared<ov::op::v0::Constant>(
-     Common::object_from_data<ov::op::v0::Constant>(array, shared_memory));
+ // Non-string path: existing behavior, but move into make_shared to avoid an extra copy
+ return std::make_shared<ov::op::v0::Constant>(
+     std::move(Common::object_from_data<ov::op::v0::Constant>(array, shared_memory)));

Comment on lines +70 to +95
if (is_string_like) {
    if (shared_memory) {
        throw std::runtime_error("shared_memory=True is not supported for string constants");
    }

    // Shape
    ov::Shape shape;
    shape.reserve(static_cast<size_t>(array.ndim()));
    for (ssize_t i = 0; i < array.ndim(); ++i) {
        shape.push_back(static_cast<size_t>(array.shape(i)));
    }

    // Convert to flat std::vector<std::string>
    py::object np = py::module_::import("numpy");
    py::object raveled = np.attr("ravel")(array);
    py::list list_values = py::cast<py::list>(raveled.attr("tolist")());

    std::vector<std::string> values;
    values.reserve(static_cast<size_t>(py::len(list_values)));
    for (py::handle item : list_values) {
        // Accept python str/bytes and numpy scalar string types
        values.push_back(py::cast<std::string>(item));
    }

    return std::make_shared<ov::op::v0::Constant>(ov::element::string, shape, values);
}

Copilot AI Jan 4, 2026

This new string constant functionality lacks test coverage in test_constant.py. While the PR description mentions it fixes a segfault, there should be tests verifying that creating Constants from np.str_, np.bytes_, and object dtype arrays now works correctly. Consider adding tests that verify shape preservation, value correctness, the shared_memory=True error case, and handling of various edge cases like empty arrays and None values in object arrays.

Contributor

@almilosz almilosz left a comment

Hello, please add tests

Labels

category: Python API (OpenVINO Python bindings)
ExternalPR (External contributor)

Development

Successfully merging this pull request may close these issues.

[Good First Issue][Python API]: Create constant for string tensor and fix segfault

3 participants