perf: [durable_buffer] Avoid unnecessary copy of OTLP bytes in OtlpBytesAdapter::new() #2703

@AaronRM

Pre-filing checklist

  • I searched existing issues and didn't find a duplicate

Component(s)

Rust OTAP dataflow (rust/otap-dataflow/)

Description

In OtlpBytesAdapter::new(), BinaryArray::from_vec(vec![bytes.as_bytes()]) deep-copies the entire OTLP payload into a new Arrow allocation. The original OtlpProtoBytes (wrapping an Arc-backed bytes::Bytes) is also retained for NACK recovery, making this copy redundant.
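To make the cost difference concrete, here is a minimal std-only sketch. It is not the project's code: `Arc<[u8]>` stands in for the Arc-backed `bytes::Bytes`, `Vec<u8>` stands in for the new Arrow allocation, and the function names `deep_copy` and `share` are hypothetical.

```rust
use std::sync::Arc;

/// Stand-in for the current path: copies every payload byte into a
/// freshly allocated buffer (cost grows with payload size).
fn deep_copy(payload: &Arc<[u8]>) -> Vec<u8> {
    payload.to_vec()
}

/// Stand-in for the zero-copy path: bumps the reference count and
/// returns a second handle to the same allocation (constant cost).
fn share(payload: &Arc<[u8]>) -> Arc<[u8]> {
    Arc::clone(payload)
}
```

Calling `deep_copy` yields bytes at a different address than the original, while `share` yields a handle for which `Arc::ptr_eq` against the original is true. Since the original payload is kept alive for NACK recovery anyway, the second allocation buys nothing.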

Proposed Fix

Arrow's Buffer supports zero-copy wrapping of bytes::Bytes via impl From<bytes::Bytes> for Buffer. Construct the BinaryArray from a zero-copy Buffer instead:

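use arrow::array::{ArrayData, BinaryArray};
use arrow::buffer::Buffer;
use arrow::datatypes::DataType;

let data_bytes: bytes::Bytes = bytes.as_bytes().clone(); // Arc refcount bump, no copy
let len = data_bytes.len();
let data_buffer = Buffer::from(data_bytes);              // zero-copy wrap
// Binary uses i32 offsets: one value spanning [0, len). Assumes len <= i32::MAX.
let offsets = Buffer::from_slice_ref([0i32, len as i32]);

// Buffer order matters for Binary: offsets first, then value data.
let array_data = ArrayData::builder(DataType::Binary)
    .len(1)
    .add_buffer(offsets)
    .add_buffer(data_buffer)
    .build()?;

let binary_array = BinaryArray::from(array_data);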

This is safe because Buffer::from(bytes::Bytes) takes ownership and stores the Bytes as an Arc<dyn Allocation>, keeping the data alive as long as the RecordBatch exists. The Bytes::clone() is just an atomic refcount bump.
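One detail worth guarding when building the offsets by hand: Arrow's Binary type uses 32-bit offsets, so `len as i32` would silently wrap for a payload larger than `i32::MAX` bytes (LargeBinary, with 64-bit offsets, exists for that case). A minimal std-only sketch of a checked conversion follows; the function name `single_value_offsets` is hypothetical, and how the adapter should react to an oversized payload is left open.

```rust
/// Builds the two-entry offsets array `[0, len]` for a single-value
/// Binary column. Returns `None` when `len` does not fit in an i32
/// offset instead of wrapping the way `len as i32` would.
fn single_value_offsets(len: usize) -> Option<[i32; 2]> {
    let end = i32::try_from(len).ok()?;
    Some([0, end])
}
```

The returned array could then be passed to `Buffer::from_slice_ref` in place of the unchecked cast.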

Impact

Eliminates a full memcpy of every OTLP payload on the ingest hot path, reducing allocation pressure and CPU time for large messages.

Location

rust/otap-dataflow/crates/core-nodes/src/processors/durable_buffer_processor/bundle_adapter.rs, line 352.

Thanks to @utpilla for discovering this and reporting offline. 👍

Additional Context

No response
