[PERFORMANCE]: Optimize stream parser buffer management (O(n²) → O(n)) #1613

@crivetimihai

Description

Summary

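The stream parsers in mcpgateway/wrapper.py grow a string buffer with concatenation (+=) and truncate it by slicing inside tight loops. Each of those operations can copy the whole buffer, so the total work is O(n²) in the worst case, where n is the number of buffered characters. For large streaming responses with many chunks or many lines per chunk, this causes significant performance degradation.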

Impact

  • CPU Overhead: Each buffer += text can allocate a new string object and copy all existing content (CPython avoids the copy only in some cases)
  • Memory Churn: Frequent allocations and deallocations increase GC pressure
  • Latency: Large SSE/NDJSON streams become progressively slower as buffer grows

Operation         Current   Optimized
Append chunk      O(n)      O(1) amortized
Slice buffer      O(n)      O(1)
Total per chunk   O(n)      O(1) amortized

Affected Code

File: mcpgateway/wrapper.py

ndjson_lines() (Lines 269-303)

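buffer = ""
async for chunk in resp.aiter_bytes():
    # ... chunk is decoded into text here (elided) ...
    buffer += text  # Line 291: O(n) - creates new string!
    while True:
        nl_idx = buffer.find("\n")  # O(n) search
        # ... leave the loop when no newline is found, otherwise yield the line (elided) ...
        buffer = buffer[nl_idx + 1:]  # Line 297: O(n) - creates new string!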

sse_events() (Lines 306-364)

Same pattern at lines 324, 330, 348.

Proposed Fix

Use io.StringIO or bytearray for O(1) amortized append:

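import codecs
import io
from typing import AsyncIterator

import httpx

async def ndjson_lines(resp: httpx.Response) -> AsyncIterator[str]:
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")  # assumed; not shown in the excerpt above
    buffer = io.StringIO()
    async for chunk in resp.aiter_bytes():
        buffer.write(decoder.decode(chunk))  # O(1) amortized
        # Process complete lines...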
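
The bytearray option mentioned above could look roughly like the sketch below. It is a synchronous illustration rather than the code in mcpgateway/wrapper.py: the name iter_ndjson_lines and the plain iterable of byte chunks are assumptions made so the example runs on its own, and the same loop body would sit under `async for chunk in resp.aiter_bytes()`. Consumed bytes are dropped once per chunk instead of once per line, and each newline search starts where the previous one stopped.

```python
from typing import Iterable, Iterator


def iter_ndjson_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield non-empty, stripped lines from byte chunks (hypothetical helper)."""
    buffer = bytearray()
    for chunk in chunks:
        # The tail left over from earlier chunks is known to hold no newline,
        # so searching can start at its end.
        search_from = len(buffer)
        buffer.extend(chunk)  # amortized O(1) per appended byte
        start = 0
        while (nl_idx := buffer.find(b"\n", search_from)) != -1:
            # Splitting on the newline byte is safe for UTF-8: 0x0A never
            # occurs inside a multi-byte sequence, so each line decodes whole.
            line = buffer[start:nl_idx].decode("utf-8", errors="replace").strip()
            if line:
                yield line
            start = search_from = nl_idx + 1
        # Remove everything consumed in this chunk with a single deletion.
        del buffer[:start]
    # Emit a final line that was not newline-terminated.
    tail = buffer.decode("utf-8", errors="replace").strip()
    if tail:
        yield tail
```

With this shape the buffer only ever holds the unfinished tail of the current line, so its size is bounded by the longest line plus one chunk rather than by the stream length.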

Or use split-based approach:

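import codecs
from typing import AsyncIterator
import httpx

async def ndjson_lines(resp: httpx.Response) -> AsyncIterator[str]:
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")  # assumed; not shown in the excerpt above
    partial_line = ""
    async for chunk in resp.aiter_bytes():
        lines = (partial_line + decoder.decode(chunk)).split('\n')
        partial_line = lines.pop()  # Keep incomplete line
        for line in lines:
            if line.strip():
                yield line.strip()
    tail = (partial_line + decoder.decode(b"", final=True)).strip()  # Flush a last unterminated line
    if tail:
        yield tail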
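
The same split-based idea should carry over to sse_events(), where events end at a blank line instead of at every newline. The sketch below is an assumption about how that could look, not the existing implementation: iter_sse_data is a hypothetical name, it takes a plain iterable of byte chunks so it can run standalone, it only collects data: fields, and it does not handle bare "\r" line endings.

```python
import codecs
from typing import Iterable, Iterator, List


def iter_sse_data(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    partial = ""  # unfinished line carried over between chunks
    data_lines: List[str] = []  # data fields of the event being assembled
    for chunk in chunks:
        lines = (partial + decoder.decode(chunk)).split("\n")
        partial = lines.pop()  # keep the incomplete last line
        for line in lines:
            line = line.rstrip("\r")  # tolerate CRLF line endings
            if line == "":
                # A blank line terminates the current event.
                if data_lines:
                    yield "\n".join(data_lines)
                    data_lines = []
            elif line.startswith("data:"):
                value = line[5:]
                # A single leading space after the colon is not part of the value.
                data_lines.append(value[1:] if value.startswith(" ") else value)
            # Comments and other fields (event:, id:, retry:) are ignored here.
    # An event still pending at end of stream is dropped, following the usual
    # SSE convention that incomplete events are not dispatched.
```

Only the unfinished line and the fields of the event in progress are retained between chunks, so memory stays proportional to the largest single event.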

Acceptance Criteria

  • No string concatenation (+=) in buffer loops
  • No string slicing for buffer truncation
  • Memory usage bounded by the largest single line or event, regardless of total stream size
  • Existing SSE/NDJSON tests pass
  • Performance improvement measurable for large streams
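
One way to check the last two criteria is a small harness that runs the old and new parsing strategies over identical input, compares their output, and times each. The sketch below is illustrative only: parse_concat, parse_split and time_parser are hypothetical stand-ins that work on already-decoded text chunks, not the functions in mcpgateway/wrapper.py, and the measured ratio will depend on the interpreter and input shape.

```python
import time
from typing import Callable, Iterable, Iterator, List


def parse_concat(chunks: Iterable[str]) -> Iterator[str]:
    """Baseline: grow one buffer with += and re-slice it after every line."""
    buffer = ""
    for text in chunks:
        buffer += text
        while True:
            nl_idx = buffer.find("\n")
            if nl_idx == -1:
                break
            line = buffer[:nl_idx].strip()
            buffer = buffer[nl_idx + 1:]  # copies the whole remainder
            if line:
                yield line


def parse_split(chunks: Iterable[str]) -> Iterator[str]:
    """Candidate: split each chunk once and carry only the unfinished tail."""
    partial = ""
    for text in chunks:
        lines = (partial + text).split("\n")
        partial = lines.pop()
        for line in lines:
            if line.strip():
                yield line.strip()


def time_parser(parser: Callable[[Iterable[str]], Iterator[str]],
                chunks: List[str]) -> float:
    """Return the seconds taken to drain the parser over the given chunks."""
    start = time.perf_counter()
    for _ in parser(chunks):
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    # One large chunk holding many short lines stresses the re-slicing path.
    chunks = ["".join('{"i": %d}\n' % i for i in range(20_000))]
    assert list(parse_concat(chunks)) == list(parse_split(chunks))
    print("concat:", time_parser(parse_concat, chunks))
    print("split: ", time_parser(parse_split, chunks))
```

Both functions drop an unterminated final line so that they stay comparable; a real replacement would also need to flush that tail.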

Labels: performance
