Stream reader with manual memory management? #100

Open
@ubruhin

Description

Hi,
I really like how well this library is developed and might want to use it in a project. However, I'm not sure if the node reader API can be used the way I'd like. To my understanding, the API is able to:

  • let the user handle memory management: with mpack_tree_init_pool(), the library does not allocate any memory by itself. Both the input data and the node memory are allocated by the user.
  • parse a continuous stream: with mpack_tree_init_stream() I can use the library to process a TCP stream of MessagePack data.

However, it seems these two features cannot be combined: the stream reader always allocates its own buffers for both the input data and the nodes, so I cannot manage the memory myself. Especially in environments where memory fragmentation needs to be minimized, that's a real drawback.

Actually, it would be good enough for me to allocate a sufficiently large buffer for the data and the nodes once, which would then be used as-is without any reallocation (memory is available, but should not fragment too much at runtime). This is probably (theoretically) possible even with mpack_tree_init_stream(), but there's still a problem: the documentation says the reader cannot be reset if it is in an error state. In that case I have to destroy the reader and create a new one, which reallocates (and possibly fragments) the memory.

Is my understanding correct? Or is there a way to recover from a stream reader error without any memory reallocation?

By the way, it is also a bit cumbersome that reading the data has to go through a callback. I understand that this works perfectly for reading directly from a blocking TCP socket. But if the data is received by some other software component and then forwarded in blocks to some kind of "data processor", it is very awkward. It would be much simpler if I could allocate my own data buffer and let the parser process it block by block.

Example (simplified):

```c
char buffer[1024];
size_t bufferSize = 0;
mpack_tree_t tree;

void init() {
    // let the stream parser know the start address of the data
    mpack_tree_init_stream(&tree, &buffer[0], ...);
}

void processTcpData(const char* data, size_t len) {
    // append the new data to the existing data
    memcpy(&buffer[bufferSize], data, len);
    bufferSize += len;

    // continue parsing (only the new data)
    mpack_tree_add_data(&tree, len);

    // process all new objects from the start of the buffer
    while (mpack_tree_parse(&tree)) {
        mpack_node_t node = mpack_tree_root(&tree);
        processNode(node);
    }

    // remove the processed data from the buffer, moving any
    // remaining data to the buffer start
    size_t processedBytes = mpack_tree_remove_processed_data(&tree);
    bufferSize -= processedBytes;
}
```

Maybe the example is not perfectly correct, but I hope you get the idea 🙂 Such an API would be very useful to me. Do you know whether something like that is possible, either with MPack or with any other C/C++ library?

Thanks in advance.
