
Write multiple chunks concurrently, split by topic and/or schema #2220

@mrkbac

Description


More efficient reading

Currently, chunks are created based on the size of uncompressed bytes.
This means that if you have two topics - /large_topic and /small_topic - both publishing at 10 Hz, each chunk will contain messages from both topics.

If you later want to read only /small_topic, you still need to decompress all chunks, even though you only require a small subset of the data.

By splitting chunks by topic, we can place /large_topic and /small_topic into separate, time-overlapping chunks.
Readers then only need to decompress the chunks containing /small_topic, which significantly improves read efficiency for selective topic access.
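As a rough sketch of what this could look like on the writer side (plain Python, not the real MCAP writer API - the ChunkGroup class, flush_chunk callback, and size threshold are illustrative assumptions), the writer keeps one chunk buffer per topic and flushes each buffer independently:

```python
# A minimal sketch of per-topic chunk buffering (plain Python, no MCAP
# library calls). ChunkGroup, flush_chunk and the size threshold are
# hypothetical names for illustration only.
from dataclasses import dataclass, field

CHUNK_SIZE_THRESHOLD = 4 * 1024 * 1024  # flush a group once it buffers ~4 MiB uncompressed


@dataclass
class ChunkGroup:
    topic: str
    messages: list = field(default_factory=list)
    uncompressed_size: int = 0


class TopicSplitWriter:
    def __init__(self, flush_chunk):
        # flush_chunk(topic, messages) stands in for whatever code compresses
        # the buffered messages and appends one chunk record to the file.
        self._flush_chunk = flush_chunk
        self._groups = {}

    def add_message(self, topic: str, payload: bytes) -> None:
        group = self._groups.setdefault(topic, ChunkGroup(topic))
        group.messages.append(payload)
        group.uncompressed_size += len(payload)
        # Each topic fills its own chunk, so /large_topic and /small_topic end
        # up in separate (time-overlapping) chunks instead of sharing one.
        if group.uncompressed_size >= CHUNK_SIZE_THRESHOLD:
            self._flush_chunk(topic, group.messages)
            self._groups[topic] = ChunkGroup(topic)

    def close(self) -> None:
        # Flush whatever is still buffered when the recording ends.
        for topic, group in self._groups.items():
            if group.messages:
                self._flush_chunk(topic, group.messages)
```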

More efficient writing

Currently, chunks that fail to compress by at least 2% are stored uncompressed.
This often happens for data types whose payloads are already compressed - for example, sensor_msgs/msg/CompressedImage.

By splitting chunks by schema, we can separate messages like sensor_msgs/msg/CompressedImage into their own uncompressed chunks,
while grouping all other message types into compressed chunks.

This approach improves overall compression ratios for the compressed data,
while avoiding wasted CPU cycles trying to re-compress already-compressed payloads.
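A rough sketch of how that routing decision could be made, under the same assumptions as the snippet above; the ALREADY_COMPRESSED_SCHEMAS set and the group_key helper are hypothetical names, and the exact grouping policy (per topic, per schema, or both) is left open here:

```python
# Hypothetical routing rule: messages whose schema is known to carry
# already-compressed payloads go into their own uncompressed chunk groups;
# everything else shares compressed groups.
ALREADY_COMPRESSED_SCHEMAS = {
    "sensor_msgs/msg/CompressedImage",
}


def group_key(topic: str, schema_name: str) -> tuple[str, bool]:
    """Return (group key, compress?) for the chunk group a message belongs to."""
    if schema_name in ALREADY_COMPRESSED_SCHEMAS:
        # Own group, stored uncompressed: skips a compression attempt that
        # would fail the 2% threshold anyway.
        return (f"uncompressed:{schema_name}", False)
    # All other message types are grouped and compressed, keyed by topic here.
    return (f"compressed:{topic}", True)
```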

Implementation Notes / Suggestions

  • To prevent data loss, a maximum duration per chunk group could be enforced, after which the chunk group is flushed to disk regardless of size.
  • To prevent excessive RAM usage, an overall maximum buffer size could be enforced; when it is exceeded, the least recently used chunk group is flushed to disk (both safeguards are sketched below).
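A self-contained sketch of both safeguards, assuming each buffered group tracks its size and the times of its first and most recent messages; the limit values, BufferedGroup fields, and flush_chunk callback are illustrative, not a proposed API:

```python
# Illustrative flush policies: a per-group age limit and an overall memory
# cap with least-recently-used eviction. All names and limits are assumptions.
import time
from dataclasses import dataclass, field

MAX_GROUP_AGE_SEC = 10.0                   # flush any group buffered longer than this
MAX_TOTAL_BUFFER_BYTES = 64 * 1024 * 1024  # cap on all buffered groups combined


@dataclass
class BufferedGroup:
    messages: list = field(default_factory=list)
    uncompressed_size: int = 0
    first_message_time: float = 0.0  # monotonic time when the group was started
    last_message_time: float = 0.0   # monotonic time of the most recent message


def enforce_limits(groups, flush_chunk, now=None):
    now = time.monotonic() if now is None else now

    # 1. Time-based flush: bounds how much buffered data a crash could lose.
    for key, group in list(groups.items()):
        if group.messages and now - group.first_message_time >= MAX_GROUP_AGE_SEC:
            flush_chunk(key, group.messages)
            del groups[key]

    # 2. Memory cap: flush least-recently-used groups until back under the limit.
    total = sum(g.uncompressed_size for g in groups.values())
    while total > MAX_TOTAL_BUFFER_BYTES and groups:
        key, group = min(groups.items(), key=lambda kv: kv[1].last_message_time)
        flush_chunk(key, group.messages)
        total -= group.uncompressed_size
        del groups[key]
```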
