random access recording reader #7945

Open
@egorchakov

Is your feature request related to a problem? Please describe.
I'd like to be able to random-access RecordingView rows. Currently, RecordingView.select returns a streaming pyarrow.RecordBatchReader, which can only be consumed sequentially.

Describe the solution you'd like
Exposing a random access reader (e.g. pyarrow.ipc.RecordBatchFileReader). Might be related to #6498?

Additional context
We recently introduced rrd support to rbyte to enable ML model training on .rrd files. During training, samples may be requested in random order. To enable random access for frame/image data, we currently hold the selected columns in memory, which can become prohibitively expensive for large recordings. Ideally, we'd probably use a random access reader on a mmap'd recording or something along those lines.

Labels

enhancement: New feature or request
feat-dataframe-api: Everything related to the dataframe API
