Improvements to send_dataframe API #8619

Open
@jleibs

The send_dataframe API depends on arrow metadata tags to figure out the type of each column.

However, this adds an extra step for the user: converting to a metadata-preserving format and then manually applying all of the correct tags.

We should try to reduce this friction where possible.

Better timeline inference

Unlike arrow/polars/datafusion dataframes, which are pure tables with uniform columns, Pandas dataframes have a concrete index, which we could always map to a timeline of the corresponding name.
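One possible reading of this rule, as a minimal sketch: every named index level becomes a timeline of the same name, and an unnamed level is rejected rather than guessed. The helper `infer_timelines` is hypothetical and not part of the Rerun SDK. It works on plain lists of names so the rule itself is visible; with pandas, the inputs would be `list(df.index.names)` and `list(df.columns)`.

```python
from typing import List, Optional, Sequence, Tuple


def infer_timelines(
    index_names: Sequence[Optional[str]],
    column_names: Sequence[str],
) -> Tuple[List[str], List[str]]:
    """Split dataframe labels into (timeline names, data column names).

    Hypothetical helper, not part of the Rerun SDK. With pandas the inputs
    would be list(df.index.names) and list(df.columns).
    """
    timelines: List[str] = []
    for level, name in enumerate(index_names):
        # pandas reports None for an index level that was never named.
        if name is None:
            raise ValueError(
                f"index level {level} has no name, so no timeline can be inferred"
            )
        timelines.append(str(name))

    # A timeline sharing a name with a data column would be ambiguous.
    clashes = sorted(set(timelines) & set(column_names))
    if clashes:
        raise ValueError(f"index names clash with column names: {clashes}")

    return timelines, list(column_names)


timelines, data_columns = infer_timelines(["frame"], ["pos", "color"])
```

Rejecting an unnamed index is only one option. Falling back to a default timeline name would also be reasonable, but it hides mistakes such as a forgotten `set_index` call.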

Entity/Component Tagging

As for the other columns, it would still be helpful to provide some way of informing Rerun of the entity/component for each column, especially where augmenting the arrow metadata may be non-trivial.

This could maybe look something like:

df_components = [rr.Position3D, rr.Colors, "user.Confidence"]

rr.send_dataframe(df, components=df_components)

where the length of the components array must match the number of columns in the dataframe.

Or maybe with an object-helper similar to AnyValues:

rr.send_dataframe(rr.TaggedDataframe(df, df_components))
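Either variant needs the same validation step. A minimal sketch of it follows, pairing each column with a component name and failing early on a length mismatch. Everything in it is an assumption for illustration: `pair_columns_with_components` is a hypothetical helper, and `Position3D` and `Color` are local stand-in classes rather than the real Rerun types, so that the snippet runs on its own.

```python
from typing import Dict, Sequence, Union


class Position3D:
    """Stand-in for a Rerun component type (assumption, not the real class)."""


class Color:
    """Stand-in for a Rerun component type (assumption, not the real class)."""


def pair_columns_with_components(
    column_names: Sequence[str],
    components: Sequence[Union[type, str]],
) -> Dict[str, str]:
    """Map each column name to a component name, in positional order.

    Hypothetical helper: a component is either a class (its __name__ is used)
    or a plain string for user-defined components.
    """
    if len(components) != len(column_names):
        raise ValueError(
            f"got {len(components)} components for {len(column_names)} columns"
        )
    tags: Dict[str, str] = {}
    for column, component in zip(column_names, components):
        tags[column] = component if isinstance(component, str) else component.__name__
    return tags


tags = pair_columns_with_components(
    ["pos", "color", "confidence"],
    [Position3D, Color, "user.Confidence"],
)
```

If the pandas index is mapped to a timeline, it would presumably not count toward the number of columns here, since it is not a regular column of the dataframe.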
