Dataframe .select() API needs some way to error on missing columns #8463

Open
@jleibs

Description

When we select a column from the dataframe view, we previously made the decision that a missing column would still appear in the results, but be implicitly null.

The rationale for this decision was that in many situations, Rerun data producers may not even create a column if something is missing. For example: maybe a detector simply never logs anything if nothing is detected. In these cases, a query-writer reasonably wants a null column to exist.

However, the far, far more common case is that someone exploring the data and writing a test script has made a typo. If a user doesn't actually know what is in the dataset, they may think this is validly null data, as opposed to realizing they aren't querying the right content.

We should instead default to raising an error on a missing column, and then provide a mechanism for advanced users to inject any missing-but-required columns into the view only in situations where it makes sense to do so.
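
To make the proposal concrete, here is a minimal sketch of the intended semantics in plain Python. It is not Rerun's API: `select`, `MissingColumnError`, the `fill_missing` parameter, and the dict-of-lists table are all hypothetical stand-ins chosen for illustration.

```python
class MissingColumnError(KeyError):
    """Hypothetical error raised when a requested column is absent."""


def select(table, columns, fill_missing=()):
    """Select columns from a dict-of-lists table, erroring on unknown names.

    `fill_missing` is a hypothetical opt-in: names listed there are allowed
    to be absent and are injected as all-null columns instead of raising.
    """
    # Row count taken from any existing column (0 for an empty table).
    n_rows = len(next(iter(table.values()), []))
    allowed = set(fill_missing)

    # Strict by default: any absent column that was not explicitly opted in
    # is reported, along with the names that do exist, to surface typos.
    missing = [c for c in columns if c not in table and c not in allowed]
    if missing:
        raise MissingColumnError(
            f"columns not found: {missing}; available: {sorted(table)}"
        )

    # Opted-in absent columns become null columns of the right length.
    return {c: table.get(c, [None] * n_rows) for c in columns}
```

With a table holding only `frame` and `image` columns, `select(table, ["frame", "detectons"])` would raise and list the available names, so the typo is caught immediately. A producer that may legitimately never log detections could instead call `select(table, ["frame", "detections"], fill_missing=["detections"])` and receive a null `detections` column.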

Metadata

    Labels

    feat-dataframe-api: Everything related to the dataframe API
