
enh(skore): Implement option data_source="both" in PredictionError Display for the ComparisonReport #2156

@MarieSacksick


What would you like to say?

Part of #1874.

Frame

For the frame, let's add an extra column data_source and concatenate all the data into a single long-format dataframe.
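
To make the frame proposal concrete, here is a minimal pandas sketch. It is not skore's implementation: the helper name `concat_prediction_error_frames` and the columns `estimator`, `y_true` and `y_pred` are assumptions chosen for illustration, and the numbers are toy values.

```python
import pandas as pd


def concat_prediction_error_frames(frames_by_source):
    """Stack one frame per data source into a single long-format frame.

    `frames_by_source` maps a data source name ("train", "test") to a frame.
    The helper name and the column layout are hypothetical, not skore's API.
    """
    tagged = [
        frame.assign(data_source=source)  # extra column identifying the source
        for source, frame in frames_by_source.items()
    ]
    return pd.concat(tagged, ignore_index=True)


# Toy predictions for two estimators (illustrative values only).
train = pd.DataFrame(
    {
        "estimator": ["ridge", "ridge", "tree", "tree"],
        "y_true": [1.0, 2.0, 1.0, 2.0],
        "y_pred": [1.1, 1.9, 1.0, 2.0],
    }
)
test = pd.DataFrame(
    {
        "estimator": ["ridge", "tree"],
        "y_true": [3.0, 3.0],
        "y_pred": [2.7, 3.4],
    }
)

long_frame = concat_prediction_error_frames({"train": train, "test": test})
```

With this layout, filtering on `data_source` recovers either the train or the test rows, and grouping on `estimator` keeps the models separate.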

Plot

For the plot, we want to avoid putting too many scatter points on the same axes (the example here compares only two models, but there could easily be more), and to avoid comparing the train curve of one model with the test curve of another, which would not make sense. Let's use the subplot option, once it is implemented in #1445, to draw one subplot with all the train curves and one subplot with all the test curves.
Why not have one subplot per model, with train and test together? It is not necessarily a bad idea, and we could consider adding an option that offers both layouts. Yet, to start simple and iterate: one subplot per model is most useful when investigating a single model, which is the job of the EstimatorReport rather than the ComparisonReport.
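
A rough matplotlib sketch of the proposed layout, one subplot per data source with every estimator overlaid, could look like the following. It does not rely on the subplot option from #1445, which is not available yet. The function name `plot_residuals_by_data_source` and the frame columns are hypothetical, and the data are toy values.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd


def plot_residuals_by_data_source(long_frame):
    """Draw one residuals-vs-predicted subplot per data source.

    Expects the hypothetical columns `estimator`, `y_true`, `y_pred`
    and `data_source`; this is not skore's display API.
    """
    sources = list(long_frame["data_source"].unique())
    fig, axes = plt.subplots(
        1, len(sources), sharex=True, sharey=True,
        figsize=(5 * len(sources), 4), squeeze=False,
    )
    for ax, source in zip(axes[0], sources):
        subset = long_frame[long_frame["data_source"] == source]
        # Overlay all estimators for this data source on the same axes.
        for estimator, group in subset.groupby("estimator"):
            residuals = group["y_true"] - group["y_pred"]
            ax.scatter(group["y_pred"], residuals, label=estimator, alpha=0.6)
        ax.axhline(0, color="black", linestyle="--")  # perfect-prediction line
        ax.set_title(f"data_source = {source}")
        ax.set_xlabel("Predicted values")
        ax.legend()
    axes[0][0].set_ylabel("Residuals (actual - predicted)")
    return fig, axes[0]


# Toy long-format frame (illustrative values only).
long_frame = pd.DataFrame(
    {
        "estimator": ["ridge", "ridge", "tree", "tree", "ridge", "tree"],
        "y_true": [1.0, 2.0, 1.0, 2.0, 3.0, 3.0],
        "y_pred": [1.1, 1.9, 1.0, 2.0, 2.7, 3.4],
        "data_source": ["train", "train", "train", "train", "test", "test"],
    }
)

fig, axes = plot_residuals_by_data_source(long_frame)
```

Sharing both axes across the subplots keeps the train and test panels on the same scale, so a gap between them is visible without ever drawing one model's train points next to another model's test points.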

Labels

feature 🎁: Feature or enhancement
ready for dev 💻: Issue specified enough and ready to be implemented
