Conversation

@luisheb
Contributor

@luisheb luisheb commented May 12, 2025

Describe the proposed changes

Functionality to plot Mixed data objects represented as a pandas DataFrame.

Checklist before requesting a review

  • I have performed a self-review of my code
  • The code conforms to the style used in this package
  • The code is fully documented and typed (type-checked with Mypy)
  • I have added thorough tests for the new/changed functionality

@luisheb
Contributor Author

luisheb commented Jun 16, 2025

I'm not entirely satisfied with how the graphs look in the final example, but the core plotting logic is complete and ready for review.

@luisheb luisheb marked this pull request as ready for review June 16, 2025 18:37
@codecov

codecov bot commented Jun 23, 2025

Codecov Report

Attention: Patch coverage is 16.23932% with 98 lines in your changes missing coverage. Please review.

Project coverage is 86.58%. Comparing base (e44d8e5) to head (a877ad4).
Report is 1 commit behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| skfda/exploratory/visualization/representation.py | 16.23% | 98 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #687      +/-   ##
===========================================
- Coverage    87.04%   86.58%   -0.47%     
===========================================
  Files          199      199              
  Lines        14639    14732      +93     
===========================================
+ Hits         12742    12755      +13     
- Misses        1897     1977      +80     


@luisheb
Contributor Author

luisheb commented Jun 23, 2025

Any reference on how to write plot tests? Should I even do it? @vnmabus

@vnmabus
Member

vnmabus commented Jun 23, 2025

Any reference on how to write plot tests? Should I even do it? @vnmabus

We do not have any infrastructure for that at the moment, unfortunately. So they are not mandatory.

@eliegoudout
Contributor

Any reference on how to write plot tests? Should I even do it? @vnmabus

We do not have any infrastructure for that at the moment, unfortunately. So they are not mandatory.

My company will soon release a work project of mine, in which I developed an easy plot-testing fixture. Would you like me to ping you here once it is effectively open-sourced?
I implemented a match_plots fixture which captures all calls to plt.show() and compares them with previously saved expectations. Updating the expectations is done with pytest --update-plots.

@vnmabus
Member

vnmabus commented Jul 3, 2025

My company will soon release a work project of mine, in which I developed an easy plot-testing fixture. Would you like me to ping you here once it is effectively open-sourced? I implemented a match_plots fixture which captures all calls to plt.show() and compares them with previously saved expectations. Updating the expectations is done with pytest --update-plots.

I planned to have a look at the tools that Matplotlib itself uses for testing. For example, image_comparison looks similar to what you propose, except that you have to manually decorate each test and manually move the reference images to a particular location. Your approach sounds more comfortable to use.

However, I am a bit worried about the impact of storing a lot of images in the git repository. I think it would be better if the images could somehow be stored in a separate location (maybe in the same place as the latest version of the documentation?). Then we would not need to store a lot of slightly different versions of images in the repository, caused by changes in the plotting code (images are much more expensive than text in terms of space).

In any case, please feel free to ping us, as it would be good to know all the alternatives we have before committing to one.

@eliegoudout
Contributor

eliegoudout commented Jul 3, 2025

My fixture heavily relies on matplotlib.testing indeed. I personally chose to store plots in /test/expected/ (so, next to test/project/), and indeed it adds up if you have too many plots. I will definitely update you then!
@ego-thales

@ego-thales
Contributor

ego-thales commented Jul 25, 2025

@vnmabus @luisheb
As promised, I come back to you now that my work project is opensourced 🎉

Here's a direct link to match_plots
https://github.com/ThalesGroup/scio/blob/837e5c328e2faa4c154bda1146a017cbfe654a11/test/conftest.py#L261C1-L277C8

The use is detailed in the docstring. It has two CLI options (disable or update) and one runtime ignore() method to disable locally.

What it does

It

  • monkeypatches plt.show to instead save figures under a normalized name (depending on the test and the number of plots already produced in that test), e.g. blablabla.fig0.observed.png;
  • collects the expected plots saved at test/expected/<test_name>/<normalized names>.png;
  • compares each blablabla.fig0.observed.png with blablabla.fig0.png;
  • upon mismatch (within a very small tolerance I found necessary despite every reproducibility hack), fails the test and also creates blablabla.fig0.observed.diff.png (implemented by matplotlib directly). With --ignore-plots this step is skipped, and with --update-plots the old blablabla.fig0.png is replaced and the test is skipped with an update log.

Implementation

It is a bit "dirty" in some ways, as I did not want to have an infinitely long conftest.py (already long enough…), but it should still be quite robust.

Notes

  • If you update multiple plots in the same test, the multiple pytest.skip calls will eventually fail the test (see Double pytest.skip leads to failure, pytest-dev/pytest#13537), but this does not hurt the logic, only the report.
  • If you KeyboardInterrupt during the test, you might end up with an extra blablabla.fig0.observed.png file.
  • You can also check out the match_plots_torch fixture, which I use to automatically call ignore() unless the dtype is torch.float and the device is torch.device("cpu").
  • You can also check out the analogous match_outerr fixture for stdout/stderr captures, and match_arrays, which works for numpy and torch arrays and can be extended to any class of arrays.
  • You can browse my test/expected folder to get an idea of what it concretely does. But in terms of file size, since test/ is not shipped with my releases, I do not care too much. Also, it is still quite reasonable.

Finally, I would like to say thanks for your work, since your repository was a good inspiration for my own developments. You might find a lot of similarities in the presentation of scio in general; it's not by chance!

All the best!
