
Added simple pre and post processing functionality #361


Open · wants to merge 50 commits into main

Conversation

HenningScheufler
Collaborator

The goal is to have an initial implementation that serves as a discussion platform for pre- and post-processing functionality.

@codecov-commenter

codecov-commenter commented Apr 30, 2025

Codecov Report

Attention: Patch coverage is 95.57823% with 13 lines in your changes missing coverage. Please review.

Project coverage is 89.19%. Comparing base (f3bf41c) to head (a8dde24).

Files with missing lines Patch % Lines
foamlib/postprocessing/table_reader.py 94.25% 5 Missing ⚠️
foamlib/postprocessing/load_tables.py 96.73% 3 Missing ⚠️
foamlib/preprocessing/system.py 72.72% 3 Missing ⚠️
foamlib/preprocessing/of_dict.py 91.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #361      +/-   ##
==========================================
+ Coverage   88.86%   89.19%   +0.33%     
==========================================
  Files          15       22       +7     
  Lines        1706     1999     +293     
==========================================
+ Hits         1516     1783     +267     
- Misses        190      216      +26     

☔ View full report in Codecov by Sentry.

@HenningScheufler HenningScheufler added the enhancement New feature or request label Apr 30, 2025
@HenningScheufler
Collaborator Author

HenningScheufler commented May 9, 2025

General concept:

The goal is to collect postprocessing data from parametric studies into a unified format.

With a given output_file (a lightweight abstraction over a file path), all cases are gathered into a single DataFrame:

    file = OutputFile(file_name="force.dat", folder="forces")
    file.add_time(0)

    table = load_tables(output_file=file, dir_name="tests/test_postprocessing/Cases")

To simplify usage further, it's also possible to list all output_files in a folder containing multiple cases:

     output_files = list_outputfiles("tests/test_postprocessing/Cases")

Planned features:

  • Add filtering capabilities, e.g., to extract the maximum surface elevation at a specific time step.
  • Enable looping over time directories.
  • Append case category metadata to the DataFrame, which requires reading metadata (e.g., JSON files generated during preprocessing).
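The planned filtering could look something like the following sketch on the combined long-format table. The column names (`case`, `time`, `surface_elevation`) are illustrative assumptions, not the PR's actual schema:

```python
import pandas as pd

# Hypothetical combined table as load_tables might return it; the
# column names here are assumptions for illustration only.
df = pd.DataFrame(
    {
        "case": ["caseA", "caseA", "caseB", "caseB"],
        "time": [0.0, 1.0, 0.0, 1.0],
        "surface_elevation": [0.10, 0.40, 0.20, 0.30],
    }
)

# Extract the maximum surface elevation per case at a specific time step.
at_time = df[df["time"] == 1.0]
max_elevation = at_time.groupby("case")["surface_elevation"].max()
```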

Status

The implementation definitely needs polishing, but the open question is how to present the post-processing functionality to the user:

  1. Generate a Python file from the CLI that contains the post-processing functionality (function-based):
    file = OutputFile(file_name="xxxxxxx", folder="xxxx")
    file.add_time(0)

    table = load_tables(output_file=file, dir_name="tests/test_postprocessing/Cases")
  2. Provide a wrapper class that gathers all the results --> not sure how the top-level code would look

What interface would you choose, or do you have another idea?

@gerlero @greole @bevanwsjones

@bevanwsjones

bevanwsjones commented May 9, 2025

Hey @HenningScheufler this is a great idea. A quick comment - probably a good idea to use multi-indexing on the DataFrame side for each case to make the data more 'manipulable'.
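The multi-indexing suggestion could be sketched like this; the column names are illustrative assumptions, not the PR's actual schema:

```python
import pandas as pd

# Index the combined table by (case, time) so each case can be sliced
# out directly. The columns here are assumptions for illustration.
df = pd.DataFrame(
    {
        "case": ["caseA", "caseA", "caseB"],
        "time": [0.0, 1.0, 0.0],
        "force": [1.0, 2.0, 3.0],
    }
)
indexed = df.set_index(["case", "time"]).sort_index()

# Select all rows for one case via the outer index level.
case_a = indexed.loc["caseA"]
```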

@gerlero
Owner

gerlero commented May 9, 2025

What interface would you choose, or do you have another idea?

@HenningScheufler nice work so far. I think I have a better idea now on what you intended for the postprocessing module.

No hard opinion on the API right now, except that I'd make OutputFile (and likely also the load_tables function) take an argument named path instead of the current ones (to match how FoamCase and FoamFile work).
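The suggested rename might look like the following hypothetical sketch (not the merged API); the class body is an assumption for illustration:

```python
from __future__ import annotations

from pathlib import Path


# Hypothetical sketch: OutputFile takes a single "path" argument,
# mirroring how FoamCase and FoamFile are constructed.
class OutputFile:
    def __init__(self, path: Path | str) -> None:
        self.path = Path(path)


f = OutputFile("forces/force.dat")
```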

What I'm not so sold on is that it's necessary to add a dependency on pandas. Not saying pandas is too heavy a dependency, but given that the core functionality of the package doesn't require it, would it make sense to drop the pandas dependency or make it optional? I'll hear you out.

@HenningScheufler
Collaborator Author

This PR aims to rapidly develop a proof of concept that demonstrates the core idea: extending foamlib with additional functionality to simplify both preprocessing and postprocessing in OpenFOAM workflows.

To test the implementation, navigate to example/parametricStudy and run:

python createStudy.py 
python runStudy.py
python gatherResults.py 

This will generate a results/ folder containing three CSV files, each representing collected and combined results from the parametric study.

Next steps

  • Refactor preprocessing logic to replace the current minimal prototype (the hacky script)

Discussion

The general idea behind this PR is to provide a higher-level interface within foamlib that lowers the barrier of entry for new OpenFOAM users and helps them perform parametric studies more efficiently. This would make foamlib not just a utility library, but a framework that enables non-expert users to:

  • Easily define parameter sweeps

  • Automate case setup and execution

  • Collect and analyze results systematically

  • Quickly visualize outcomes through a dashboard or reporting tool

To support this, the library will likely need dedicated preprocessing and postprocessing modules, as well as optional visualization components. These additions should aim to reduce the complexity typically associated with OpenFOAM workflows, especially for researchers, students, or engineers unfamiliar with Python.

In this discussion, the goal is to focus on the direction and scope of the library:

  • Do we want to expand foamlib into a more user-facing toolkit?

  • Should it offer higher-level abstractions for study definitions and automation?

Technical details — including code structure, dependencies, and implementation specifics — are out of scope for now and can be tackled at a later stage.

@HenningScheufler
Collaborator Author

Hey @HenningScheufler this is a great idea. A quick comment - probably a good idea to use multi-indexing on the DataFrame side for each case to make the data more 'manipulable'.

The long table format makes it possible to combine tables with different sampling rates and is also the preferred format for plotting (Plotly Express and Seaborn). It is possible to switch between representations with pd.pivot and pd.melt.
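Switching between the two representations could be sketched as follows; the column names are illustrative assumptions:

```python
import pandas as pd

# Long format: one row per (time, field) observation.
long = pd.DataFrame(
    {
        "time": [0.0, 0.0, 1.0, 1.0],
        "field": ["Fx", "Fy", "Fx", "Fy"],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Long -> wide (one column per field):
wide = long.pivot(index="time", columns="field", values="value").reset_index()

# Wide -> long again:
back = wide.melt(id_vars="time", var_name="field", value_name="value")
```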

@HenningScheufler
Collaborator Author

@gerlero I cannot figure out why the Python 3.7 and 3.8 builds crash. Do you have any ideas?

@gerlero
Owner

gerlero commented May 27, 2025

@gerlero I cannot figure out why the Python 3.7 and 3.8 builds crash. Do you have any ideas?

Likely missing from __future__ import annotations at the top of the files that cause the errors
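As a minimal illustration of the fix, this future import stops annotations from being evaluated at runtime, so newer annotation syntax parses on Python 3.7/3.8 too:

```python
# Without this import, "list[str] | None" raises a TypeError at function
# definition time on Python 3.8; with it, annotations stay as strings.
from __future__ import annotations


def case_names(values: list[str] | None = None) -> list[str]:
    return values or []
```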

@HenningScheufler
Collaborator Author

I wrote the documentation could you please review it?

@TessellateDataScience

TessellateDataScience commented May 29, 2025

I'm happy to develop 'meshing.py' & 'dataAnalysis.py' for my particular Lagrangian-focused case. At this stage I'm thinking to abstract OF, foamlib, and these submodules (including PyVista) into a Docker image that computes a particular jet-like turbulent flow.

If you folks wanted I could keep a foamlib-focussed image (essentially OF v12 with cfMesh) at the same time?

@TessellateDataScience

TessellateDataScience commented May 29, 2025

python runStudy.py

Thanks Henning! I was thinking: during runs, some cases diverge (throw an error). If foamlib could automagically restart with modified numerical parameters (e.g., changing foam.fvSolution[...]), that would be helpful, no? Given that I reactively check the solving process.

@HenningScheufler
Collaborator Author

@gerlero I cannot figure out why the Python 3.7 and 3.8 builds crash. Do you have any ideas?

Likely missing from __future__ import annotations at the top of the files that cause the errors

Older versions of pydantic need typing.List.
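The reason is that libraries like older pydantic inspect annotations at runtime, so the typing generics are required on Python 3.7/3.8 even with the `__future__` import in place. A sketch using a dataclass (the model here is hypothetical):

```python
from dataclasses import dataclass
from typing import List


# typing.List works on Python 3.7/3.8; the builtin list[str] would fail
# there for any tool that resolves annotations at runtime.
@dataclass
class StudyConfig:
    cases: List[str]


config = StudyConfig(cases=["caseA", "caseB"])
```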

@HenningScheufler
Collaborator Author

I'm happy to develop 'meshing.py' & 'dataAnalysis.py' for my particular Lagrangian-focused case. At this stage I'm thinking to abstract OF, foamlib, and these submodules (including PyVista) into a Docker image that computes a particular jet-like turbulent flow.

If you folks wanted I could keep a foamlib-focussed image (essentially OF v12 with cfMesh) at the same time?

Could you elaborate on this? I don't completely understand your goals. Maybe you could create a new issue describing them.

@HenningScheufler
Collaborator Author

python runStudy.py

Thanks Henning! I was thinking: during runs, some cases diverge (throw an error). If foamlib could automagically restart with modified numerical parameters (e.g., changing foam.fvSolution[...]), that would be helpful, no? Given that I reactively check the solving process.

This will be quite challenging to implement, especially considering that OpenFOAM is typically run on large computing clusters using a job scheduling system. Moreover, simulations involving very large meshes can cost several thousand euros per run.

@HenningScheufler
Collaborator Author

@gerlero ready to review

@gerlero
Owner

gerlero commented May 30, 2025

@HenningScheufler Thanks! I’ll take a look over the next week.

@TessellateDataScience

Could you elaborate on this? I don't completely understand your goals. Maybe you could create a new issue describing them.

I'm looking to reduce the setup needed for complete CFD novices to compute jet-like flows for health-related innovations (more information on our GitHub). Basically, a 'turnkey' app (currently in Docker) that allows higher horizontal scaling (following Folding@home) using a citizen-science approach. If that's not clear enough, let me know.

@TessellateDataScience

This will be quite challenging to implement, especially considering that OpenFOAM is typically run on large computing clusters using a job scheduling system.

I'm hearing that I'm probably an atypical user, so it's probably better not to listen to me. Thanks Henning.

@gerlero
Owner

gerlero commented Jun 6, 2025

@HenningScheufler I've checked out the branch and built the new docs. The new functionality seems fine to me. Right now I'm not able to review everything more thoroughly, but I trust that you've taken the overall design into careful consideration (and that you're using it yourself and it provides good value); in any case, if there's any particular design decision you want to discuss, I'm happy to review that/those specific part(s) of the code.

  • Regarding the placement of the new modules: I see three scenarios for this:

    1. Keep them as they are (sort of what we've already decided). This to me means that the functionality in these modules is part of the package, but clearly separate from the core functionality and may require installing extra dependencies. Any changes in these modules should not break users except between major versions of foamlib
    2. Temporarily/permanently move the new modules to a contrib (or similar) sub-package, which would allow for independent development iteration, as the naming itself (at least to me) would clearly hint at the fact that these modules might not follow the versioning guarantees of the main package.
    3. For functionality that is not expected to change much and doesn't need extra dependencies, I'd actually welcome that as additions to the core functionality. E.g., I believe the "parametric study" functionality is a candidate for inclusion in the core library—though it's not a requirement for me.
  • Regarding new dependencies not needed for the core functionality: I'd prefer these to be listed as optional extras (e.g. pip install foamlib[postprocessing]). Then the module itself can raise an error asking the user to run that command when imported with missing dependencies.

  • Regarding the docs: docs should make it clearer which pages correspond to core functionality and which are submodules. In particular, the core functionality should go first, while the submodule pages should make it more obvious that they are independent modules that need to be imported separately.
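The optional-extra suggestion could be implemented along these lines; the helper function and its name are hypothetical, and the extra name "postprocessing" is taken from the example above:

```python
import importlib


def require_extra(module_name: str, extra: str):
    """Import a module or raise an error naming the pip extra to install.

    Hypothetical helper sketching the pattern described above: a submodule
    calls this at import time for each optional dependency it needs.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"This submodule requires {module_name!r}; "
            f"install it with: pip install foamlib[{extra}]"
        ) from e
```

A submodule would then start with something like `pd = require_extra("pandas", "postprocessing")` instead of a bare `import pandas as pd`.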

Let me know if I'm missing something you want me to comment on (and thanks again for all the work).

@HenningScheufler
Collaborator Author

@gerlero

Keep them as they are (sort of what we've already decided). This to me means that the functionality in these modules is part of the package, but clearly separate from the core functionality and may require installing extra dependencies. Any changes in these modules should not break users except between major versions of foamlib

Agreed. Do you plan to add any major features in the future (1.x.0), or are you mainly looking at fixes at the moment (1.0.x)?

Temporarily/permanently move the new modules to a contrib (or similar) sub-package, which would allow for independent development iteration, as the naming itself (at least to me) would clearly hint at the fact that these modules might not follow the versioning guarantees of the main package.

This could be tackled later once the package gets larger, but currently I would ship it as one package.

For functionality that is not expected to change much and doesn't need extra dependencies, I'd actually welcome that as additions to the core functionality. E.g., I believe the "parametric study" functionality is a candidate for inclusion in the core library—though it's not a requirement for me.

The additional dependencies can be installed with (name TBD):

pip install foamlib[parametric_sweep]

The documentation needs more work but is currently in a sufficient state (and should be tackled in a separate PR).

For the way forward, a dashboard and the CLI should be added.

What is the merge strategy? Should we merge this feature first, then the CLI and the dashboard, or do you prefer a different approach?
