About PUMLE

PUMLE (a play on the word "plume") is a project under the CO2SS Project by the TRIL Lab / CCS Team. Its primary goals are to:

Produce plume migration data from numerical simulations run with the MRST software.
Feed physics-informed machine learning experiments with high-quality, consistent datasets.
Build an end-to-end ingestion/consumption data engineering pipeline for geological carbon storage applications in Brazilian reservoirs.

PUMLE consolidates simulation outputs, processes them into multidimensional (5D) arrays, and offers the ability to export data in various formats (NumPy, Zarr, MAT-files, CSV) as well as upload final results to cloud storage.
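As a rough illustration of the consolidation step, the sketch below stacks per-simulation arrays into a single 5D array and flattens it into CSV-ready rows. The axis order (simulation, timestep, x, y, z), the function names, and the synthetic data are assumptions made for this example; they are not taken from the PUMLE code base.

```python
import io

import numpy as np

# Hypothetical axis order; the real PUMLE layout may differ.
AXES = ("simulation", "timestep", "x", "y", "z")


def consolidate(runs):
    """Stack per-simulation 4D arrays (timestep, x, y, z) into one 5D array."""
    return np.stack(runs, axis=0)


def to_rows(data):
    """Flatten an N-D array into rows of (index_0, ..., index_n, value)."""
    # np.indices yields one index grid per axis; reshaping keeps C order,
    # which matches the order produced by data.ravel().
    index_columns = np.indices(data.shape).reshape(data.ndim, -1).T
    return np.column_stack([index_columns, data.ravel()])


# Two synthetic "simulations", each filled with a constant value.
runs = [np.full((3, 4, 4, 2), fill_value=float(i)) for i in range(2)]
data = consolidate(runs)
rows = to_rows(data)

# Write the tabular form as CSV text (an in-memory buffer stands in for a file).
buffer = io.StringIO()
np.savetxt(buffer, rows, delimiter=",", header=",".join(AXES + ("value",)), comments="")
csv_text = buffer.getvalue()
```

The same `data` array could be written with `np.save` for the NumPy format; the other formats mentioned above need their own libraries and are left out of this sketch.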

Process Overview

This flowchart summarizes PUMLE's purpose and high-level workflow:

flowchart TD
    n1(("Start")) --> n2("fa:fa-file Input simulation parameters")
    n2 --> n3["fa:fa-file-lines Process 'setup.ini'"]
    n3 --> n4["fa:fa-file-export Export Matlab structs"]
    n4 --> n5["fa:fa-file-import Load structs into m-file"]
    n5 --> n6["fa:fa-gear Run Matlab simulation"]
    n6 --> n7["fa:fa-database Store simulation results"]
    n7 --> n8{"API"}
    n8 -- CSE --> n9["fa:fa-arrow-up-right-dots Forward modeling"]
    n8 -- ML --> n10["fa:fa-brain Machine learning"]
    n9 --> n11["fa:fa-list-check Data quality assessment"]
    n10 --> n11
    n11 --> n12{"Consistency?"}
    n12 -- Not OK --> n2
    n12 -- OK --> n13(("End"))
    style n1 stroke:#00C853
    style n2 stroke:#2962FF
    style n3 stroke:#2962FF
    style n4 stroke:#2962FF
    style n5 stroke:#2962FF
    style n6 stroke:#2962FF
    style n7 stroke:#2962FF
    style n8 stroke:#FF6D00
    style n9 stroke:#2962FF
    style n10 stroke:#2962FF
    style n11 stroke:#2962FF
    style n12 stroke:#FF6D00
    style n13 stroke:#D50000

Features

Parameter Variation & Caching:
Generate multiple simulation parameter combinations and cache them to avoid redundant simulation runs.
Simulation Management:
Integrate with MRST software via MATLAB scripts to execute numerical simulations and process simulation outputs.
Data Consolidation:
Consolidate simulation outputs into multidimensional arrays, transform them into a unified “golden” dataset, and support different output formats (NumPy, Zarr, MAT-files, CSV).
Cloud Storage Integration:
Optionally upload consolidated outputs to cloud storage (e.g., Amazon S3) using built-in S3 upload functionality.
Metadata Handling:
Process, validate, and export simulation metadata (bronze, silver, and golden layers) using Pandas and Pandera.
Tabular Conversion:
Transform high-dimensional simulation data into tabular (CSV) format for further analysis or consumption by downstream applications.
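The parameter variation and caching idea listed above can be sketched with the standard library alone. This is a minimal illustration, not PUMLE's actual implementation: the parameter names, the hashing scheme, and the in-memory cache are all assumptions chosen for the example.

```python
import hashlib
import itertools
import json


def parameter_combinations(grid):
    """Yield one dict per point in the Cartesian product of the grid."""
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def cache_key(params):
    """Return a stable hash of a parameter set, independent of key order."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def pending_runs(grid, cache):
    """Return combinations not yet in the cache and record them as seen."""
    todo = []
    for params in parameter_combinations(grid):
        key = cache_key(params)
        if key not in cache:
            cache.add(key)
            todo.append(params)
    return todo


# Hypothetical parameter names and values.
grid = {"porosity": [0.1, 0.2], "injection_rate": [1.0, 2.0, 3.0]}
cache = set()

first = pending_runs(grid, cache)   # every combination is new
second = pending_runs(grid, cache)  # nothing left to simulate
```

A persistent cache would store the keys on disk (for example in a JSON file) so that skipped runs survive between sessions; the set used here only lives for the duration of the process.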

Installation

PUMLE is organized as a Python package and can be installed with pip. Once the package is published on PyPI, run:

pip install pumle

Alternatively, if you are developing or using it locally, clone the repository and install with:

pip install .

Additionally, create a conda environment using the provided environment file:

conda env create -f environment.yml -n pumle-env
conda activate pumle-env

Usage

A typical workflow involves configuring the pipeline via a configuration dictionary or setup.ini file, then running the pipeline to process simulation parameters, execute simulations, consolidate results, and (optionally) upload to cloud storage.

The repository’s main.py serves as an example script. Run it with:

python main.py

As this example shows, after installation a user imports the Pumle class from the package, configures it, and runs the pipeline. The caching in the parameter variation module ensures that parameter combinations that have already been simulated are skipped.
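For the configuration step, a setup.ini file can be read into a plain dictionary with Python's standard configparser module. The section and option names below are invented for illustration and are not PUMLE's real configuration schema; GLOSSARY.md describes the actual parameters.

```python
import configparser

# Hypothetical contents; the real setup.ini uses its own sections and keys.
SAMPLE_INI = """
[Fluid]
co2_density = 760.0

[Schedule]
timesteps = 50
"""


def load_config(text):
    """Parse INI text into a nested dict of {section: {option: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


config = load_config(SAMPLE_INI)

# configparser returns every value as a string, so numeric options
# need an explicit conversion before use.
timesteps = int(config["Schedule"]["timesteps"])
co2_density = float(config["Fluid"]["co2_density"])
```

To read from disk instead of a string, `parser.read("setup.ini")` can replace `read_string`.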

Development & Contributing

Project Structure

pumle_project/
├── setup.py
├── pyproject.toml       # Optional: for modern packaging standards
├── README.md
├── LICENSE
├── requirements.txt
├── MANIFEST.in          # Optional: include additional files
└── src/
    └── pumle/           # Package code
        ├── __init__.py  # Contains __version__ and key imports
        ├── arrays.py
        ├── cloud_storage.py
        ├── ini.py
        ├── mat_files.py
        ├── metadata.py
        ├── parameters.py
        ├── parameters_variation.py
        ├── paths.py
        ├── sim_results_parser.py
        ├── tabular.py
        └── utils.py

License

PUMLE is released under the MIT License.

Acknowledgements

CO2SS Project – For inspiring the simulation use case.
TRIL Lab / CCS Team – For the foundational research and development.
Contributors: Gustavo Oliveira, Luiz Fernando Santos, Samuel Mendes

Remarks

Environment Configuration:
Modify the prefix in environment.yml as needed for your local setup.
Further Documentation:
Refer to GLOSSARY.md for detailed descriptions of configuration parameters.

