GitHub - chipnbits/flowtrain_stochastic_interpolation: Generative AI for geogen data with stochastic interpolation

📦 Installation

This repository includes the flowtrain package for stochastic interpolation and managing machine learning models. The source code is located in src/flowtrain.

Make sure that Python 3.12 is installed first through an environment manager, e.g. conda create -n flowtrain-test python=3.12 -y.To install the package in editable mode, navigate to the project root (where setup.py is located) and run:

    pip install -e .

This package also installs a dependency for synthetic geological data generation:
StructuralGeo, which will be installed automatically via pip.

StructuralGeo Integration

The project/ directory contains code and supporting files for training and evaluating flow-based models on 3D StructuralGeo data. These models are designed for stochastic interpolation using flow-matching techniques.

The codebase is built on:

StructuralGeo for synthetic geological data generation GitHub
PyTorch for deep learning
PyTorch Lightning for cleaner training loops
Weights & Biases (wandb) for experiment tracking

Pretrained Models

Pretrained models for both unconditional and conditional generation at 64³ resolution are available. These weights are downloaded automatically if not found on tge first use and stored in the project/*/demo_model/ directory.

If desired, the weights can also be downloaded manually from the v1.0.0 GitHub release:

conditional-weights.ckpt
unconditional-weights.ckpt with training run WandB

Model Configuration (Summary)

Base channels: 48
Channel multipliers: (1, 2, 2, 3, 4)
Time embeddings: Learned Fourier (1024 dim, bandwidth 1000)
Attention: Enabled at all scales, 4 heads, dim_head = 32
Conditioning: ATb embedding with ATb mixing at every resolution
Training: LR = 1e-3, EMA = 0.9995, t ∈ [1e-4, 0.9999], batch=8

Usage

Unconditional Model

Training: project/geodata-3d-unconditional/train_unconditional.py Training parameters can be edited via the get_config() function in the script, currently set to values used in training the saved demo model. To train on multiple GPUs, use the --train-devices flag.

cd project/geodata-3d-unconditional

python model_train_inference.py --mode train --train-devices 0,1

Inference demo: Use the main() function in the same script to run inference with pretrained weights. Optional flags include:
- --mode: Set to train, inference, or both (default: inference)
- --n-samples: Number of samples to generate (default: 8)
- --batch-size: Batch size for inference (default: 1)
- --seed: Random seed for reproducibility (default: 100)
- --no-save-images: Disable saving visualization images (default: save images)
- --infer-device: Device for inference, e.g., cuda or cpu (default: cpu)
- --checkpoint_path: Path to custom checkpoint file to override pretrained weights (default: use pretrained weights) The pretrained model will be automatically downloaded if not found locally. Note that the pretrained weights are setup to load automatically, custom training checkpoint loading is available with the --checkpoint_path flag.

cd project/geodata-3d-unconditional

# Saves tensors + PNGs to project/samples/<project_name>/
python model_train_inference.py --mode inference --n-samples 8 --batch-size 2 --seed 100 --infer-device cuda

Conditional Training & Inference

Conditional training and inference requires an additional step to set up the surface and borehole data from a random generated StructuralGeo streaming data sample.

Training: model_train_sh_inference_cond.py

Training parameters can be adjusted via the get_config() function inside of the script. Script is set to use the same set of hyper parameters that were used for the pretrained conditional model provided.

cd project/geodata-3d-conditional

python model_train_sh_inference_cond.py

Inference: A Jupyter notebook project/geodata-3d-conditional/inference_demo.ipynb is provided to demonstrate generating conditional data, loading the saved weights, and running inference with the pretrained model. An additional probabilistic analysis using an ensemble of models is also included, making use of compressed data in the dikes_ptpack.tar.gz archive.

An automated python script has also been provided to automatically generate synthetic geology, extract borehold data, and produce reconstructions:

cd project/geodata-3d-conditional

python model_inference_experiments.py --n-samples 4 --n-scenarios 1

Available flags include:

--device: Device for inference, e.g., cuda or cpu (default: cpu)
--n-samples: Number of samples to generate per scenario (default: 1)
--n-scenarios: Number of different geological scenarios to generate for sample reconstruction (default: 4)
--use-ema: Use EMA weights for inference (default: True)
--no-display: Disable displaying images during inference (default: display images)
--checkpoint_path: Path to custom checkpoint file to override pretrained weights (default: use pretrained weights)

Name		Name	Last commit message	Last commit date
Latest commit History 46 Commits
paper-figures		paper-figures
project		project
src/flowtrain		src/flowtrain
tests		tests
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
pyproject.toml		pyproject.toml
readme.md		readme.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📦 Installation

StructuralGeo Integration

Pretrained Models

Model Configuration (Summary)

Usage

Unconditional Model

Conditional Training & Inference

About

Uh oh!

Releases 3

Packages

Contributors 2

Uh oh!

Languages

License

chipnbits/flowtrain_stochastic_interpolation

Folders and files

Latest commit

History

Repository files navigation

📦 Installation

StructuralGeo Integration

Pretrained Models

Model Configuration (Summary)

Usage

Unconditional Model

Conditional Training & Inference

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Uh oh!

Languages

Packages