Crowpeas is a Python package for Extended X-ray Absorption Fine Structure (EXAFS) fitting using neural networks. It provides tools to generate synthetic spectral data, train various neural network models, and make predictions on experimental data.
- Generate synthetic EXAFS spectra for training
- Train neural network models:
  - Multi-Layer Perceptron (MLP)
  - Convolutional Neural Network (CNN) [in testing]
- Make predictions on experimental data with uncertainty estimation
- Visualize results and compare with traditional fitting approaches
- Optional PyTorch model compilation for faster execution (requires C++ compiler)
The easiest way to install crowpeas is using the provided installation script, which:
- Checks for and installs the uv package manager if needed
- Sets up a virtual environment
- Installs PyTorch with appropriate CUDA support
- Installs crowpeas and its dependencies
# Clone the repository
git clone https://github.com/BNL-ML-Group/Crowpeas.git
cd Crowpeas
# Run the installation script
python install.py
After installation, activate the virtual environment:
# On Linux/macOS
source .venv/bin/activate
# On Windows
.venv\Scripts\activate
If you prefer to install manually, follow these steps:
- Install PyTorch, choosing the index URL that matches your CUDA version for GPU acceleration (the command below targets CUDA 12.1):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
- Install crowpeas:
pip install git+https://github.com/BNL-ML-Group/Crowpeas.git
crowpeas -g
This will create a config.toml file and a Pt_feff0001.dat file in your current directory.
crowpeas -d -t -v config.toml
This command:
- Generates a synthetic dataset (-d)
- Trains a neural network model (-t)
- Validates the model and creates plots (-v)
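For scripted runs, the flag combinations described here can be assembled programmatically. The following is a minimal sketch, not part of crowpeas itself: the helper names `crowpeas_command` and `run_if_installed` are hypothetical, and only the short flags documented in this README (`-d`, `-t`, `-v`, `-e`) are used.

```python
import shutil
import subprocess


def crowpeas_command(config, dataset=False, train=False, validate=False, experiment=False):
    """Build an argv list for the crowpeas CLI (hypothetical helper)."""
    flags = []
    if dataset:
        flags.append("-d")  # generate a synthetic dataset
    if train:
        flags.append("-t")  # train a neural network model
    if validate:
        flags.append("-v")  # validate the model and create plots
    if experiment:
        flags.append("-e")  # make predictions on experimental data
    return ["crowpeas", *flags, str(config)]


def run_if_installed(argv):
    """Run the command only when the executable is on PATH; otherwise just report it."""
    if shutil.which(argv[0]) is None:
        print("not found on PATH; would run:", " ".join(argv))
        return None
    return subprocess.run(argv, check=True)


# Same invocation as the one-line command shown above.
full_pipeline = crowpeas_command("config.toml", dataset=True, train=True, validate=True)
# Separate prediction step on experimental data.
predict = crowpeas_command("config.toml", experiment=True)
print(" ".join(full_pipeline))
print(" ".join(predict))
```

Calling `run_if_installed(full_pipeline)` inside the activated virtual environment would then launch the run; with `check=True`, a non-zero exit status raises `subprocess.CalledProcessError` instead of being silently ignored.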
# Make predictions on experimental data
crowpeas -e config.toml
Crowpeas uses TOML configuration files to specify:
- Training dataset parameters
- Neural network architecture and hyperparameters
- Experimental data paths and settings
Example configuration:
[general]
title = "Pd training"
mode = "training"
output_dir = "output"
[training]
feffpath = "Pt_feff0001.dat"
training_set_dir = "training_set"
num_examples = 10000
input_type = "q"
k_range = [2.5, 12.5]
k_weight = 2
[neural_network.architecture]
type = "MLP"
activation = "gelu"
output_dim = 4
hidden_dims = [1024, 516, 516, 516]
Crowpeas provides a rich command-line interface with the following options:
- -g, --generate: Generate a sample configuration file
- -d, --dataset: Generate a synthetic dataset
- -t, --training: Train a neural network model
- -r, --resume: Resume training from a checkpoint
- -v, --validate: Validate the model and create plots
- -e, --experiment: Make predictions on experimental data
- -p, --plot: Plot experimental data
- --data-path: Specify path to experimental data (with -p)
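Because the configuration file passed to these options is plain TOML, it can be inspected from Python before starting a long run. The sketch below parses a fragment that reuses values from the example configuration in this README and applies a few illustrative sanity checks. The `check_config` helper and its rules are assumptions made for this example and are not crowpeas' own validation logic.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:  # older interpreters need: pip install tomli
    import tomli as tomllib

# Fragment reusing values from the example configuration in this README.
EXAMPLE_TOML = """
[general]
title = "Pd training"
mode = "training"
output_dir = "output"

[training]
feffpath = "Pt_feff0001.dat"
num_examples = 10000
k_range = [2.5, 12.5]
k_weight = 2

[neural_network.architecture]
type = "MLP"
activation = "gelu"
output_dim = 4
hidden_dims = [1024, 516, 516, 516]
"""


def check_config(cfg):
    """Return a list of problems found by a few illustrative checks (hypothetical rules)."""
    problems = []
    training = cfg.get("training", {})
    k_range = training.get("k_range", [])
    if len(k_range) != 2 or not k_range[0] < k_range[1]:
        problems.append("training.k_range should be [k_min, k_max] with k_min < k_max")
    if training.get("num_examples", 0) <= 0:
        problems.append("training.num_examples should be a positive integer")
    arch = cfg.get("neural_network", {}).get("architecture", {})
    dims = arch.get("hidden_dims", [])
    if not dims or any(not isinstance(d, int) or d <= 0 for d in dims):
        problems.append("hidden_dims should be a non-empty list of positive integers")
    return problems


config = tomllib.loads(EXAMPLE_TOML)
problems = check_config(config)
print(problems or "config looks consistent")
```

To check a file on disk instead of an inline string, open it in binary mode and use `tomllib.load(f)`. Note that the dotted table name `[neural_network.architecture]` is parsed into nested dictionaries, which is why the helper reads `cfg["neural_network"]["architecture"]`.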
CROWPEAS_COMPILE: Environment variable. Set to "1", "true", or "yes" to enable PyTorch model compilation for faster execution. Requires a working C++ compiler (default: disabled).
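As an illustration of how such an opt-in switch is commonly read, here is a minimal sketch. It is an assumption about typical handling, not crowpeas' actual code: the function name `compile_requested` is hypothetical, and the whitespace stripping and case-insensitive comparison are choices made for this example.

```python
import os

# Values this README lists as enabling compilation.
TRUTHY = {"1", "true", "yes"}


def compile_requested(environ=None):
    """Return True when CROWPEAS_COMPILE holds one of the enabling values.

    Hypothetical helper: normalising with strip() and lower() is an
    assumption, not documented crowpeas behaviour.
    """
    env = os.environ if environ is None else environ
    return env.get("CROWPEAS_COMPILE", "").strip().lower() in TRUTHY


# An unset variable means compilation stays disabled, matching the documented default.
print(compile_requested({}))
print(compile_requested({"CROWPEAS_COMPILE": "yes"}))
```

In a shell, the variable would typically be exported before invoking the command line tool. From Python, setting `os.environ["CROWPEAS_COMPILE"] = "1"` before launching a subprocess has the same effect for that child process.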
For development, install the package with development dependencies:
pip install -e ".[dev]"
Then you can run:
# Run tests
pytest
# Run code style checks
ruff check .
# Run type checks
pyright
This project is licensed under the MIT License - see the LICENSE file for details.
If you use crowpeas in your research, please cite:
@article{MARCELLA2025116145,
title = {First shell EXAFS data analysis of nanocatalysts via neural networks},
journal = {Journal of Catalysis},
volume = {447},
pages = {116145},
year = {2025},
issn = {0021-9517},
doi = {https://doi.org/10.1016/j.jcat.2025.116145},
url = {https://www.sciencedirect.com/science/article/pii/S0021951725002106},
author = {Nicholas Marcella and Ryuichi Shimogawa and Yongchun Xiang and Anatoly I. Frenkel},
abstract = {Understanding the mechanisms of work of nanoparticle catalysts requires the knowledge of their structural and electronic descriptors, often measured in operando X-ray absorption fine structure (XAFS) spectroscopy experiments. We introduce a neural-network-based framework for rapidly mapping the extended XAFS (EXAFS) spectra onto structural parameters as an alternative to the commonly used non-linear least-squares fitting approaches. Our method leverages a multilayer perceptron trained on theoretical EXAFS and validated against theoretical test data and experimental spectra of frequently used nanoparticle types. The network helps lower the correlation between parameters, achieves high accuracy in the presence of noise and glitches, and can provide real-time parameter predictions with minimal user intervention. Parameter uncertainties are estimated as well. This method can be readily integrated into beamline pipelines or laboratory data analysis workflow and has the potential to accelerate high-throughput catalyst characterization and testing.}
}
Contributions are welcome! Please feel free to submit a Pull Request.