black-hole-group/gpumonty

GPUmonty: A GPU-Accelerated Relativistic Monte Carlo Code

GPUmonty is a GPU-accelerated Monte Carlo radiative transfer code, built on NVIDIA CUDA, for simulating spectra from accreting black holes. It traces photon packets ("superphotons") through accretion flows around black holes and models (i) synchrotron emission from hot electrons, (ii) photon propagation along geodesics in curved spacetime (Kerr metric), (iii) Compton scattering (Thomson and Klein-Nishina regimes), and (iv) spectral synthesis at different viewing angles.

Key features:

  • Over 10x faster than CPU-based igrmonty
  • Interfaces with multiple GRMHD codes (iharm3D, harm)
  • iharm3D HDF5 input for simulation data

Use cases:

  • Simulating spectra from AGN and X-ray binaries
  • Inferring black hole properties from electromagnetic observations
  • Testing accretion models against multi-wavelength data

Technology stack:

  • CUDA C++ for GPU-accelerated parallel photon tracking
  • C / OpenMP for host-side computation
  • Python for post-processing and visualization
  • Dependencies: CUDA Toolkit, GSL, HDF5, OpenMP

GPUmonty is based on igrmonty. Please refer to the documentation webpage for more details.

QUICKSTART - Personal Computer

Before proceeding, make sure you have an NVIDIA GPU with the required drivers, the CUDA Toolkit, the HDF5 library, and GSL installed.

  1. Compile (replace the number below with the number of CPU cores available):
cd gpumonty
make -j 24
  2. Download the test data from a GRMHD simulation:
curl -L --create-dirs -o './data/dump_SANE.h5' 'https://dataverse.harvard.edu/api/access/datafile/12137142'
  3. Run gpumonty:
./gpumonty -par template.par

You should now have a spectrum data file in output/sane_iharm.spec.

  4. Visualize the spectrum. You will need Python with the NumPy, Matplotlib, and Astropy libraries:
python python/example.py

If all goes well, you should now have an image in output/example.png showing the spectrum emitted by a hot SANE RIAF. If not, keep reading.

Installation Instructions

Prerequisites

Before compiling, ensure your system has the following libraries installed and accessible:

  • CUDA Toolkit
  • GSL
  • HDF5

Path to libraries

Locate the install paths for these libraries on your system and update the corresponding variables in the Makefile:

CUDA_PATH     = /usr/local/cuda
GSL_PATH      = /usr/local
HDF5_INCLUDE  = /usr/include/hdf5/serial
HDF5_LIB      = /usr/lib/x86_64-linux-gnu/hdf5/serial

Note

The Makefile automatically detects the compute capability of your GPU.

Compute capability refers to the CUDA architecture version of your GPU (e.g., sm_86 for Ampere), which determines which GPU instructions and optimizations are used during compilation.

To set it manually instead, set AUTO_CC ?= 0 in the Makefile and look up your GPU's compute capability on NVIDIA's website.
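
As an illustration of the naming convention only, the short sketch below converts a compute capability such as 8.6 into the matching sm_XY label. The function name arch_flag is hypothetical and is not part of GPUmonty; the Makefile's own detection logic may work differently. On reasonably recent drivers, the value itself can usually be read with nvidia-smi --query-gpu=compute_cap --format=csv,noheader.

```python
def arch_flag(compute_capability: str) -> str:
    """Turn a compute capability like '8.6' into a label like 'sm_86'.

    Hypothetical helper for illustration; not part of GPUmonty.
    """
    major, minor = compute_capability.strip().split(".")
    return f"sm_{int(major)}{int(minor)}"


print(arch_flag("8.6"))  # sm_86, the Ampere example mentioned above
```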

Multi-Core Acceleration (OpenMP)

GPUmonty benefits from OpenMP for CPU-bound tasks such as data pre-processing and grid initialization. To enable multi-threaded CPU execution, add the following line to your .bashrc file:

export OMP_NUM_THREADS=XX

Replace XX with the desired number of threads. It is recommended to set this value equal to the number of physical cores on your CPU.
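
If you are unsure what value to use, the sketch below builds the export line from a thread count. It is only a convenience example: the helper name omp_export_line is hypothetical, and the fallback uses os.cpu_count(), which reports logical CPUs and may therefore exceed the number of physical cores on machines with simultaneous multithreading enabled.

```python
import os


def omp_export_line(n_threads=None):
    """Return the shell line that sets OMP_NUM_THREADS.

    Hypothetical helper. When n_threads is omitted it falls back to
    os.cpu_count(), which counts logical CPUs (an assumption made here
    for simplicity; it may overcount physical cores).
    """
    if n_threads is None:
        n_threads = os.cpu_count() or 1
    if n_threads < 1:
        raise ValueError("n_threads must be a positive integer")
    return f"export OMP_NUM_THREADS={n_threads}"


print(omp_export_line(8))  # export OMP_NUM_THREADS=8
```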

Compile

After configuring the settings above, compile with

make -j 15

where the number is the number of parallel build jobs, typically the number of CPU cores available. To compile a debug build instead:

make BUILD_TYPE=debug

CUDA Number of Blocks Configuration

The build system includes an auto-tuning feature that detects the hardware specifications of your GPU (specifically Device 0).

During compilation, the Makefile triggers a probe (defined in GetGPUBlocks.mk) that calculates the optimal number of blocks based on the GPU's multiprocessor count and blocks-per-multiprocessor limit. This process automatically updates the N_BLOCKS definition located in src/config.h. By default, this feature is enabled. If you wish to manually set N_BLOCKS to a fixed value in the config file, you can disable the auto-tuner by setting the GPU_TUNING flag to 0:

make GPU_TUNING=0
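
If you do set N_BLOCKS by hand, the sketch below shows one plausible way to pick a starting value from the two quantities the probe is described as using. It assumes the result is simply their product; the actual formula in GetGPUBlocks.mk is not reproduced here and may differ. The function name and the example numbers are hypothetical.

```python
def estimate_n_blocks(multiprocessor_count: int, blocks_per_multiprocessor: int) -> int:
    """Estimate a block count as (number of SMs) x (resident blocks per SM).

    Assumption: a plain product. This is an illustration, not the
    calculation performed by GPUmonty's build-time probe.
    """
    if multiprocessor_count < 1 or blocks_per_multiprocessor < 1:
        raise ValueError("both arguments must be positive integers")
    return multiprocessor_count * blocks_per_multiprocessor


# Made-up GPU with 82 multiprocessors and 16 blocks per multiprocessor.
print(estimate_n_blocks(82, 16))  # 1312
```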

Warning

If you are running on a High Performance Computing (HPC) cluster, do not compile on the login/head node, as these nodes often lack GPUs or have different hardware than the compute nodes.

To ensure the auto-tuner detects the correct GPU architecture for your run, we recommend adding the compilation step directly inside your job submission script (e.g., Slurm or PBS script).

Simulation Setup

Simulation parameters are passed via a .par file. You can find a baseline configuration in template.par.

To run a simulation with custom parameters:

./gpumonty -par path/to/your_parameter_file.par

The following runtime parameters are supported:

| Parameter | Description |
| --- | --- |
| Ns | Superphoton count: the approximate total number of photon packets to generate. Higher values improve the signal-to-noise ratio of the resulting spectrum. |
| dump | Data path: the relative or absolute path to the input GRMHD data file. |
| spectrum | Output name: the filename for the output spectral data (e.g., sane.spec). |
| MBH | Black hole mass: mass of the central black hole in solar masses ($M_\odot$). |
| M_unit | Mass unit scale: the normalization factor (in grams) used to scale dimensionless GRMHD density to physical CGS units. |
| tp_over_te | Proton-to-electron temperature ratio: a constant ratio ($T_p/T_e$) used if a dynamic heating model is not active. |
| Thetae_max | Temperature ceiling: a numerical cap on the dimensionless electron temperature ($\Theta_e = k_B T_e / m_e c^2$). |
| scattering | Boolean: enable or disable scattering processes. |
| bremsstrahlung | Boolean: enable or disable bremsstrahlung emission. |
| synchrotron | Boolean: enable or disable synchrotron emission. |
| fit_bias | Boolean: enable or disable fit bias correction. |
| bias_guess | Bias guess: initial guess for the bias correction parameter. |
| bias_ratio | Bias target ratio: the target ratio used when tuning the bias correction. |
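
To get a feel for the scale of Thetae_max, the sketch below evaluates the definition $\Theta_e = k_B T_e / m_e c^2$ in CGS units and applies a ceiling. The function names are hypothetical, and treating the cap as a simple minimum is an assumption for illustration rather than a description of GPUmonty's internals.

```python
K_B = 1.380649e-16       # Boltzmann constant [erg / K]
M_E = 9.1093837015e-28   # electron mass [g]
C = 2.99792458e10        # speed of light [cm / s]


def thetae(t_e_kelvin: float) -> float:
    """Dimensionless electron temperature k_B T_e / (m_e c^2)."""
    return K_B * t_e_kelvin / (M_E * C**2)


def capped_thetae(t_e_kelvin: float, thetae_max: float) -> float:
    """Apply a ceiling to Theta_e (assumed here to be a plain minimum)."""
    return min(thetae(t_e_kelvin), thetae_max)


# Theta_e reaches 1 near T_e of about 5.93e9 K.
print(f"{thetae(5.93e9):.3f}")
print(capped_thetae(1.0e13, 1000.0))
```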

Data Analysis

To facilitate data post-processing and visualization, an example Jupyter Notebook is provided in the repository at python/example.ipynb. It contains a tutorial on how to process output files, extract spectra, and generate plots.

When analyzing the raw results in Python, note how luminosity relates to the observer's viewing angle: the luminosity array nuLnu is multi-dimensional, and each index along its viewing-angle axis corresponds to one of the theta_bins defined in your simulation.
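
As a rough sketch of that indexing, the example below maps a viewing angle to a bin index and selects the matching spectrum. It rests on assumptions not confirmed by this README: that the bins are uniform in polar angle between 0 and 90 degrees, that angles above 90 degrees are folded about the equatorial plane, and that the array is laid out as nuLnu[theta_bin][frequency_bin]. The helper name and the sample numbers are made up; check python/example.ipynb for the actual layout.

```python
def theta_bin_index(inclination_deg: float, n_theta_bins: int) -> int:
    """Map a viewing angle in degrees to a theta bin index.

    Assumes uniform bins over [0, 90] degrees with folding about the
    equatorial plane (an illustrative assumption, see the text above).
    """
    if not 0.0 <= inclination_deg <= 180.0:
        raise ValueError("inclination must lie in [0, 180] degrees")
    if n_theta_bins < 1:
        raise ValueError("n_theta_bins must be a positive integer")
    folded = min(inclination_deg, 180.0 - inclination_deg)
    width = 90.0 / n_theta_bins
    # Clamp so that exactly 90 degrees falls in the last bin.
    return min(int(folded / width), n_theta_bins - 1)


# Made-up data: 3 theta bins, 2 frequency bins each.
nuLnu = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
spectrum = nuLnu[theta_bin_index(60.0, 3)]
print(spectrum)  # [5.0, 6.0]
```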

GRMHD dump file for testing

To reproduce the tests using the same GRMHD data used in our paper, download this dump file from Prather et al. (2023). This can be done from the terminal with

curl -L -o './data/dump_SANE.h5' 'https://dataverse.harvard.edu/api/access/datafile/12137142'

or by downloading it manually from the website. This dump file corresponds to a snapshot from a SANE RIAF simulation around a black hole with $a_*=0.9375$. After downloading the dump file, place it in the data/ directory (the command above already does this for you) and run

./gpumonty -par template.par

Citation

Numerical methods and profiling results are described in detail in this paper:
GPUmonty: GPU-accelerated relativistic Monte Carlo radiative transfer code. arXiv: 2602.13198. Submitted to ApJ.

LICENSE

GPUmonty is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the LICENSE file or the GNU General Public License for more details.

About

A public, CUDA/C-based relativistic Monte Carlo radiative transfer code accelerated via GPUs. Designed for high-performance spectral modeling of accreting black holes, GPUmonty offloads photon tracking and scattering to achieve a considerable speedup over traditional CPU-based implementations like grmonty.
