RRTMGP CUDA kernels slow when called from ClimaAtmos #3742

@sriharshakandala

Description

RRTMGP CUDA kernels take much longer when called from ClimaAtmos compared to the corresponding RRTMGP benchmark for an equivalent problem size.

The ClimaAtmos and RRTMGP builds linked below run the equivalent all-sky problem with aerosols. The number of aerosols is reduced to two to match the RRTMGP benchmark.

ClimaAtmos PR: #3740

ClimaAtmos build: gpu_aquaplanet_dyamond - strong scaling - 1 GPU - No MPI @ https://buildkite.com/clima/climaatmos-target-gpu-simulations/builds/414#0195ce32-92da-478d-9f29-282e94952871

RRTMGP build: GPU all-sky with aerosols DYAMOND benchmark @ https://buildkite.com/clima/rrtmgp-clima-a100-pipeline/builds/46#_

In this example, the longwave and shortwave CUDA kernels called from ClimaAtmos take about 1.5x and 1.6x, respectively, the time taken by the full longwave and shortwave solvers in RRTMGP.
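For anyone reproducing the comparison, here is a minimal sketch of how such a ratio could be computed. It assumes the figures are plain quotients of wall-clock times taken from the two builds' profiler output. The function name `slowdown` and the timings in the example are hypothetical placeholders, not values from the linked builds.

```python
def slowdown(embedded_time: float, standalone_time: float) -> float:
    """Ratio of a kernel's time inside the host model to its standalone time.

    Both arguments must be in the same unit (e.g. milliseconds).
    A result above 1.0 means the embedded call is slower.
    """
    if standalone_time <= 0:
        raise ValueError("standalone_time must be positive")
    return embedded_time / standalone_time


# Hypothetical placeholder timings in milliseconds, NOT measurements
# from the ClimaAtmos or RRTMGP builds referenced in this issue.
example_ratio = slowdown(3.0, 2.0)
print(f"{example_ratio:.1f}x")
```

Since the comparison here is ClimaAtmos kernel time against the full RRTMGP solver time, the same helper would be called once with the longwave pair of timings and once with the shortwave pair.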
