OpenMP Offloading

taflynn edited this page Nov 20, 2024 · 1 revision

Overview

This (example) project benchmarked GPU OpenMP offloading performance using a variety of existing benchmarks, compilers, compute systems, and deployment methods, each described below.

Benchmarks

This work used six benchmarking applications, all of which support OpenMP offloading; some also offer CUDA and HIP versions.

A GPU port of the STREAM benchmark

Neutron scattering kernels

Neutron scattering kernels

SU(3) kernels from a QCD code

Kernels from the BUDE protein simulation code

Kernels from the QMCPACK quantum Monte Carlo code
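To give a flavour of what these benchmarks measure, the following is a minimal sketch of a STREAM-style triad kernel written with OpenMP offloading. It is not taken from any of the benchmarks listed above; the function names `triad` and `triad_bytes` are hypothetical and chosen only for illustration. The sketch assumes the arrays live in host memory and are mapped to the device on every call, which a real benchmark would normally avoid by keeping the data resident on the GPU.

```c
#include <stddef.h>

/* STREAM-style triad: a[i] = b[i] + scalar * c[i].
 * With an offload-capable compiler the loop is distributed across the GPU.
 * If OpenMP is not enabled, the pragma is ignored and the loop runs
 * serially on the host, so the result is the same either way. */
void triad(double *a, const double *b, const double *c, double scalar, size_t n)
{
    /* b and c are copied to the device; a is copied back afterwards */
    #pragma omp target teams distribute parallel for \
        map(from: a[0:n]) map(to: b[0:n], c[0:n])
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* Bytes moved by one triad call: two reads and one write per element.
 * Dividing this by the measured kernel time gives an effective bandwidth. */
double triad_bytes(size_t n)
{
    return 3.0 * sizeof(double) * (double)n;
}
```

Calling `triad(a, b, c, 2.0, n)` on three arrays of length `n` and timing the call gives a rough bandwidth figure via `triad_bytes(n) / seconds`. Because this sketch maps the arrays inside the timed call, such a figure would include host-to-device transfer time.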

Compilers

ARCHER2 - AMD

ARCHER2 is a Cray-built system and so uses the Cray programming environment. Together with its AMD architecture, this means it currently has two OpenMP-offload-enabled compilers:

  • AMD
  • Cray

Cirrus - Nvidia

Cirrus has several Nvidia V100 GPU nodes, so the OpenMP-offload-enabled compilers are:

  • Nvidia compilers (nvc, nvc++, nvfortran) - supplied via the Nvidia HPC-SDK (Software Development Kit)
  • GCC compilers

EIDF - Nvidia

  • Nvidia compilers (nvc, nvc++, nvfortran) - supplied via the Nvidia HPC-SDK (Software Development Kit) container image
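Whichever of the compilers above is used, it is worth confirming that a target region really executes on the GPU, since many OpenMP runtimes fall back to the host when no device is available. The following is a minimal sketch of such a check using the standard OpenMP runtime routines `omp_get_num_devices` and `omp_is_initial_device`. The function names `visible_devices` and `ran_on_device` are hypothetical. Compiler flags are not shown in the source; as an assumption to verify against each system's documentation, offloading is typically enabled with options such as `-mp=gpu` for nvc and `-fopenmp -foffload=nvptx-none` for GCC.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Number of offload devices the OpenMP runtime can see.
 * Returns 0 when the code is built without OpenMP. */
int visible_devices(void)
{
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

/* Returns 1 if a target region executed on an offload device,
 * or 0 if it fell back to the host (or OpenMP is disabled). */
int ran_on_device(void)
{
    int on_host = 1;
#ifdef _OPENMP
    /* on_host is written inside the target region and copied back */
    #pragma omp target map(tofrom: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return !on_host;
}
```

A benchmark driver could call `ran_on_device()` once at start-up and abort if it returns 0, so that host-only timings are not mistaken for GPU results.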

Compute systems

This project relied on two traditional HPC systems, Cirrus and ARCHER2, and a cloud platform, the Edinburgh International Data Facility (EIDF).

ARCHER2: the UK national supercomputing service.

Cirrus: an older Tier-2 system, which has a substantial number of Nvidia V100 GPUs.

EIDF: a cloud platform focused on machine learning applications, with a GPU service that consists of both Nvidia A100 and H100 GPUs.

Deployment

Slurm: one of the traditional schedulers for submitting jobs to HPC systems.

Kubernetes is a system for managing cloud resources and deploying containerised applications. It has a reputation for a steep learning curve; however, useful getting-started examples are available for the EIDF GPU service.
