NHR@KIT User Documentation

GPU Offloading

The accelerated partition of the HoreKa cluster consists of nodes that draw their computing power mainly from the installed GPUs. To leverage these devices, the computing operations must be transferred from the host CPU to the GPUs, which is referred to as offloading.
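As a rough illustration of what offloading looks like in source code, the following minimal sketch uses an OpenMP `target` directive in C. It assumes OpenMP target offloading is one of the available approaches, which this page does not confirm; the function name `saxpy_offload` is hypothetical and chosen only for this example.

```c
#include <stddef.h>

/*
 * Hypothetical example: computes y[i] = a * x[i] + y[i].
 * With an offloading-capable OpenMP compiler, the loop runs on the
 * default device (typically a GPU). If OpenMP is not enabled, the
 * pragma is ignored and the loop runs on the host CPU.
 */
void saxpy_offload(size_t n, float a, const float *x, float *y)
{
    /* map(to:) copies x to the device; map(tofrom:) copies y to the
       device before the loop and back to the host afterwards. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The host code calls `saxpy_offload` like any other function; the data transfers and the execution on the device are described entirely by the directive. Which compiler flags enable offloading depends on the compiler in use.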

In recent years, four main programming approaches to offloading have emerged. The following sections show how they can be used on HoreKa.

For the compilers to apply the appropriate optimizations, it is recommended to compile directly on the target architecture. In practice, this usually means compiling programs that use GPU offloading on compute nodes equipped with GPUs. On the HoreKa cluster, you can start an interactive job on the GPU development partition for this purpose. The following command requests one GPU for 60 minutes:

$ salloc -p <partition_name> --gres=gpu:1 -t 60