Container Standards in HPC

Introduction

As containers become widely used in HPC, a common NHR strategy is needed to standardize container usage for both users and administrators, within NHR and beyond. This project extends the solutions developed in the 2022 NHR container project "Container & Container Management", with the goal of improving user experience and applicability to a broader range of use cases. Our methods for creating and deploying portable, performance-oriented containers will be enhanced by automated security checks and integrated performance monitoring. Our container repository will receive new features that facilitate integration into web-based documentation platforms. Cloud bursting will be extended to allow cross-site resource usage between NHR centers. NHR service provision via containers will become more portable between centers.

Figure: Components of the project

Project Leader: Christian Boehme, NHR@Göttingen

Project Partners:
RWTH Aachen: Sascha Bücken
Zuse-Institut Berlin: Tobias Watermann
KIT: Jennifer Buchmüller
TU Darmstadt: Thorsten Reimann
Goethe University Frankfurt: Sarah Neuwirth
TU Dresden/ZIH: Ulf Markwardt

Participating NHR Centers:
NHR@Göttingen
NHR@KIT
NHR@ZIB
NHR@TUD
NHR4CES@RWTH
NHR4CES@TUDa
NHR Süd-West

Software/Library: community software / tools

 

Project description

The installation and configuration of applications in HPC depend heavily on the actual software and hardware environment that users find on the HPC systems of the data centers (OS distribution, libraries, modules, network, architectures). This usually complicates the provisioning of new software or new software versions, as well as migration between data centers, and both typically entail additional support effort. Furthermore, when moving to cloud resources (for instance, to offer applications as on-demand services) or to users' own hardware (for small tests and development), the HPC software environment must be reproduced manually. This project aims to mitigate these issues and offer a solution for users, while also accounting for possible performance degradation when the hardware used to build the software in a container differs from the hardware on which it runs. Apart from user containers, many HPC services could be offered to users through container management solutions and be migrated or duplicated between HPC centers. This, too, requires some standardization.