Maintenance 19.04. - 26.04.2022
The following changes have been performed during the maintenance:
The firmware of all components has been upgraded.
The operating system version is now based on Red Hat Enterprise Linux (RHEL) 8.4. We recommend recompiling all applications after the upgrade.
The Mellanox OFED InfiniBand stack has been upgraded.
The obsolete Intel compiler version 18.0 has been removed. The officially supported Intel compiler versions are now 19.0, 19.1 and 2021.4.0 (oneAPI).
LLVM version 14 was added. Older LLVM modules have been removed.
OpenMPI 4.0 and 4.1 have been updated to the latest patch level. OpenMPI 3.0 has been removed.
Many software modules have been updated and built against the new compiler and MPI versions.
The system Python version 3.9 was added. If no other Python module is loaded, the command python3.9 defaults to version 3.9.2, the command python3.8 defaults to version 3.8.6, the commands python3 and python default to version 3.6.8 and the command python2 defaults to version 2.7.18.
The hpc-workspace tools have been updated to version 1.3.7.
The Lmod module system has been upgraded.
cmake 3.23 has been added.
Slurm has been upgraded to version 21.08.7.
HKFS Storage: new controller firmware
The Spectrum Scale, Lustre and BeeGFS file system clients were updated.
The NVIDIA driver was upgraded to version 510.47.03. CUDA version 11.6 has been added.
Enroot has been updated to 3.4.0.
Singularity has been updated to 3.8.7.
JupyterHub has been upgraded to version 2.2.2.
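The Python command defaults listed above can be checked on a login node with a short script. The following is only a minimal sketch: the helper names `parse_version` and `report_versions` are hypothetical, and the script assumes that each interpreter answers `--version` with a line of the form `Python X.Y.Z`. It avoids newer `subprocess` options so that it also runs with the default system Python 3.6.8.

```python
import shutil
import subprocess

# Command names taken from the maintenance note above.
COMMANDS = ["python", "python2", "python3", "python3.8", "python3.9"]


def parse_version(output):
    """Return '3.9.2' for the text 'Python 3.9.2', or None if it does not match."""
    parts = output.strip().split()
    if len(parts) >= 2 and parts[0] == "Python":
        return parts[1]
    return None


def report_versions(commands=COMMANDS):
    """Map each command name to the version it reports, or None if not on PATH."""
    versions = {}
    for cmd in commands:
        path = shutil.which(cmd)
        if path is None:
            versions[cmd] = None
            continue
        # PIPE/universal_newlines instead of capture_output/text keeps this
        # usable with Python 3.6.
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        # Python 2 prints its version to stderr, Python 3 to stdout.
        versions[cmd] = parse_version(result.stdout or result.stderr)
    return versions
```

Calling `report_versions()` returns a dictionary that can be printed entry by entry. If a Python module is loaded, the reported versions may differ from the system defaults described above.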
29.11.2021 Change to Memory Management settings
On 29.11.2021 we will change two operating system settings affecting memory management on HoreKa. We expect that these changes can make your applications run faster.
Enablement of "Transparent Huge Pages"
The size of memory pages will be increased. This decreases the memory management overhead for many memory access patterns, resulting in a speedup.
More information can be found here. The value will be set to "always".
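To see which Transparent Huge Pages mode is active on a node, the setting can be read from sysfs. The sketch below assumes the standard Linux location `/sys/kernel/mm/transparent_hugepage/enabled`, whose content lists all modes with the active one in square brackets (for example `[always] madvise never`). The function names are hypothetical.

```python
from pathlib import Path

# Standard Linux sysfs location of the THP setting (assumed to be readable).
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the mode enclosed in square brackets, or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_thp_mode(path=THP_PATH):
    """Read the active THP mode from sysfs; return None if the file is unavailable."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None
```

With the setting described above, `read_thp_mode()` would be expected to return `"always"`.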
Activation of "zone_reclaim_mode"
The kernel now tries harder than before to allocate new memory pages in the NUMA domain (CPU socket) on which the associated process is running. This avoids slower accesses to memory attached to other CPU sockets.
More information can be found here. The value will be set to "1" (Zone reclaim on).
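The value of `zone_reclaim_mode` is a bit mask, so "1" means that only the first flag is set. As a rough illustration, the sketch below decodes a value using the flag meanings from the Linux kernel's `vm` sysctl documentation and reads the current value from `/proc/sys/vm/zone_reclaim_mode`. The function names are hypothetical.

```python
from pathlib import Path

# Standard procfs location of the sysctl (assumed to be readable).
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")

# Bit flags as described in the kernel's vm sysctl documentation.
FLAGS = [
    (1, "zone reclaim on"),
    (2, "zone reclaim writes dirty pages out"),
    (4, "zone reclaim swaps pages"),
]


def decode_zone_reclaim_mode(value):
    """Return the descriptions of all flags set in the bit mask."""
    return [name for bit, name in FLAGS if value & bit]


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Read the current value as an integer; return None if it is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None
```

For the value "1" described above, `decode_zone_reclaim_mode(1)` yields only the "zone reclaim on" entry.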
Maintenance 12.07. - 16.07.2021
From July 12th 9:00 am until July 16th noon no compute nodes will be available on HoreKa and HAICORE, so no jobs will run. Additionally, individual login nodes will be unavailable for some time during this interval, which will also affect the Jupyter and CI services.
HoreKa will go into full operation on 01.06.2021 as planned. To perform the last preparatory steps, a planned maintenance interval has taken place on
Friday, 28.05.2021 between 08:00 and 15:00
The login nodes have been reinstalled and restarted. Running jobs have not been aborted. Waiting jobs have started automatically after maintenance was completed. Data on the parallel file systems was preserved; data stored locally on the nodes (e.g. in /tmp) has been deleted.
Please note that the following major changes have taken place:
The configuration of the batch system partitions has been adjusted. In particular, the memory limit for some node types in the queue "cpuonly" has been reduced to 1600 MB per CPU. Please check the updated information and adjust your job scripts if necessary.
The Python modules devel/python/3.6 have been renamed to devel/python/3.6_intel. Please note that you do not need to load a module to use the default system Python version 3.6.8.
Python version 3.8 has been made available in addition to the default system version 3.6.8. You can either use the pip3.8 commands to explicitly request this version, or you can load the devel/python/3.8 module, which will override the