Maintenance

Maintenance 28.09. - 30.09.2021

The following extensive changes have been performed:

  • A new Graphcore IPU-POD16 system has been installed as part of the FTP-X86 cluster.

  • The FTP-X86n2 node (Cascade Lake + 1x V100) has been converted into a login node, removing the login node role from the FTP-X86 head node. From now on, you have to use the DNS name ftp-x86-login.scc.kit.edu to log in to this cluster. Please note that your SSH client will likely show a warning because the IP address of a known server has changed.

  • The NVIDIA V100 GPUs have been removed from the FTP-X86n[1,2] nodes and put into the FTP-X86n[3,4] nodes, turning these two nodes into 2x GPU nodes.

  • The Infinity Fabric bridges necessary for fast inter-GPU communication have been installed in the FTP-X86n[5,6] nodes.

  • The FTP-A64 cluster has been configured to use the HoreKa file systems for $HOME and Workspaces, just like the FTP-X86 cluster already does. The data previously residing in /home on the FTP-A64 nodes is still available in the path /mnt/oldhomes/, so users can migrate it on their own.

  • The ROCm software stack has been updated to version 4.3.1.

  • The firmware of many components has been updated.
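
For the FTP-A64 home directory migration mentioned above, a minimal Python sketch of a non-destructive copy could look like the following. The function name `migrate_old_home` and its parameters are hypothetical, and the exact directory layout below /mnt/oldhomes/ is not stated here, so the source directory is left as an argument. Files that already exist in the new location are never overwritten.

```python
import os
import shutil
from pathlib import Path


def migrate_old_home(old_home, new_home):
    """Copy files from old_home into new_home without overwriting anything.

    Hypothetical helper: both paths are supplied by the caller.
    Returns the list of destination paths that were newly created.
    """
    old_home, new_home = Path(old_home), Path(new_home)
    copied = []
    # os.walk does not descend into symlinked directories by default.
    for root, _dirs, files in os.walk(old_home):
        for name in sorted(files):
            src = Path(root) / name
            dst = new_home / src.relative_to(old_home)
            if dst.exists() or dst.is_symlink():
                continue  # keep whatever is already present in the new home
            dst.parent.mkdir(parents=True, exist_ok=True)
            # copy2 preserves timestamps and permissions; file symlinks are
            # copied as symlinks rather than being followed.
            shutil.copy2(src, dst, follow_symlinks=False)
            copied.append(dst)
    return copied
```

Skipping existing files is a deliberate choice in this sketch: it makes the copy safe to repeat, at the cost of leaving conflicting files to be compared by hand.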

Maintenance 09.09.2021

The FTP-X86n[5,6] nodes are now equipped with significantly more powerful AMD EPYC 7543 "Milan" processors. The new CPUs have 32 instead of 16 cores per socket, so each node can execute a total of 128 threads. In addition, the new microarchitecture achieves up to 20% higher performance per core. The distribution of the four GPUs across the two CPU sockets in the nodes has also been optimized during the maintenance.

The batch system partition amd-rome-mi100 has been renamed to amd-milan-mi100 to reflect the upgrade.
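
Existing job scripts that still name the old partition need to be adjusted. The following is a minimal Python sketch of such a rewrite; the function name `update_partition` is hypothetical, and the `#SBATCH` directive in the usage note is only an assumption about what a job script might contain, since the batch system is not named here.

```python
import re

OLD_PARTITION = "amd-rome-mi100"
NEW_PARTITION = "amd-milan-mi100"

# Match the old name only as a whole token, so that longer names which merely
# contain it (for example a hypothetical "amd-rome-mi100-test") stay untouched.
_PATTERN = re.compile(r"(?<![\w-])" + re.escape(OLD_PARTITION) + r"(?![\w-])")


def update_partition(script_text):
    """Return script_text with the old partition name replaced by the new one."""
    return _PATTERN.sub(NEW_PARTITION, script_text)
```

Under these assumptions, a line such as `#SBATCH --partition=amd-rome-mi100` would be rewritten to `#SBATCH --partition=amd-milan-mi100`, while text that does not mention the old partition is returned unchanged.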


Last update: September 30, 2021