Summit Supercomputer

Summit, or OLCF-4, is a supercomputer developed by IBM for use at Oak Ridge National Laboratory. As of November 2018 it is the fastest supercomputer in the world, with a theoretical peak of 200 petaflops and a measured LINPACK benchmark result of 148.6 petaflops. As of the same ranking it is also the third most energy-efficient supercomputer in the world, with a measured power efficiency of 14.668 gigaflops per watt. Summit is the first supercomputer to reach exaop speed (a quintillion operations per second), achieving 1.88 exaops during a genomic analysis, and it is expected to reach 3.3 exaops using mixed-precision calculations.
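
Taken together, the benchmark and efficiency figures imply Summit's measured power draw during the LINPACK run: 148.6 × 10^15 flops ÷ 14.668 × 10^9 flops per watt ≈ 1.0 × 10^7 W, or roughly 10 megawatts. (This is a back-of-the-envelope cross-check derived from the two quoted numbers, not an independently reported measurement.)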

In November 2014, the United States Department of Energy awarded a $325 million contract to IBM, Nvidia and Mellanox. The effort resulted in the construction of two supercomputers, Summit and Sierra. Summit is tasked with civilian scientific research and is located at Oak Ridge National Laboratory in Tennessee; Sierra is designed for nuclear weapons simulations and is located at Lawrence Livermore National Laboratory in California. Summit is estimated to occupy the space of two basketball courts and to require 136 miles of cabling. Researchers use Summit in fields as diverse as cosmology, medicine and climatology.

In 2015, the Collaboration of Oak Ridge, Argonne and Lawrence Livermore (CORAL) project included a third supercomputer, Aurora, planned for installation at Argonne National Laboratory. By 2018, Aurora had been re-engineered as an exascale system, with completion anticipated in 2021, to be followed shortly thereafter by the Frontier and El Capitan exascale machines.

Each of Summit's 4,608 nodes (9,216 IBM POWER9 CPUs and 27,648 Nvidia Tesla V100 GPUs across the machine) has over 600 GB of coherent memory (6 × 16 GB = 96 GB of HBM2 plus 2 × 8 × 32 GB = 512 GB of DDR4 SDRAM) addressable by all of the node's CPUs and GPUs, plus 800 GB of non-volatile RAM that can be used as a burst buffer or as extended memory. The POWER9 CPUs and Volta GPUs within a node are connected using Nvidia's high-speed NVLink, which allows for a heterogeneous computing model. To provide a high rate of data throughput, the nodes are connected in a non-blocking fat-tree topology using a dual-rail Mellanox EDR InfiniBand interconnect for both storage and inter-process communication traffic, which delivers 200 Gb/s of bandwidth between nodes as well as in-network computing acceleration for communication frameworks such as MPI and SHMEM/PGAS.
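
Because every CPU and GPU in a node can address the same coherent memory, code written for such systems can use a unified-memory programming style in place of explicit host-to-device copies. The sketch below is a minimal illustration of that model using standard CUDA managed memory; it is generic CUDA rather than Summit-specific code, and the kernel and variable names are invented for the example.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical kernel for illustration: double every element in place.
    __global__ void scale(double *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= 2.0;
    }

    int main() {
        const int n = 1 << 20;
        double *x = nullptr;

        // A single allocation visible to both the CPU and the GPU.
        // On NVLink-coupled POWER9/Volta nodes this memory is kept
        // coherent in hardware; no explicit cudaMemcpy is required.
        cudaMallocManaged(&x, n * sizeof(double));

        for (int i = 0; i < n; i++) x[i] = 1.0;  // CPU writes the data...
        scale<<<(n + 255) / 256, 256>>>(x, n);   // ...the GPU updates it in place...
        cudaDeviceSynchronize();
        printf("x[0] = %f\n", x[0]);             // ...and the CPU reads back 2.0.

        cudaFree(x);
        return 0;
    }

On NVLink-connected POWER9 systems the CPU and GPU share such allocations coherently in hardware, while on other platforms the CUDA runtime migrates pages on demand; the source code is the same in either case.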