Building a supercomputer

Find out more about the complexities of launching the UK’s next national research supercomputer.

Over the past 12 months, ever since I learnt that Cray had been selected as the hardware provider for the new ARCHER2 supercomputer, I’ve been eagerly anticipating the day when we would open the service to users.

A lot has happened over this time: Cray has been bought by Hewlett Packard Enterprise (HPE), and we’ve all had to cope with the complexities of the COVID-19 pandemic. However, the good news for UK scientists is that the first part of the new national HPC service will shortly be opened to users.

In his blog post in July 2020, Professor Simon McIntosh-Smith explained what a supercomputer is, what the current national supercomputer, ARCHER, has achieved and what we expect ARCHER2 to be capable of. With the ARCHER2 service about to start up, I wanted to use this blog post to explain where we are and where we hope to get to over the next few months.

A long, complicated but successful summer

Building and installing a supercomputer is never an easy task, particularly a system as large and capable as ARCHER2. The move to a new underlying operating system for ARCHER2 and the COVID-19 pandemic have delayed its installation, but in mid-July the first four of the final 23 cabinets arrived at our data centre, the ACF, in Edinburgh.

Because of the installation delays, UK Research and Innovation’s Engineering and Physical Sciences Research Council (EPSRC) asked us to keep the ARCHER service running throughout this year.

Over the past six weeks we’ve been installing and configuring the initial ARCHER2 system in a separate room from the current ARCHER service, so that we can finally begin retiring the current service.

An autumn transition to ARCHER2

The initial 4-cabinet ARCHER2 system is now up and running, with HPE and EPCC staff working hard to configure it for the first users in the next few weeks. Although the system has only 131,072 cores, not many more than the 118,080 of the current ARCHER system, each of these cores is at least 1.5 times more powerful.

So, although ARCHER2 will open with around the same number of cores as ARCHER, I hope that users immediately see a difference in the quantity of scientific results the system will be able to produce.
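To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It uses only the core counts quoted above and assumes, purely for illustration, that "at least 1.5 times more powerful" can be treated as a flat per-core speed-up. Real applications will scale differently, so read the result as an indicative lower bound rather than a benchmark. The function name `relative_capacity` is my own.

```python
# Core counts quoted in this post.
ARCHER_CORES = 118_080
ARCHER2_INITIAL_CORES = 131_072  # initial 4-cabinet system

# Assumption: treat "at least 1.5 times more powerful" as a flat
# per-core speed-up. Real applications will vary.
PER_CORE_SPEEDUP = 1.5


def relative_capacity(new_cores, old_cores, per_core_speedup):
    """Return the new system's capacity as a multiple of the old system's."""
    return (new_cores / old_cores) * per_core_speedup


core_ratio = ARCHER2_INITIAL_CORES / ARCHER_CORES
capacity_ratio = relative_capacity(
    ARCHER2_INITIAL_CORES, ARCHER_CORES, PER_CORE_SPEEDUP
)

print(f"Core count ratio: {core_ratio:.2f}")
print(f"Estimated capacity ratio: {capacity_ratio:.2f}")
```

Under that simplifying assumption, the initial system has about 1.1 times as many cores as ARCHER and roughly 1.7 times its capacity.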

Once the 4-cabinet system is providing a service to users we’ll move into a very busy period. ARCHER will be turned off for the last time and dismantled. The final power and cooling preparations for the full system will be completed.

We’ve had to lay 107 new power cables, for example! The system will then be installed by HPE and gradually brought to life. We hope to do this in a way that doesn’t involve any long period of downtime, although there will of course be some interruptions to service.

The future is almost here

I always find this period of building a new service really exciting. These are very large, complex computing systems and not everything will work first time. I think of our supercomputers like Formula 1 cars: they need a pit crew just to get them started, but once they’re going, they’re phenomenally powerful.

Over the winter, we hope that the full ARCHER2 system will be brought into service and start delivering all the scientific benefits we’ve been looking forward to.
