
Tuesday, October 04, 2022

On Supercomputing Today

With all the talk about quantum computing, why supercompute? Some background.

There is Plenty of Room at the Top (of Supercomputing), by Bennie Mols

Commissioned by CACM Staff, October 4, 2022

Supercomputers are the Olympic champions of scientific computing. Through numerical simulations, they enrich our understanding of the world, be it stars light-years away in the universe, the Earth's weather and climate, or the functioning of the human body.

For over four decades, Jack Dongarra has been a driving force in the field of high-performance computing. Earlier this year, Dongarra was awarded the 2021 ACM A.M. Turing Award for "his pioneering contributions to numerical algorithms and libraries that enabled high performance computational software to keep pace with exponential hardware improvements for over four decades."

Writer Bennie Mols met with Dongarra during the 9th Heidelberg Laureate Forum in Germany in September to talk about the present and future of high-performance computing. Dongarra, now 72, has been a University Distinguished Professor at the University of Tennessee (U.S.) and a Distinguished Research Staff Member at the U.S. Department of Energy's Oak Ridge National Laboratory since 1989.

Over the decades, what has been your driving force in your scientific research?

My background is in mathematics, especially in numerical linear algebra; all of my work stems from that. For many problems in computational science, such as in physics and chemistry, you need to solve systems of linear equations, so having software that can do that is important. You have to make sure the software matches the machine architecture, so that you actually get the high performance the machine is capable of.
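To make that concrete: NumPy's dense solver delegates to LAPACK, one of the linear-algebra libraries Dongarra helped build. A minimal Python sketch of solving a system Ax = b this way (the library choice and problem size are illustrative, not from the interview):

import numpy as np

rng = np.random.default_rng(0)
n = 1000
A = rng.standard_normal((n, n))   # dense coefficient matrix
b = rng.standard_normal(n)        # right-hand side

# np.linalg.solve calls LAPACK's gesv: LU factorization with
# partial pivoting, then forward/back substitution.
x = np.linalg.solve(A, b)

# For a well-conditioned system, the relative residual should be
# close to 64-bit machine epsilon (~1e-16).
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))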

What are the most important requirements for software that runs on a supercomputer?

We want the software to be accurate. We want the scientific community to use and understand the software, and maybe even contribute improvements. We want the software to perform well and to be portable across different machines. We want the code to be readable and reliable, and finally, we want the software to enhance the productivity of the person using it.

Developing software that meets all these requirements is a non-trivial process. We are talking about millions of lines of code, and roughly every 10 years we see some major change in machine architecture. That forces a refactoring of the algorithms we have, and of the software that embodies those algorithms. The software follows the hardware, and there is still plenty of room at the top of supercomputing to get to better-performing machines.

What is a current development in high-performance computing that excites you?

High-performance supercomputers are built from commodity parts, let's say the high-end chips that you and I can also buy, just many more of them, and typically with some accelerators, in the form of GPUs, on top. We have boards of multiple chips, we put them in a rack, and many of these racks together form a supercomputer. We use commodity parts because it is cheaper, but if you were to design chips specifically for scientific computations, you would get supercomputers that perform much better, and that is an exciting idea.
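To picture the accelerator model in code, here is a minimal sketch using CuPy (the library is our assumption; the interview names no tooling): the host CPU stages the data, the GPU runs the heavy numerical kernel, and the result is copied back.

import numpy as np
import cupy as cp  # assumes a CUDA-capable GPU is available

A_host = np.random.standard_normal((4096, 4096))  # data lives in host (CPU) memory

A_dev = cp.asarray(A_host)   # copy the matrix to GPU memory
B_dev = A_dev @ A_dev        # the matrix multiply runs on the accelerator
B_host = cp.asnumpy(B_dev)   # copy the result back to the host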

Actually, this is exactly what companies like Amazon, Facebook, Google, Microsoft, Tencent, Baidu, and Alibaba are doing; they are making their own chips. They can do this because they have enormous funding. Universities are always limited in funding, and therefore they unfortunately have to make do with commodity parts. This is related to one of my other worries: how do we keep talent in the scientific areas, rather than see them go to work for big companies that pay much better?

What are other important developments for the future of high-performance computing?

There are a number of important things. It is clear that machine learning is already having an important impact on scientific computing, and this impact will only grow. I see machine learning as a tool that helps to solve the problems that computational scientists want to solve.

This goes together with another important development. Traditionally, our hardware uses 64-bit floating-point operations, so we represent numbers in 64 bits. But you can speed up the computations if you use fewer bits, say 32, 16, or even 8 bits. You gain speed, but you lose precision. Yet it looks like AI calculations can often make do with fewer bits, 16 or even 8. It is an area of investigation to find out where this works well and where it does not.
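A small sketch of that trade-off, again in NumPy (our construction, not from the article): the same linear system solved in 64-bit and 32-bit floating point. The 32-bit solve moves half the data and is typically faster, but the answer carries roughly single-precision error.

import numpy as np

rng = np.random.default_rng(1)
n = 2000
A64 = rng.standard_normal((n, n))
b64 = rng.standard_normal(n)
A32, b32 = A64.astype(np.float32), b64.astype(np.float32)

x64 = np.linalg.solve(A64, b64)  # LAPACK dgesv, unit roundoff ~1e-16
x32 = np.linalg.solve(A32, b32)  # LAPACK sgesv, unit roundoff ~6e-8

err = np.linalg.norm(x32.astype(np.float64) - x64) / np.linalg.norm(x64)
print(f"relative error of the 32-bit solve: {err:.2e}")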

Another area of investigation is about how you can start with a low-precision computation, get an approximation, and then later use higher-precision computation to refine the outcome. … (more at link)
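One classical shape of that idea is mixed-precision iterative refinement: factorize the matrix once in cheap low precision, then repeatedly correct the solution using residuals computed in high precision. A hedged NumPy/SciPy sketch (our construction, assuming the float32 factorization is the "fast" step):

import numpy as np
import scipy.linalg as sla

def refined_solve(A, b, iters=5):
    # Factorize once in 32-bit: the expensive O(n^3) step runs fast.
    lu, piv = sla.lu_factor(A.astype(np.float32))
    # Initial low-precision solution, promoted to 64 bits.
    x = sla.lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                                      # residual in full 64-bit precision
        d = sla.lu_solve((lu, piv), r.astype(np.float32))  # cheap 32-bit correction
        x += d.astype(np.float64)
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 1000))
b = rng.standard_normal(1000)
x = refined_solve(A, b)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))  # approaches 64-bit accuracy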
