Birth: 18 July, 1950.

Education: Chicago State University, B.S. (mathematics), 1972; Illinois Institute of Technology, M.S. (computer science), 1973; University of New Mexico, Ph.D. (mathematics), 1980.

Experience: Argonne National Laboratory: Resident Student Associate, 1973; Research Associate, 1974; Assistant Computer Scientist 1975-80; Senior Computer Scientist 1980-1989. University of Tennessee: Distinguished Professor, 1989-2000; University Distinguished Professor 2000-2022; Professor Emeritus 2022-present. Oak Ridge National Laboratory: Distinguished Scientist, 1989-2000; Distinguished Research Participant 2000-present. Texas A&M University: Fellow of Institute for Advanced Study 2014–2018. Rice University: Adjunct Professor of Computer Science, 2002-present. Manchester University: Turing Fellow, 2007-present.

Honors and Awards (selected): Fellow of American Association for the Advancement of Science (1995); Fellow of the IEEE (1999); Elected to the National Academy of Engineering (2001); Fellow of the ACM (2001); IEEE Sid Fernbach Award (2003); IEEE Medal of Excellence in Scalable Computing (2008); SIAM Activity Group on Supercomputing Career Prize; IEEE Computer Society Charles Babbage Award (2011); ACM/IEEE Ken Kennedy Award (2013); SIAM/ACM Prize in Computational Science (2019); Foreign Fellow of the Royal Society (2019); IEEE Computer Pioneer Award (2020); ACM A.M. Turing Award (2021).

United States – 2021

CITATION

For his pioneering contributions to numerical algorithms and libraries that enabled high performance computational software to keep pace with exponential hardware improvements for over four decades

Jack J. Dongarra was born in Chicago in 1950 to a family of Sicilian immigrants. He remembers himself as an undistinguished student, burdened by undiagnosed dyslexia. Only in high school did he begin to connect his science classes with his love of taking machines apart and tinkering with them. Dongarra majored in mathematics at Chicago State University, thinking that this would combine well with education courses to equip him for a high school teaching career. The first person in his family to go to college, he lived at home and worked in a pizza restaurant to cover costs.[i]

**EISPACK**

In 1972 Dongarra began an undergraduate internship at nearby Argonne National Laboratory. He was supervised by Brian Smith, a young researcher whose primary concern at the time was EISPACK. Matrix eigenvalue calculation is crucial in many areas of scientific computing and modeling, from mechanics and geology to Google’s PageRank algorithm. James H. Wilkinson had recently received a Turing Award for his work with Christian H. Reinsch to develop new matrix methods optimized for electronic computers. But the new methods were complex and mathematically subtle, so most programmers still reached for methods given in their old college textbooks. EISPACK was a project to produce accurate, robust, efficient, well-documented and portable FORTRAN subroutines that implemented the new methods for matrix eigenvalue and eigenvector calculation. This code could be called as needed from the programs written by scientists and engineers to solve problems.

Dongarra was pressed into service running test problems through EISPACK and reporting errors to its developers. The experience hooked Dongarra on computing and kindled an interest in numerical methods. Setting aside plans for a master’s degree in physics, he made a late application to the Illinois Institute of Technology for its computer science program. He continued to work at Argonne one day a week with Brian Smith as the supervisor of his master’s thesis.

EISPACK was released in 1974. By then Dongarra was working full time at Argonne. As he recalls his experience there: “From my first time there as an undergraduate until I left it was a very, very positive experience …. The work environment was terrific, very rich and stimulating.” However, he soon realized that he would need a doctoral degree to progress in his new career. He took a leave of absence to enroll in the mathematics Ph.D. program of the University of New Mexico, working with Cleve Moler. While studying in New Mexico Dongarra worked part time to adapt existing methods and algorithms to function effectively on the novel vector architecture of Los Alamos’s Cray 1 computer.

**LINPACK**

EISPACK was followed by LINPACK, for the complementary activity of solving linear equations and linear least squares problems. The project was initiated by G.W. “Pete” Stewart, who worked with Cleve Moler and Jim Bunch, three academics with connections to the Argonne community. Dongarra’s primary initial responsibility was for developing a framework to test the package and its component parts. The design team reconvened in Argonne each summer to test and integrate code produced over the academic year. By the end of the project, according to Stewart, “Jack was showing exceptional promise” which convinced them to “give him a leg up and put him as the lead author.”[ii]

LINPACK quickly established itself as one of the most important mathematical software packages of its era. It proved itself on practically every computer model used by scientists and engineers during the 1980s, from supercomputers to Unix workstations and IBM personal computers. This remarkable flexibility came in large part from its use of the BLAS (Basic Linear Algebra Subprograms). The idea of BLAS originated around 1972 with Charles Lawson and Richard Hanson of NASA’s Jet Propulsion Laboratory. The BLAS specifications defined a set of simple vector operations, each with a standard calling sequence. LINPACK was the first major package to commit to BLAS and the two projects developed a symbiotic relationship.

LINPACK came with an extensive user guide. “In the appendix to that book,” recalls Dongarra, “I collected some timing information about various machines. The idea was to give users a handle on how much time it would take to solve their problem if they used our software.” For the sample problem he chose the largest matrix that could be processed on all of the dozen or so machines on the list. This list of performance times grew and took on an independent life as the standard measure of a machine’s floating-point performance. Computer vendors proudly quoted benchmark scores and tweaked their wares to optimize performance. The list grew to encompass hundreds of results, then thousands. Even today, long after LINPACK was superseded for all other purposes, the benchmark is still used by the TOP500 project to create a widely quoted supercomputer performance league table.

**Netlib**

One of Dongarra’s other projects during the 1980s was the creation of the mathematical software library NETLIB. Among the friends Dongarra made during a semester at Stanford as a graduate student were eminent numerical analyst Gene Golub and a young student called Eric Grosse. Golub suggested the idea of an electronic repository of software to ensure that code written by students as part of their research could be found and reused by others. Around 1985 Dongarra and Grosse, by then working at Bell Labs, decided to make the idea into a reality. In its early versions NETLIB worked over email – a program would extract requests from incoming email and reply with coded messages from which the requested code could be reassembled.[iii] Grosse and Dongarra served as editors, deciding which software was worthy of inclusion. Code from many existing collections, such as routines from the ACM’s *Transactions on Mathematical Software*, was eventually merged into NETLIB.

**University of Tennessee**

In 1989 Dongarra left Argonne to accept dual appointments as a Distinguished Professor at the University of Tennessee, Knoxville, and Distinguished Scientist at the nearby Oak Ridge National Laboratory. Dongarra felt that “it would allow me to be a professor and to have students and to do other things that were perhaps not as easy to do at the lab.” To begin with, his group consisted of just one graduate assistant. By 2004 the Innovative Computing Laboratory had grown to encompass “about 50 people including research professors, postdocs, research assistants, graduate students, undergraduate students, programmers, an artist, secretarial staff, people who take care of our computers, and our support staff as well.” Other than Dongarra himself, the entire team was supported by grants and research contracts from organizations such as the NSF, DARPA, and the Department of Energy.

**LAPACK and BLAS**

Flexible as LINPACK was, it could not fully exploit the power of vector processing supercomputers and shared memory parallel machines, the most important of which during the mid-1980s were respectively the Cray-1 and its multi-processor stablemate the Cray X-MP. Dongarra’s next major project, LAPACK, replaced both LINPACK and EISPACK with functions optimized for the new architectures. Dongarra and James Demmel of the University of California, Berkeley were the primary designers. The first public release occurred in 1992, with subsequent major releases in 1994 and 1999. By the early 2000s, the once exotic architectural features that characterized the Cray machines, such as long instruction pipelines, vector processing capabilities, and shared-memory multiple-core architectures, were becoming standard features of personal computers.

LAPACK was developed in parallel with new BLAS specifications. The original Level-1 BLAS covered only scalar, vector, and vector-vector operations (i.e., those processing at most individual columns from two matrices). This forced application programmers to make significant changes to optimize for specific vector processing, cache memory, or multi-processor architectures. The new BLAS specifications covered standard operations combining a matrix and a vector (Level 2) and two matrices (Level 3). This higher level of abstraction was the key to LAPACK’s ability to run efficiently on a huge range of architectures.
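
The three levels can be illustrated with one representative operation each. The following is a minimal sketch in plain Python, not the BLAS interface itself: the names echo the real routine families (AXPY, GEMV, GEMM), but the simplified signatures, list-of-lists matrices, and returned copies are assumptions made for readability.

```python
def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, a, x, beta, y):
    """Level 2 (matrix-vector): return alpha*A*x + beta*y."""
    return [
        alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
        for row, yi in zip(a, y)
    ]


def gemm(alpha, a, b, beta, c):
    """Level 3 (matrix-matrix): return alpha*A*B + beta*C."""
    cols_b = list(zip(*b))  # columns of B, so each entry is a row-column dot product
    return [
        [
            alpha * sum(aik * bkj for aik, bkj in zip(row, col)) + beta * cij
            for col, cij in zip(cols_b, c_row)
        ]
        for row, c_row in zip(a, c)
    ]
```

A Level 3 call performs on the order of n³ arithmetic operations on n² data, which gives a tuned implementation room to reuse values held in cache or vector registers. That reuse is hidden behind the call, so the calling program does not need to change from one machine to the next.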

While BLAS underpinned high performance portable software, not all computer manufacturers provided high quality implementations for their machines. The ATLAS project, undertaken by Dongarra’s students, automated the creation of a highly optimized BLAS for a particular computer architecture. After probing the capabilities of the machine it is running on, ATLAS generates many different versions of each BLAS routine and determines experimentally which of these give optimum performance.[iv]
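
The idea of choosing a routine variant by experiment can be sketched in a few lines. This is only an illustration of empirical tuning, not ATLAS itself, which generates and times compiled code across many more parameters. The function names, the single tuning parameter (a block size), and the candidate values are all hypothetical.

```python
import time


def matmul_blocked(a, b, block):
    """Multiply square matrices a and b, visiting them in block x block tiles."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune(n=48, candidates=(4, 8, 16, 48), repeats=3):
    """Time each candidate block size on this machine and return the fastest."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            matmul_blocked(a, b, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best  # keep the best of several runs to damp timing noise
    return min(timings, key=timings.get), timings
```

Calling `autotune()` returns the winning block size together with the measured timings. The winner can differ from one machine to another, which is why the choice is made by measurement on the target machine rather than fixed in advance.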

**Distributed Computing**

Dongarra has continued to grapple with the challenges new computer architecture has posed for linear algebra. LAPACK worked well with computers with a handful of processors sharing a common memory. But supercomputing architectures continued to evolve, with massively parallel supercomputers and clusters of networked personal computers. The idea of a supercomputer built from thousands of cheap processors was attractive. Still, existing applications, algorithms, and software tools could unlock only a tiny fraction of the hardware’s theoretical performance.

The challenge was splitting a large job up into many small independent tasks and then finding an efficient way to coordinate their operation via exchanging messages between processes. ScaLAPACK (Scalable LAPACK), first released in 1993, was optimized for distributed memory computers. It relied on PVM (Parallel Virtual Machine), a standard mechanism developed by Dongarra’s team to present application programs with a virtual parallel computer to insulate them from the details of the cluster they were running on. Many other scientific computing projects quickly adopted PVM.

Dongarra also played a crucial role in creating the Message Passing Interface (MPI) for distributed computation. The committee behind it agreed on the first version of the standard in 1994. He recalls that “MPI was designed by a committee of twenty people from commercial settings, academic institutions, research centers, and vendors, all pouring their ideas into this package ….” MPI quickly won widespread adoption as the standard interface for application programmers in many languages to control communication and synchronization between distributed processes.

**Recent Work**

Most numerical algorithms work with floating point numbers. Modern processors typically provide arithmetic capabilities for 32-bit and 64-bit number representations. Working with 64-bit number representations is more accurate; working with 32-bit number representations is often much faster. In his 2006 Supercomputing Conference paper, “Exploiting the Performance of 32-bit Floating Point Arithmetic in Obtaining 64-bit Accuracy,” Dongarra showed how these capabilities could be combined creatively to maximize speed without sacrificing accuracy. The technique he introduced quickly became standard practice in machine learning systems, slashing memory usage and greatly improving performance.
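
One well-known way to combine the two precisions is iterative refinement: do the expensive factorization in 32-bit arithmetic, then correct the answer using residuals computed in 64-bit. The following is a minimal sketch of that idea, not the paper's implementation. The function names are hypothetical, 32-bit arithmetic is simulated by rounding through `struct`, and the fixed iteration count stands in for a proper convergence test.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting; every result is rounded to 32-bit."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the 32-bit factors: forward then back substitution."""
    n = len(b)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def solve_mixed(a, b, iterations=5):
    """Factor once in 32-bit, then refine the solution with 64-bit residuals."""
    n = len(b)
    lu, perm = lu_factor_f32(a)       # the O(n^3) work happens in 32-bit
    x = lu_solve_f32(lu, perm, b)     # initial solution, 32-bit accuracy
    for _ in range(iterations):
        # Residual r = b - A*x computed in full 64-bit precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        d = lu_solve_f32(lu, perm, r)  # cheap correction reusing the factors
        x = [xi + di for xi, di in zip(x, d)]
    return x
```

For a well-conditioned matrix, each pass costs only O(n²) work and shrinks the error left by the 32-bit factorization, so a few passes approach 64-bit accuracy. For badly conditioned matrices the refinement may not converge, and a 64-bit factorization is the safer choice.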

By the 2010s another architectural shift was underway in supercomputing. Producing realistic 3D graphics requires a massive amount of computational power. The needs of video gamers drove the creation of ever more complex graphics cards, whose processing capabilities quickly outstripped those of conventional processor chips. High end graphics chips each incorporated hundreds of processor cores working in parallel and could be reprogrammed to handle tasks essential to scientific computing and artificial intelligence as well as the graphical rendering operations for which they were designed. Supercomputers ran thousands of these chips in parallel. Dongarra described a new approach in his 2016 paper, “Performance, design, and autotuning of batched GEMM for GPUs.” It centered on automatically dividing large matrix computations into small blocks that could be handled independently. The new approach was implemented in the MAGMA and SLATE software libraries, and in another reformulation of BLAS: the Batched BLAS Standard.

Dongarra continued to work primarily at the University of Tennessee, where he achieved the rank of University Distinguished Professor in 2000, and with Oak Ridge. He also developed long-term affiliations as Turing Fellow at the University of Manchester and with Rice University’s high performance computing researchers. He has been editor in chief of the *International Journal of High Performance Computing Applications* since 1992, and he founded the supercomputing interest group of the Society for Industrial and Applied Mathematics.

A lot has changed over Dongarra’s long career, but the software he helped to create is more important than ever. He has come to serve as a public face of the supercomputing user community. Scientific computing users no longer typically write FORTRAN code every time they need to run a new calculation, but code from Dongarra’s projects has been incorporated into tools such as MATLAB, Maple, Mathematica, and the R programming language. Efficient matrix operations are foundational to many of the computing services we rely on daily, from high performance video games through online services powered by artificial intelligence algorithms to accurate weather forecasts.

Author: Thomas Haigh

[i] J. J. Dongarra, "Oral history interview by Thomas Haigh, 26 April," Society for Industrial and Applied Mathematics, Philadelphia, PA, 2004; available from http://history.siam.org/oralhistories/dongarra.htm. All unattributed quotes from Dongarra in this profile are taken from this source.

[ii] G. W. P. Stewart, "Oral History Interview by Thomas Haigh, March 5-6," Society for Industrial and Applied Mathematics, Philadelphia PA, 2006; available from http://history.siam.org/oralhistories/stewart.htm.

[iii] The origins of Netlib are documented in J. J. Dongarra and E. Grosse, "Distribution of Mathematical Software by Electronic Mail," *Communications of the ACM*, vol. 30, no. 5, 1987, pp. 403-407. https://dl.acm.org/doi/10.1145/22899.22904

[iv] R. C. Whaley, A. Petitet and J. J. Dongarra, "Automated Empirical Optimization of Software and the Atlas Project," *Parallel Computing*, vol. 27, no. 1-2, 2001, pp. 3-35.