
Date: June 21, 2023, 4–5:30 p.m. CET. Coffee and tea from 4–4:30 p.m.

Speaker: Christie Alappat (Erlangen National High Performance Computing Center), ORCID

Location:

     Weyertal 86–90, 50931 Cologne
     Mathematical Institute (Google Maps, OpenStreetMap)
     Seminar Room 1 (Room 0.05)

Title: Accelerating sparse iterative solvers using Recursive Algebraic Coloring Engine

Abstract: Sparse linear iterative solvers are essential for large-scale simulations. In many of these simulations, the majority of the time is spent in matrix power kernels (MPK), which compute the product of a power of a sparse matrix A and a dense vector x, i.e., A^p x. Current state-of-the-art implementations perform MPK by calling back-to-back sparse matrix-vector multiplications (SpMV), which requires streaming the large matrix A from main memory p times. Using the Recursive Algebraic Coloring Engine (RACE), we can accelerate MPK computations by keeping parts of the matrix A in cache across successive SpMV calls. RACE uses a level-based approach to achieve this: levels are constructed using breadth-first search on the graph related to the underlying sparse matrix. These levels are then used to implement cache blocking of the matrix elements for high spatial and temporal reuse. The approach is highly efficient and achieves performance levels of 50–100 GF/s on a single modern Intel or AMD multicore chip, providing speedups of typically 2x–4x compared to a highly optimized classical SpMV implementation.
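To make the idea concrete, the following is a minimal sketch in plain Python and is not RACE's actual implementation or API. It assumes a CSR matrix with a symmetric sparsity pattern, so that every neighbor of a row in BFS level l lies in levels l-1, l, or l+1. All function names (spmv_rows, mpk_baseline, bfs_levels, mpk_level_blocked, tridiagonal_csr) are hypothetical and chosen for illustration. The baseline sweeps the whole matrix p times, while the level-based variant advances a wavefront over the levels so that each level is touched for all p powers within a few consecutive steps, which is what would allow it to stay in cache.

```python
def spmv_rows(row_ptr, col_idx, vals, x, y, rows):
    """Compute y[r] = sum_j A[r, j] * x[j] for the given rows of a CSR matrix."""
    for r in rows:
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[r] = acc


def mpk_baseline(row_ptr, col_idx, vals, x, p):
    """A^p x via p back-to-back SpMV sweeps; A is traversed in full p times."""
    n = len(row_ptr) - 1
    cur = list(x)
    for _ in range(p):
        nxt = [0.0] * n
        spmv_rows(row_ptr, col_idx, vals, cur, nxt, range(n))
        cur = nxt
    return cur


def bfs_levels(row_ptr, col_idx):
    """Group rows into BFS levels of the matrix graph (restarts per component)."""
    n = len(row_ptr) - 1
    level = [-1] * n  # -1 marks an unvisited row
    levels = []
    for root in range(n):
        if level[root] != -1:
            continue
        level[root] = len(levels)
        frontier = [root]
        while frontier:
            levels.append(frontier)
            nxt = []
            for u in frontier:
                for k in range(row_ptr[u], row_ptr[u + 1]):
                    v = col_idx[k]
                    if level[v] == -1:
                        level[v] = len(levels)
                        nxt.append(v)
            frontier = nxt
    return levels


def mpk_level_blocked(row_ptr, col_idx, vals, x, p):
    """A^p x using a wavefront over BFS levels (assumes a symmetric pattern).

    Power k on level l needs power k-1 on levels l-1, l, l+1. In each step,
    power k is applied to level step-(k-1), so power k-1 on level l+1 has
    just been computed in the same step and the older levels in earlier steps.
    """
    n = len(row_ptr) - 1
    levels = bfs_levels(row_ptr, col_idx)
    powers = [list(x)] + [[0.0] * n for _ in range(p)]
    for step in range(len(levels) + p - 1):
        for k in range(1, p + 1):
            lvl = step - (k - 1)
            if 0 <= lvl < len(levels):
                spmv_rows(row_ptr, col_idx, vals,
                          powers[k - 1], powers[k], levels[lvl])
    return powers[p]


def tridiagonal_csr(n):
    """Hypothetical test matrix: 1D Laplacian tridiag(-1, 2, -1) in CSR form."""
    row_ptr, col_idx, vals = [0], [], []
    for i in range(n):
        if i > 0:
            col_idx.append(i - 1)
            vals.append(-1.0)
        col_idx.append(i)
        vals.append(2.0)
        if i < n - 1:
            col_idx.append(i + 1)
            vals.append(-1.0)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, vals
```

Both variants perform the same arithmetic per row, so they return identical vectors; only the traversal order differs. A production implementation would additionally group levels so that the working set fits the cache size and would parallelize within levels, neither of which is attempted in this sketch.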

After demonstrating RACE’s cache blocking approach, the talk sheds light on the application of the cache-blocked MPK kernels in iterative solvers. We discuss the benefits of integrating the RACE library into the Trilinos framework and demonstrate the speedups achieved in communication-avoiding s-step Krylov solvers, polynomial preconditioners, and algebraic multigrid (AMG) preconditioners. The talk concludes by showcasing the application of RACE-accelerated solvers in a real-world wind turbine simulation (Nalu-Wind) and highlights the new possibilities and perspectives opened up by RACE’s cache blocking technique.