April 4, 2025
Conference Paper

Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering

Abstract

Recent advancements in GPU hardware have introduced support for N:M sparse patterns, which can yield substantial performance gains. Graphs in Graph Neural Networks (GNNs) are typically sparse, but the sparsity is often irregular and does not conform to such patterns. In this paper, we propose a novel graph reordering algorithm, the first of its kind, that reshapes irregular graph data into the N:M structured sparse pattern at the tile level, allowing linear-algebra-based graph operations in GNNs to benefit from N:M sparse hardware. The optimization is lossless, maintaining the accuracy of the GNN. It removes 98-100% of the violations of the N:M sparse patterns at the vector level, and increases the proportion of conforming graphs in the SuiteSparse collection from 5-9% to 88.7-93.5%. On A100 GPUs, the optimization accelerates Sparse Matrix-Matrix Multiplication (SpMM) by up to 43X (2.3X-7.5X on average) and speeds up the key graph operations in GNNs on real graphs by as much as 8.6X (3.5X on average).
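For readers unfamiliar with the N:M constraint, the following minimal NumPy sketch illustrates the 2:4 case: every group of four consecutive elements along a row may hold at most two nonzeros. It is an illustration only, not the paper's reordering algorithm; the matrix and the column permutation are made up for this toy example.

import numpy as np

# Count the length-m row vectors that violate the N:M constraint
# (more than n nonzeros among m consecutive elements); n=2, m=4 is
# the 2:4 pattern accelerated by sparse tensor cores on A100 GPUs.
def nm_violations(mat, n=2, m=4):
    rows, cols = mat.shape
    assert cols % m == 0, "column count must be a multiple of m"
    nnz_per_group = (mat != 0).reshape(rows, cols // m, m).sum(axis=2)
    return int((nnz_per_group > n).sum())

# Toy irregular sparse matrix; the second row's first group of four
# holds three nonzeros, violating 2:4.
A = np.array([
    [1, 0, 2, 0,  0, 3, 0, 0],
    [4, 5, 6, 0,  0, 0, 0, 7],
    [0, 0, 8, 9,  1, 0, 0, 2],
])
print(nm_violations(A))  # 1

# A hand-picked column permutation (hypothetical, chosen for this toy
# case) removes the violation without dropping any entry -- the kind of
# lossless restructuring the paper performs at the tile level.
perm = [0, 1, 3, 4, 2, 5, 6, 7]
print(nm_violations(A[:, perm]))  # 0

Because no entries are dropped, such a restructuring preserves the exact result of the sparse computation, which is why the optimization leaves GNN accuracy unchanged.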


Citation

Chen, J., H. Sung, R. Zhang, A. Li, and X. Shen. 2025. Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering. In Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2025), March 1-5, 2025, Las Vegas, NV, 16-28. New York, NY: Association for Computing Machinery. PNNL-SA-193030. doi:10.1145/3710848.3710881