13 April 2023 Cooperative compilation optimization of register allocation and thread management for AMDGPU
Lin Han, Youzhi Ren, Hongsheng Wang, Yunda Chai, Dan Zhang
Proceedings Volume 12605, 2022 2nd Conference on High Performance Computing and Communication Engineering (HPCCE 2022); 126050D (2023) https://doi.org/10.1117/12.2673315
Event: Second Conference on High Performance Computing and Communication Engineering, 2022, Harbin, China
Abstract
Thread parallelism and single-thread performance are two important factors affecting the performance of kernel functions, and both are closely related to register allocation. Adjusting thread parallelism to optimize GPU register allocation can effectively improve the performance of heterogeneous programs. We obtain the required number of vector registers by counting the virtual registers during compilation of a kernel function, and then combine this count with the number of wavefronts used to launch the kernel for an overall performance analysis. On this basis we propose RAW, a compilation method for the cooperative optimization of register allocation and thread management on AMDGPU, implemented in the LLVM compiler. Experiments show that the method achieves a speedup of about 1.12x on the Rodinia test set and about 1.4x on the quda application.
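The coupling between vector register demand and wavefront count described in the abstract can be illustrated with a rough occupancy calculation. The sketch below is not the RAW method itself. The hardware constants (256 vector registers per lane per SIMD, an allocation granule of 4, and a cap of 10 resident wavefronts per SIMD) are assumptions typical of GCN-class AMD GPUs and are not taken from the paper, and the function names are hypothetical.

```python
def max_waves_per_simd(vgprs_per_thread, vgpr_file_size=256,
                       granule=4, hw_wave_limit=10):
    """Upper bound on resident wavefronts per SIMD imposed by VGPR usage.

    All default constants are assumed values for illustration only.
    """
    if vgprs_per_thread <= 0:
        # A kernel that needs no vector registers is limited only by hardware.
        return hw_wave_limit
    # Round the request up to the allocation granule.
    allocated = -(-vgprs_per_thread // granule) * granule
    if allocated > vgpr_file_size:
        # The kernel cannot be launched without spilling registers.
        return 0
    return min(hw_wave_limit, vgpr_file_size // allocated)


def vgpr_budget_for_waves(target_waves, vgpr_file_size=256, granule=4):
    """Largest per-thread VGPR count that still lets target_waves fit."""
    if target_waves <= 0:
        raise ValueError("target_waves must be positive")
    return (vgpr_file_size // target_waves) // granule * granule


if __name__ == "__main__":
    for vgprs in (24, 25, 64, 128, 129):
        print(vgprs, "VGPRs ->", max_waves_per_simd(vgprs), "waves per SIMD")
```

Under these assumed constants, a kernel that moves from 24 to 25 vector registers per thread loses one resident wavefront. This is the kind of trade-off between single-thread register budget and thread parallelism that a compiler has to weigh.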
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lin Han, Youzhi Ren, Hongsheng Wang, Yunda Chai, and Dan Zhang "Cooperative compilation optimization of register allocation and thread management for AMDGPU", Proc. SPIE 12605, 2022 2nd Conference on High Performance Computing and Communication Engineering (HPCCE 2022), 126050D (13 April 2023); https://doi.org/10.1117/12.2673315
KEYWORDS
Design and modelling

Wavefronts

Data storage

Computer hardware

Computing systems

Performance modeling
