Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels

Xiang Li; Gagan Agrawal

doi:10.1109/HiPC53243.2021.00040

Computer & Cyber Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Autotuning has been widely studied in high-performance computing as an effective mechanism for improving application performance. It has become particularly important for architectures such as modern GPUs, where obtaining the best performance involves a complex interaction between the architecture and the application. Autotuning methods rely on a search strategy that explores the (potentially very large) space of configurations. Many search methods have been proposed, including both local and global strategies. We observe that in GPU applications, high-performing configurations are likely to be spatially clustered. Based on this observation, we propose a strategy we refer to as shrinking sample. This method searches all areas of the space, looking for combinations of different parameter values, without relying on random (initial) choices that may miss part of the space. The efficacy and efficiency of this method have been tested against state-of-the-art local and global search algorithms on seven benchmark GPU kernels. Our experiments show that the shrinking-sample method achieves, on average, around 99% of the performance of exhaustive search with orders of magnitude less tuning time.
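
The abstract does not spell out how the samples are drawn or how the search region shrinks, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes a deterministic, evenly spaced sample over the whole space (no random starting point), followed by repeated halving of each parameter range around the best configuration seen so far. The names `evenly_spaced`, `shrinking_sample_search`, `samples_per_dim`, and `toy_cost` are hypothetical, and the cost function stands in for a real kernel timing.

```python
import math
from itertools import product


def evenly_spaced(lo, hi, k):
    """Return up to k distinct integer indices spread evenly over [lo, hi]."""
    if hi <= lo:
        return [lo]
    k = max(2, k)  # always include both ends of the range
    return sorted({lo + round(i * (hi - lo) / (k - 1)) for i in range(k)})


def shrinking_sample_search(param_values, cost, samples_per_dim=3):
    """Illustrative search (an assumption, not the published method).

    param_values: one sorted list of candidate values per tuning parameter.
    cost: maps a configuration tuple to a measured cost (lower is better).
    Returns (best_config, best_cost, number_of_cost_evaluations).
    """
    bounds = [(0, len(values) - 1) for values in param_values]
    cache = {}  # index tuple -> cost, so no configuration is measured twice
    best_idx, best_cost = None, float("inf")

    while True:
        # Sample a coarse grid that covers the whole current region.
        axes = [evenly_spaced(lo, hi, samples_per_dim) for lo, hi in bounds]
        for idx in product(*axes):
            if idx not in cache:
                config = tuple(v[i] for v, i in zip(param_values, idx))
                cache[idx] = cost(config)
            if cache[idx] < best_cost:
                best_idx, best_cost = idx, cache[idx]

        if all(lo == hi for lo, hi in bounds):
            break  # the region has collapsed to a single configuration

        # Shrink: halve each range and re-center it on the best point so far,
        # clamped so the new range stays inside the old one.
        new_bounds = []
        for (lo, hi), b in zip(bounds, best_idx):
            width = (hi - lo) // 2
            new_lo = max(lo, min(b - width // 2, hi - width))
            new_bounds.append((new_lo, new_lo + width))
        bounds = new_bounds

    best_config = tuple(v[i] for v, i in zip(param_values, best_idx))
    return best_config, best_cost, len(cache)


def toy_cost(config):
    """Hypothetical stand-in for a kernel timing, minimal at (256, 2)."""
    block, tile = config
    return (math.log2(block) - 8) ** 2 + (math.log2(tile) - 1) ** 2


if __name__ == "__main__":
    space = [[32, 64, 128, 256, 512, 1024], [1, 2, 4, 8, 16]]
    print(shrinking_sample_search(space, toy_cost))
```

Because good configurations are assumed to be clustered, a coarse sample that lands near the cluster is enough to steer the shrinking step toward it; on a landscape with several distant clusters this sketch could settle on the wrong one.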

Original language	English (US)
Title of host publication	Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	262-271
Number of pages	10
ISBN (Electronic)	9781665410168
DOIs	https://doi.org/10.1109/HiPC53243.2021.00040
State	Published - 2021
Event	28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021 - Virtual, Bangalore, India; Duration: Dec 17 2021 → Dec 18 2021

Publication series

Name	Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021

Conference

Conference	28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
Country/Territory	India
City	Virtual, Bangalore
Period	12/17/21 → 12/18/21

ASJC Scopus subject areas

Artificial Intelligence
Computer Networks and Communications
Computer Science Applications
Hardware and Architecture
Information Systems

Access to Document

10.1109/HiPC53243.2021.00040

Cite this

Li, X., & Agrawal, G. (2021). Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. In Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021 (pp. 262-271). (Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/HiPC53243.2021.00040

Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. / Li, Xiang; Agrawal, Gagan.
Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021. Institute of Electrical and Electronics Engineers Inc., 2021. p. 262-271 (Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021).

Li, X & Agrawal, G 2021, Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. in Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021. Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021, Institute of Electrical and Electronics Engineers Inc., pp. 262-271, 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021, Virtual, Bangalore, India, 12/17/21. https://doi.org/10.1109/HiPC53243.2021.00040

Li X, Agrawal G. Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. In Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021. Institute of Electrical and Electronics Engineers Inc. 2021. p. 262-271. (Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021). doi: 10.1109/HiPC53243.2021.00040

Li, Xiang ; Agrawal, Gagan. / Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 262-271 (Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021).

@inproceedings{6babf291f142432a81e1df259b1a98ec,

title = "Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels",

abstract = "Autotuning has been widely studied in high-performance computing as an effective mechanism for improving application performance. It has become particularly important for architectures such as modern GPUs, where obtaining the best performance involves a complex interaction between the architecture and the application. Autotuning methods rely on a search strategy that explores the (potentially very large) space of configurations. Many search methods have been proposed, including both local and global strategies. We observe that in GPU applications, high-performing configurations are likely to be spatially clustered. Based on this observation, we propose a strategy we refer to as shrinking sample. This method searches all areas of the space, looking for combinations of different parameter values, without relying on random (initial) choices that may miss part of the space. The efficacy and efficiency of this method have been tested against state-of-the-art local and global search algorithms on seven benchmark GPU kernels. Our experiments show that the shrinking-sample method achieves, on average, around 99% of the performance of exhaustive search with orders of magnitude less tuning time.",

author = "Xiang Li and Gagan Agrawal",

note = "Publisher Copyright: {\textcopyright} 2021 IEEE.; 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021 ; Conference date: 17-12-2021 Through 18-12-2021",

year = "2021",

doi = "10.1109/HiPC53243.2021.00040",

language = "English (US)",

series = "Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "262--271",

booktitle = "Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021",

}

TY - GEN

T1 - Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels

AU - Li, Xiang

AU - Agrawal, Gagan

PY - 2021

Y1 - 2021

AB - Autotuning has been widely studied in high-performance computing as an effective mechanism for improving application performance. It has become particularly important for architectures such as modern GPUs, where obtaining the best performance involves a complex interaction between the architecture and the application. Autotuning methods rely on a search strategy that explores the (potentially very large) space of configurations. Many search methods have been proposed, including both local and global strategies. We observe that in GPU applications, high-performing configurations are likely to be spatially clustered. Based on this observation, we propose a strategy we refer to as shrinking sample. This method searches all areas of the space, looking for combinations of different parameter values, without relying on random (initial) choices that may miss part of the space. The efficacy and efficiency of this method have been tested against state-of-the-art local and global search algorithms on seven benchmark GPU kernels. Our experiments show that the shrinking-sample method achieves, on average, around 99% of the performance of exhaustive search with orders of magnitude less tuning time.

UR - http://www.scopus.com/inward/record.url?scp=85125668162&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85125668162&partnerID=8YFLogxK

U2 - 10.1109/HiPC53243.2021.00040

DO - 10.1109/HiPC53243.2021.00040

M3 - Conference contribution

AN - SCOPUS:85125668162

T3 - Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021

SP - 262

EP - 271

BT - Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021

Y2 - 17 December 2021 through 18 December 2021

ER -
