Parameterized micro-benchmarking: An auto-tuning approach for complex applications

Wenjing Ma, Sriram Krishnamoorthy, Gagan Agrawal

Research output: Contribution to conference › Paper › peer-review

2 Scopus citations

Abstract

Auto-tuning has emerged as an important practical method for creating highly optimized code. However, the growing complexity of architectures and applications has resulted in a prohibitively large search space that precludes empirical auto-tuning. Here, we focus on the challenge to auto-tuning presented by applications that require auto-tuning of not just a small number of distinct kernels, but a large number of kernels that exhibit similar computation and memory access characteristics and require optimization over similar problem spaces. We propose an auto-tuning method for tensor contraction functions on GPUs, based on parameterized micro-benchmarks. Using our parameterized micro-benchmarking approach, we obtain a speedup of up to 2x over the version that used default optimizations without auto-tuning.
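The abstract does not spell out the mechanics, so the following is only a minimal sketch of one way to read the idea, not the authors' implementation: rather than searching the parameter space for every kernel, a small parameterized micro-benchmark is tuned once per class of similar kernels and the winning configuration is reused. All names here (`tune`, `tune_by_class`, `classify`, the `tile` and `unroll` parameters) are hypothetical, and the code is plain CPU-side Python rather than GPU tensor contraction code.

```python
import time
from itertools import product


def time_kernel(kernel, config, repeats=3):
    """Return the best wall-clock time of kernel(**config) over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(**config)
        best = min(best, time.perf_counter() - start)
    return best


def tune(kernel, space, measure=time_kernel):
    """Exhaustively search a small parameter space; return the cheapest config."""
    names = sorted(space)
    best_config, best_cost = None, float("inf")
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        cost = measure(kernel, config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config


def tune_by_class(kernels, classify, micro_benchmarks, space, measure=time_kernel):
    """Tune one micro-benchmark per kernel class and reuse it for each member.

    `classify` maps a kernel name to a class label (an assumption of this
    sketch); `micro_benchmarks` maps each label to a callable that stands in
    for all kernels of that class.
    """
    tuned = {}  # class label -> best configuration found so far
    plan = {}   # kernel name -> configuration to use
    for name in kernels:
        label = classify(name)
        if label not in tuned:
            tuned[label] = tune(micro_benchmarks[label], space, measure)
        plan[name] = tuned[label]
    return plan
```

In this sketch the number of searches grows with the number of kernel classes rather than the number of kernels, which is the saving the abstract alludes to when many kernels share similar computation and memory access characteristics. The `measure` argument can be swapped for a deterministic cost model when wall-clock timing is too noisy.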

Original language: English (US)
Pages: 181-182
Number of pages: 2
State: Published - 2011
Externally published: Yes
Event: 20th International Conference on Parallel Architectures and Compilation Techniques, PACT 2011 - Galveston, TX, United States
Duration: Oct 10 2011 to Oct 14 2011

Conference

Conference: 20th International Conference on Parallel Architectures and Compilation Techniques, PACT 2011
Country/Territory: United States
City: Galveston, TX
Period: 10/10/11 to 10/14/11

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
