2014 IEEE 28th International Parallel and Distributed Processing Symposium 2014
DOI: 10.1109/ipdps.2014.59

Nitro: A Framework for Adaptive Code Variant Tuning

Abstract: Autotuning systems intelligently navigate a search space of possible implementations of a computation to find the implementation(s) that best meets a specific optimization criteria, usually performance. This paper describes Nitro, a programmer-directed autotuning framework that facilitates tuning of code variants, or alternative implementations of the same computation. Nitro provides a library interface that permits programmers to express code variants along with metainformation that aids the system i…
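
To make the idea of code variant tuning concrete, here is a minimal Python sketch. It is not Nitro's API: every name in it (`VARIANTS`, `modeled_cost`, `train_selector`) is hypothetical, the cost model is an assumed stand-in for measured runtime, and the nearest-neighbour selection rule is an illustrative choice rather than the method the paper uses. It shows two variants of the same computation, an input feature, and a selector trained on sample inputs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Variant = Callable[[Sequence[int]], bool]


def has_dup_pairwise(xs: Sequence[int]) -> bool:
    """Variant 1: compare every pair; no extra memory, quadratic work."""
    return any(xs[i] == xs[j] for i in range(len(xs)) for j in range(i + 1, len(xs)))


def has_dup_set(xs: Sequence[int]) -> bool:
    """Variant 2: build a set; linear work plus a fixed setup overhead."""
    return len(set(xs)) != len(xs)


# Hypothetical registry of code variants for one computation.
VARIANTS: Dict[str, Variant] = {"pairwise": has_dup_pairwise, "set": has_dup_set}


def modeled_cost(name: str, xs: Sequence[int]) -> float:
    """Assumed cost model standing in for a measured runtime."""
    n = len(xs)
    return float(n * n) if name == "pairwise" else 40.0 + 4.0 * n


def train_selector(
    variants: Dict[str, Variant],
    training_inputs: List[Sequence[int]],
    feature: Callable[[Sequence[int]], float],
    cost: Callable[[str, Sequence[int]], float],
) -> Callable[[Sequence[int]], Variant]:
    """Label each training input with its cheapest variant, then select
    for new inputs by the nearest training input in feature space."""
    labeled: List[Tuple[float, str]] = []
    for x in training_inputs:
        best = min(variants, key=lambda name: cost(name, x))
        labeled.append((feature(x), best))

    def select(x: Sequence[int]) -> Variant:
        f = feature(x)
        _, name = min(labeled, key=lambda pair: abs(pair[0] - f))
        return variants[name]

    return select


# Train on inputs of a few sizes, using input length as the only feature.
training = [list(range(n)) for n in (4, 8, 64, 256)]
select = train_selector(VARIANTS, training, feature=len, cost=modeled_cost)

small = [3, 1, 3, 7, 9]
large = list(range(100))
print(select(small).__name__, select(small)(small))
print(select(large).__name__, select(large)(large))
```

Under the assumed cost model, the pairwise variant is chosen for short inputs and the set-based variant for long ones. A real tuner would replace `modeled_cost` with timing measurements and would typically use richer features and a stronger classifier.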

Cited by 56 publications (42 citation statements)
References 38 publications
“…The 1-1 mapping configuration indicates that the kernel is configured to have as many blocks (or threads, in the case of thread-mapped child kernels) as the number of items in the buffer. The exhaustive search configuration is the best configuration we find from exhaustively searching the configuration space [16]. In Figure 6 we report the results on tree descendants for all considered consolidation granularities over two datasets.…”
Section: B Selection Of the Kernel Configuration
Citation type: mentioning
confidence: 99%
“…The resulting number of code variants can be large and efficient heuristic search strategies are important, including Simulated Annealing or Nelder-Mead [31]. OpenTuner [22] provides several search strategies, but other techniques based on machine learning are also investigated [32]. Energy and performance autotuning for two irregular applications, graph community detection using the Louvain method (Grappolo) and high-performance conjugate gradient (HPCCG) have been investigated in [33] for OpenMP multithreaded programs using OpenTuner.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Input-aware auto-tuning arose recently [10] as a way to solve this issue, and has been since then applied to a variety of problems including poly-algorithmic selection [3], OpenACC loops optimization [11], and general-purpose GPU compilers [12,14]. This surge of interest is encouraging, but has yet to win over an industry dominated by manual heuristics.…”
Section: Related Work
Citation type: mentioning
confidence: 99%