2020
DOI: 10.48550/arxiv.2006.06762
Preprint

Ansor: Generating High-Performance Tensor Programs for Deep Learning

Abstract: High-performance tensor programs are crucial to guarantee efficient execution of deep learning models. However, obtaining performant tensor programs for different operators on various hardware platforms is notoriously difficult. Currently, deep learning systems rely on vendor-provided kernel libraries or various search strategies to get performant tensor programs. These approaches either require significant engineering efforts in developing platform-specific optimization code or fall short in finding high-perf…

Cited by 9 publications (10 citation statements)
References 49 publications (48 reference statements)
“…TVM [12, 56] is a popular framework that targets the optimization of mainly compute-intensive operators. Users can provide schedules, and TVM tunes the parameters automatically.…”
Section: Discussion About TVM (mentioning)
confidence: 99%
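The workflow described above (a user-written schedule whose parameters TVM then tunes) can be sketched with TVM's tensor-expression (te) API. The matmul workload, shapes, and the `tile` knob below are illustrative assumptions, not taken from the cited work.

```python
# Minimal sketch of the "user provides a schedule, TVM tunes the parameters"
# workflow, using TVM's classic tensor-expression API. Sizes are illustrative.
import tvm
from tvm import te

N = 1024  # illustrative matrix size
A = te.placeholder((N, N), name="A")
B = te.placeholder((N, N), name="B")
k = te.reduce_axis((0, N), name="k")
C = te.compute((N, N), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

def make_schedule(tile=32):
    # `tile` is the kind of knob a tuner would search over automatically.
    s = te.create_schedule(C.op)
    i, j = s[C].op.axis
    io, ii = s[C].split(i, factor=tile)
    jo, ji = s[C].split(j, factor=tile)
    s[C].reorder(io, jo, ii, ji)
    return s

func = tvm.build(make_schedule(), [A, B, C], target="llvm")
```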
“…There are recent advances in code generation for compute-intensive DNN layers. TVM [12], Ansor [56], and Halide [34] are capable of generating high-performance kernels with well-designed schedules. Ansor [56] also explores kernel fusion with a tuning approach, though with limited patterns supported.…”
Section: Related Work (mentioning)
confidence: 99%
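Since the quote refers to Ansor's search-based kernel generation, a minimal sketch of TVM's auto_scheduler module (the upstreamed form of Ansor) is given below; the matmul workload, sizes, and tuning budget are illustrative assumptions rather than the cited experiments.

```python
# Hedged sketch of Ansor-style auto-scheduling via TVM's auto_scheduler module.
# The matmul workload and tuning budget are illustrative, not from the paper.
import tvm
from tvm import te, auto_scheduler

@auto_scheduler.register_workload
def matmul(N, L, M, dtype):
    A = te.placeholder((N, L), name="A", dtype=dtype)
    B = te.placeholder((L, M), name="B", dtype=dtype)
    k = te.reduce_axis((0, L), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")
    return [A, B, C]

target = tvm.target.Target("llvm")
task = auto_scheduler.SearchTask(
    func=matmul, args=(1024, 1024, 1024, "float32"), target=target
)

log_file = "matmul.json"
task.tune(auto_scheduler.TuningOptions(
    num_measure_trials=64,  # small search budget for illustration
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
))
sch, args = task.apply_best(log_file)  # best schedule found by the search
func = tvm.build(sch, args, target)
```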
“…TVM allows users to implement schedules for each new operator by hand. Due to the large number of operators involved, for each device we use TVM's default schedules; automatic schedule design [84] has yet to be incorporated into TVM. We then enable auto-tuning of parameter values within the schedule to find the best performance.…”
Section: Implementation and Setup (mentioning)
confidence: 99%
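The "auto-tuning of parameter values within the schedule" mentioned above corresponds to TVM's AutoTVM flow. The sketch below assumes a hypothetical template name, knob names, sizes, and trial budget; it illustrates the mechanism, not the cited setup.

```python
# Hedged sketch of AutoTVM-style parameter tuning inside a schedule template.
# Template name, knobs, sizes, and trial count are illustrative assumptions.
import tvm
from tvm import te, autotvm

@autotvm.template("example/matmul")  # hypothetical template name
def matmul(N, L, M, dtype):
    A = te.placeholder((N, L), name="A", dtype=dtype)
    B = te.placeholder((L, M), name="B", dtype=dtype)
    k = te.reduce_axis((0, L), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

    s = te.create_schedule(C.op)
    y, x = s[C].op.axis
    cfg = autotvm.get_config()
    cfg.define_split("tile_y", y, num_outputs=2)  # tunable knobs
    cfg.define_split("tile_x", x, num_outputs=2)
    yo, yi = cfg["tile_y"].apply(s, C, y)
    xo, xi = cfg["tile_x"].apply(s, C, x)
    s[C].reorder(yo, xo, yi, xi)
    return s, [A, B, C]

task = autotvm.task.create("example/matmul", args=(512, 512, 512, "float32"), target="llvm")
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(), runner=autotvm.LocalRunner(number=5)
)
autotvm.tuner.XGBTuner(task).tune(
    n_trial=32,  # small tuning budget for illustration
    measure_option=measure_option,
    callbacks=[autotvm.callback.log_to_file("matmul.log")],
)
```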
“…However, since it generates assembly code, it is not able to run on different architectures such as ARM. Regarding deep learning applications, a plethora of compiler-based approaches has arisen [9, 10, 61, 77]. AutoTVM [10] generates the best implementation for a specific DNN by extracting domain-specific features from a given low-level abstract syntax tree.…”
Section: Related Work (mentioning)
confidence: 99%