2013
DOI: 10.1016/j.procs.2013.05.295

Code Generation and Optimization of Distributed-memory Dense Linear Algebra Kernels

Cited by 8 publications (24 citation statements)
References 8 publications
“…The FLAME interfaces for indexing are also included to omit indexing in favor of reasoning about matrix partitions. The benefit of these interfaces is that parallelizing most sequential DLA algorithms in high-performance Elemental code is rote (this is described and automated in [6,7]). An expert needs to decide which distributions are efficient and how to redistribute between them.…”
Section: Experiments With Interfaces (mentioning)
confidence: 99%
“…In the case of DLA libraries, much of an expert's development work is rote thanks to good abstraction, and we can indeed automate it. In this section, we present the basics of DxT [6,7], which is used to encode expert knowledge about DLA interfaces. A system can then utilize that knowledge to generate high performance code.…”
Section: Encoding Expert Knowledge For Automatic Code Generation (mentioning)
confidence: 99%
“…We have automated the exploration of these spaces (by generating all implementations using a methodical process) and we evaluate the efficiency of each implementation via cost estimation. This is how we find the best-performing algorithm that experts would intuitively select [17,18,19]. In all tests, generated code is the same or better than experts' hand-produced implementations.…”
Section: Introduction (mentioning)
confidence: 99%
“…We begin with a brief overview of how we generate the space of implementations for a given operation in the domain. Our approach is called Design by Transformation (DxT); more details are given in [9,17,18,19,27].…”
Section: Introduction (mentioning)
confidence: 99%
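
The Introduction excerpts above describe generating every implementation of an operation and ranking the candidates by estimated cost. The following is a minimal sketch of that idea only, not DxT's actual interface: the rule table, the step names (RedistA, RedistB, LocalGemm, ReduceC), and the cost figures are all hypothetical and chosen purely for illustration.

```python
from itertools import product

# Hypothetical refinement rules: each abstract operation maps to the
# alternative sequences of steps that could implement it.
RULES = {
    "Gemm": [
        ("RedistA", "LocalGemm"),
        ("RedistB", "LocalGemm"),
        ("RedistA", "RedistB", "LocalGemm", "ReduceC"),
    ],
}

# Hypothetical cost estimates for primitive steps (arbitrary units).
COST = {"RedistA": 4.0, "RedistB": 6.0, "LocalGemm": 10.0, "ReduceC": 3.0}


def expand(op):
    """Return every fully refined implementation of op as a tuple of primitives."""
    if op not in RULES:  # primitive step: nothing left to refine
        return [(op,)]
    implementations = []
    for alternative in RULES[op]:
        # Expand each step, then combine the choices made for all steps.
        per_step = [expand(step) for step in alternative]
        for combo in product(*per_step):
            implementations.append(tuple(p for part in combo for p in part))
    return implementations


def estimate(implementation):
    """Estimated cost of an implementation: the sum of its primitive costs."""
    return sum(COST[p] for p in implementation)


def best(op):
    """Pick the implementation with the lowest estimated cost."""
    return min(expand(op), key=estimate)
```

With these made-up numbers, `best("Gemm")` selects the variant that redistributes only A, because its estimated cost is the lowest of the three candidates. A real system would derive costs from problem sizes and machine parameters rather than from fixed constants.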