2004
DOI: 10.1137/s0036144503428693

Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software

Abstract: Matrix computations are both fundamental and ubiquitous in computational science and its vast application areas. Along with the development of more advanced computer systems with complex memory hierarchies, there is a continuing demand for new algorithms and library software that efficiently utilize and adapt to new architecture features. This article reviews and details some of the recent advances made by applying the paradigm of recursion to dense matrix computations on today's memory-tiered computer systems…

Cited by 140 publications (87 citation statements)
References: 86 publications

“…al [9] with packed and general data formats, and by Elmorth et. al [11] with recursive blocking. However, their formulations are designed for the hierarchical memory architecture of x86 multicore processors.…”
Section: Related Work (mentioning)
confidence: 99%
“…al [9], and Elmorth et. al [11]. In this paper, we describe a recursive formulation of the TRMM and TRSM kernels, which suits well the aggressively parallel many-core GPUs architecture.…”
Section: Introduction (mentioning)
confidence: 99%
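
To make the idea of a recursive TRSM concrete, here is a minimal sketch in plain Python. It is not the formulation from the cited work (which targets GPUs); it only illustrates the recursion on the triangular solve L X = B, assuming L is a nonsingular lower triangular matrix stored as a list of rows. The function name `recursive_trsm` is hypothetical.

```python
def recursive_trsm(L, B):
    """Solve L X = B for X, where L is lower triangular (list of rows).

    Illustrative only: splits L into [[L11, 0], [L21, L22]] and recurses
    on the two triangular diagonal blocks.
    """
    n = len(L)
    if n == 1:
        # Base case: a 1x1 triangular system is a scalar division.
        return [[b / L[0][0] for b in B[0]]]

    h = n // 2
    L11 = [row[:h] for row in L[:h]]
    L21 = [row[:h] for row in L[h:]]
    L22 = [row[h:] for row in L[h:]]

    # Step 1: solve the top block, L11 X1 = B1.
    X1 = recursive_trsm(L11, B[:h])

    # Step 2: update the bottom right-hand side, B2 <- B2 - L21 X1.
    # In a tuned library this update would be a matrix-matrix multiply.
    m = len(B[0])
    B2 = [
        [B[h + i][j] - sum(L21[i][k] * X1[k][j] for k in range(h))
         for j in range(m)]
        for i in range(n - h)
    ]

    # Step 3: solve the bottom block, L22 X2 = B2.
    X2 = recursive_trsm(L22, B2)
    return X1 + X2


L = [[2, 0, 0], [1, 1, 0], [4, 2, 2]]
B = [[2, 4], [4, 6], [20, 28]]
X = recursive_trsm(L, B)
```

Most of the arithmetic ends up in the Step 2 update, which is why recursive formulations of this kind are usually described as shifting the work into large matrix-matrix multiplications.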
“…Since the early 1990s, various researchers [10,12,13,16] have proposed that matrices should be stored by blocks as opposed to the more customary column-major storage used in Fortran and row-major storage used in C. Doing so recursively is a generalization of that idea. The original reason was that by storing matrices contiguously a performance benefit would result.…”
Section: An Algorithm-by-blocks (mentioning)
confidence: 99%
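
As a rough illustration of storage by blocks, the sketch below copies a column-major matrix into a layout where each block occupies a contiguous run of memory. It shows a single level of blocking only, not the recursive generalization mentioned in the statement. The function name `to_block_major`, the block size `nb`, and the choice to order blocks and their entries column by column are all assumptions made for this example.

```python
def to_block_major(a, m, n, nb):
    """Reorder a flat column-major m x n matrix into block-contiguous storage.

    a  : flat list of length m * n, entry (i, j) at index i + j * m
    nb : block size (edge blocks may be smaller)
    Blocks are visited column by column, and entries inside each block
    are also stored column by column.
    """
    out = []
    for jb in range(0, n, nb):          # first column of the block
        for ib in range(0, m, nb):      # first row of the block
            for j in range(jb, min(jb + nb, n)):
                for i in range(ib, min(ib + nb, m)):
                    out.append(a[i + j * m])
    return out


# 4 x 4 matrix whose column-major entries are 0..15, split into 2 x 2 blocks.
blocked = to_block_major(list(range(16)), 4, 4, 2)
```

After the copy, the four entries of each 2 x 2 block sit next to each other, whereas in the column-major original they are spread over two columns separated by a stride of `m`.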
“…On optimization of linear algebra library, contemporary research focuses on an algorithmic level [2,3]. In a period of time that CPU reads one byte from memory, it can execute hundreds of instructions.…”
Section: Introduction (mentioning)
confidence: 99%