2020
DOI: 10.1098/rsta.2019.0055

Hierarchical algorithms on hierarchical architectures

Abstract: A traditional goal of algorithmic optimality, squeezing out flops, has been superseded by evolution in architecture. Flops no longer serve as a reasonable proxy for all aspects of complexity. Instead, algorithms must now squeeze memory, data transfers, and synchronizations, while extra flops on locally cached data represent only small costs in time and energy. Hierarchically low-rank matrices realize a rarely achieved combination of optimal storage complexity and high computational intensity for a wide…
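To make the storage argument concrete, here is a minimal sketch (not from the paper; numpy-based, with illustrative sizes and tolerances) of the idea behind hierarchically low-rank matrices: off-diagonal blocks of a matrix generated by a smooth kernel admit accurate low-rank factorizations, so they can be stored as thin factors instead of dense blocks.

```python
# Minimal sketch (illustrative, not the paper's code): compress one
# off-diagonal block of a smooth-kernel matrix by truncated SVD and
# compare its factored storage against dense storage.
import numpy as np

n, tol = 1024, 1e-8
x = np.linspace(0.0, 1.0, n)
# Smooth kernel: K[i, j] = 1 / (1 + |x_i - x_j|)
K = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))

def truncated_svd(block, tol):
    """Return factors U, V with block ~= U @ V.T to relative tolerance tol."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    k = max(1, int(np.sum(s > tol * s[0])))
    return U[:, :k] * s[:k], Vt[:k].T

h = n // 2
off = K[:h, h:]                # one off-diagonal block
U, V = truncated_svd(off, tol)
dense_cost = off.size          # entries stored densely
lr_cost = U.size + V.size      # entries stored as thin factors
print(f"rank {U.shape[1]}, storage {lr_cost}/{dense_cost} entries")
```

On this example the off-diagonal block compresses to a rank far below n/2; applied recursively over a block hierarchy, the same effect is what yields near-linear storage.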

Cited by 24 publications (18 citation statements) · References 32 publications
“…The original FMM is kernel-dependent, but several kernel-independent methods have since been proposed, such as the kernel-independent FMM, e.g., [70], hierarchical matrices, or H² matrices. For a discussion of hierarchical matrices see, e.g., [71], [72], [73], and [74].…”
Section: Exploiting Data Sparsity
confidence: 99%
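As an illustration of the kernel-independent idea this statement refers to, here is a minimal sketch (my own illustration, not the method of [70]) of adaptive cross approximation (ACA) with full pivoting: it builds a low-rank approximation of a block from its entries alone, without any knowledge of the kernel formula.

```python
import numpy as np

def aca_full(block, tol, max_rank=64):
    """Full-pivoted ACA: approximate `block` as U @ V from entries alone.

    Stops when the largest remaining residual entry falls below `tol`
    times the first pivot, or when `max_rank` is reached.
    """
    R = np.array(block, dtype=float)       # residual, updated in place
    us, vs = [], []
    first_pivot = None
    for _ in range(max_rank):
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        pivot = R[i, j]
        if first_pivot is None:
            first_pivot = abs(pivot)
        if abs(pivot) <= tol * first_pivot:
            break
        u = R[:, j].copy()                 # one column of the cross ...
        v = R[i, :] / pivot                # ... and one scaled row
        us.append(u)
        vs.append(v)
        R -= np.outer(u, v)                # peel off the rank-1 term
    U = np.column_stack(us) if us else np.zeros((R.shape[0], 0))
    V = np.vstack(vs) if vs else np.zeros((0, R.shape[1]))
    return U, V
```

Full pivoting scans the whole residual and is shown only for clarity; practical ACA variants use partial pivoting so that only a few rows and columns of the block are ever evaluated.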
“…which produces the sum of a hierarchical matrix A of size N × N and a globally low-rank matrix whose X and Y factors are of size N × k with k ≪ N. This operation can be efficiently implemented [30,54] by first adding the contributions of XYᵀ to the various blocks of A at all levels, and recompressing the resulting sum algebraically as described earlier. The low-rank update operation is a key routine for an operation that generates an explicit hierarchical matrix representation of an operator accessible only via matrix–vector products.…”
Section: General Linear Algebra Operations on Hierarchical Matrices
confidence: 99%
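Here is a minimal sketch of the recompression step behind such a low-rank update (assumed helper names; not the routine of [30,54]): when a block stored as U·Vᵀ absorbs its slice Xb·Ybᵀ of the global update, the concatenated factors are reorthogonalized with QR and truncated through an SVD of a small core matrix, so the dense block is never formed.

```python
import numpy as np

def lowrank_update(U, V, Xb, Yb, tol):
    """Recompress U @ V.T + Xb @ Yb.T to tolerance tol without densifying.

    Assumes the block dimensions exceed the combined rank, so reduced QR
    yields orthonormal columns.
    """
    # The sum equals [U Xb] @ [V Yb].T: just concatenate the factors.
    L = np.hstack([U, Xb])
    R = np.hstack([V, Yb])
    # Orthogonalize both sides, then SVD the small core matrix.
    Ql, Rl = np.linalg.qr(L)
    Qr, Rr = np.linalg.qr(R)
    W, s, Zt = np.linalg.svd(Rl @ Rr.T)
    k = max(1, int(np.sum(s > tol * s[0])))   # truncation rank
    return Ql @ (W[:, :k] * s[:k]), Qr @ Zt[:k].T
```

Applied blockwise at every level of the hierarchy, this keeps ranks near-minimal after each update, which is what makes constructing an explicit hierarchical representation from matrix–vector products affordable.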
“…Modern scientific workstations generally feature manycore GPU accelerators, and algorithms that do not effectively take advantage of these architectures are unlikely to be competitive for scientific and financial applications. Modern GPU architectures feature decreasing ratios of memory bandwidth to processing power, smaller amounts of fast memory per processing core, and substantial latencies for accessing data in deep memory [30]. Competitive algorithms must therefore be able to orchestrate their computations for effective execution in this environment.…”
Section: Introduction
confidence: 99%
“…• In our work we tackle multicore architectures. There exist recent research efforts to execute H-matrix operations on distributed systems [74,116]. In fact, we have already developed a distributed-memory implementation of the H-Chameleon library.…”
Section: Open Research Lines
confidence: 99%