2013
DOI: 10.1145/2427023.2427030
Elemental

Abstract: Parallelizing dense matrix computations to distributed memory architectures is a well-studied subject and generally considered to be among the best understood domains of parallel computing. Two packages, developed in the mid 1990s, still enjoy regular use: ScaLAPACK and PLAPACK. With the advent of many-core architectures, which may very well take the shape of distributed memory architectures within a single processor, these packages must be revisited since the traditional MPI-based approaches will likely need …
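To make the setting concrete, here is a minimal sketch of the style of distributed dense computation Elemental targets. It assumes the later `El::` C++ namespace; routine names and signatures vary across library versions, and the matrix size is arbitrary.

```cpp
#include <El.hpp>

int main( int argc, char* argv[] )
{
    // RAII wrapper that initializes and finalizes MPI
    El::Environment env( argc, argv );

    // A 2D process grid over all ranks; Elemental distributes matrix
    // entries cyclically over this grid, element by element
    El::Grid grid( El::mpi::COMM_WORLD );

    const El::Int n = 1000;                      // arbitrary test size
    El::DistMatrix<double> A( grid );            // default [MC,MR] distribution
    El::HermitianUniformSpectrum( A, n, 1, 2 );  // random SPD test matrix

    El::Cholesky( El::LOWER, A );                // distributed Cholesky factorization
    return 0;
}
```

The element-wise cyclic distribution, in contrast to ScaLAPACK's block-cyclic layout, is the design choice the abstract alludes to when it says the traditional approaches must be revisited.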

Cited by 149 publications (28 citation statements)
References 27 publications
“…Therefore, we calculate the compact SVD Ĵ = U_r S_r V_r^T, which requires memory of O(N_d N_m) and is relatively fast because 2N_d ≪ N_m. Using a new distributed library for dense linear algebra (Poulson et al., 2012) … [Figure 13: the 50% PSF width for the model shown in Figure 8b and the field acquisition geometry.]…”
Section: Sensitivity and Resolution Analysis
confidence: 99%
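As a hedged illustration of the memory argument in this excerpt: for a wide Jacobian with N_d ≪ N_m, the compact SVD carries at most N_d singular triplets, so the factors occupy O(N_d N_m) storage rather than O(N_m^2). The sketch below uses Elemental's distributed SVD; the sizes and the uniform-random stand-in for Ĵ are hypothetical, and the SVD overload shown may differ between library versions.

```cpp
#include <El.hpp>

int main( int argc, char* argv[] )
{
    El::Environment env( argc, argv );
    El::Grid grid( El::mpi::COMM_WORLD );

    // Hypothetical sizes: few data points (Nd), many model parameters (Nm)
    const El::Int Nd = 500, Nm = 50000;    // Nd << Nm

    El::DistMatrix<double> J( grid );
    El::Uniform( J, Nd, Nm );              // stand-in for the Jacobian J-hat

    // Compact SVD J = U diag(s) V^T: U is Nd x Nd, s has Nd entries,
    // and V is Nm x Nd, so total storage is O(Nd*Nm)
    El::DistMatrix<double> U( grid ), V( grid );
    El::DistMatrix<double,El::VR,El::STAR> s( grid );
    El::SVD( J, U, s, V );
    return 0;
}
```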
“…The Elemental library [22] was incorporated into NEMO5 to perform a highly efficient parallel diagonalisation of the Hamiltonian, and an unfolding algorithm was developed and integrated to unfold the band structure as it was calculated. The supercell tight-binding calculation generates a folded band structure, and this loss of a continuous E(k) relation complicates most forms of analysis.…”
Section: Methods
confidence: 99%
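A hedged sketch of the kind of call such a diagonalisation reduces to, using Elemental's Hermitian eigensolver. The Hamiltonian here is a random Hermitian stand-in, not NEMO5's actual matrix, and the names follow the later `El::` API.

```cpp
#include <El.hpp>

int main( int argc, char* argv[] )
{
    El::Environment env( argc, argv );
    El::Grid grid( El::mpi::COMM_WORLD );

    const El::Int n = 2000;  // hypothetical supercell Hamiltonian dimension
    El::DistMatrix<El::Complex<double>> H( grid );
    El::HermitianUniformSpectrum( H, n, -1, 1 );  // random Hermitian stand-in

    // Full spectrum: eigenvalues w and eigenvectors Q of H
    El::DistMatrix<double,El::VR,El::STAR> w( grid );
    El::DistMatrix<El::Complex<double>> Q( grid );
    El::HermitianEig( El::LOWER, H, w, Q );
    return 0;
}
```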
“…3) Software: We compared SIPs with Elemental because of its proven performance. However, there are other libraries, such as ELPA, which have also been shown to scale on thousands of cores. Obviously, using different libraries or improvements in these libraries can change the crossover point.…”
Section: Comparisons With Other Methods
confidence: 99%
“…Therefore, it is useful to determine this crossover point to identify which method should be preferred for different matrix sizes and sparsities. We compare SIPs with the generalized eigenvalue solver with dense matrix storage in the recently developed linear algebra library Elemental, [61,62] which has been shown [61,63] to have better parallel performance than ScaLAPACK. [64] Our comparison is based on the computational time to find all of the eigensolutions in an interval corresponding to 60% (DNW, BDC) to 70% (CNT) of the spectrum.…”
Section: Comparisons With Other Methods
confidence: 99%
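For reference, a sketch of the generalized dense solve this comparison refers to, H x = λ S x, via Elemental's Hermitian-definite eigensolver. The matrices are random stand-ins and the pencil and enum names come from the later `El::` API, so they are assumptions rather than the cited papers' exact code.

```cpp
#include <El.hpp>

int main( int argc, char* argv[] )
{
    El::Environment env( argc, argv );
    El::Grid grid( El::mpi::COMM_WORLD );

    const El::Int n = 1000;  // hypothetical basis size
    El::DistMatrix<double> H( grid ), S( grid );
    El::HermitianUniformSpectrum( H, n, -1, 1 );  // stand-in Hamiltonian
    El::HermitianUniformSpectrum( S, n, 1, 2 );   // stand-in SPD overlap matrix

    // Generalized Hermitian-definite pencil H x = lambda S x
    El::DistMatrix<double,El::VR,El::STAR> w( grid );
    El::DistMatrix<double> X( grid );
    El::HermitianGenDefEig( El::AXBX, El::LOWER, H, S, w, X );
    return 0;
}
```

In a dense solver like this, runtime grows roughly cubically with n regardless of sparsity, which is why a crossover point against sparse, interval-based methods such as SIPs exists at all.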