2014 IEEE 28th International Parallel and Distributed Processing Symposium 2014
DOI: 10.1109/ipdps.2014.97

A New Scalable Parallel Algorithm for Fock Matrix Construction

Cited by 24 publications (28 citation statements)
References 26 publications
“…All of these packages face similar challenges, including how to load-balance the computation, how to manage locality, and how to efficiently execute computation while providing an easy interface. They use either a tasking model for load-balancing computation with simple data layout [13,11,12], or a complex data layout with bulk synchronous communication [7]. A single programming model is used by these packages for all problems and at all run scales.…”
Section: Related Work (mentioning)
confidence: 99%
“…Second, there is a need for efficient atomic accumulate operations to update tiles of the global matrices without the explicit participation of the target process. Finally, dynamic load balancing in HF is usually controlled by either a single atomic counter [1], or many counters [6], [7], both of which require a fast one-sided fetch-and-add implementation.…”
Section: B. PGAS in Quantum Chemistry (mentioning)
confidence: 99%
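
The counter-based scheme described in the statement above can be illustrated with a minimal sketch. It assumes a single shared counter and uses Python threads with a lock as a stand-in for a one-sided fetch-and-add on a distributed machine; the names `AtomicCounter`, `fetch_and_add`, and `run_dynamic` are hypothetical and do not come from the cited papers or from any PGAS library.

```python
import threading


class AtomicCounter:
    """Lock-protected counter standing in for a one-sided fetch-and-add (assumption)."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def fetch_and_add(self, increment=1):
        # Return the old value and advance the counter as one atomic step.
        with self._lock:
            old = self._value
            self._value += increment
            return old


def run_dynamic(num_tasks, num_workers, work):
    """Each worker repeatedly claims the next unclaimed task id from the counter."""
    counter = AtomicCounter()
    claimed = [[] for _ in range(num_workers)]

    def worker(rank):
        while True:
            task_id = counter.fetch_and_add()
            if task_id >= num_tasks:
                break  # no tasks left to claim
            work(task_id)
            claimed[rank].append(task_id)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed


# Hypothetical usage: 20 placeholder tasks shared among 4 workers.
claimed = run_dynamic(num_tasks=20, num_workers=4, work=lambda task_id: None)
```

Every task id is claimed exactly once, but which worker claims which id depends on timing. That is the point of the scheme: faster workers simply come back to the counter more often.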
“…Although individual two-electron integrals do not possess drastic differences in execution time, bundles of shell quartets can vary greatly in cost. It is necessary to designate shell quartets as task units in HF codes because it enables the reuse of intermediate quantities shared by basis functions within a quartet [13], [6]. The goal is to assign these task units to processors with minimal overhead and a schedule that reduces the makespan.…”
Section: E. Load Balancing in Quantum Chemistry (mentioning)
confidence: 99%
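
To make the makespan objective in the statement above concrete, here is a minimal sketch of a static greedy heuristic (longest processing time first). This is a generic textbook heuristic chosen for illustration, not the scheduling method of the cited papers; the function name `lpt_schedule` and the task costs are hypothetical.

```python
import heapq


def lpt_schedule(costs, num_procs):
    """Assign tasks, most expensive first, to the currently least loaded processor."""
    # Min-heap of (current load, processor id).
    heap = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_procs)]

    # Visit task indices in order of decreasing estimated cost.
    for task in sorted(range(len(costs)), key=lambda t: costs[t], reverse=True):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(task)
        heapq.heappush(heap, (load + costs[task], proc))

    # The makespan is the load of the busiest processor.
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical cost estimates for five task units on two processors.
costs = [8, 7, 6, 5, 4]
assignment, makespan = lpt_schedule(costs, num_procs=2)
```

A static schedule like this relies on cost estimates being available up front. The counter-based dynamic schemes mentioned in the citation statements avoid that requirement at the price of extra communication.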
“…Their focus was on large heterogeneous clusters [26]. Foster et al. presented scalable algorithms for distributing and constructing the Fock matrix in SCF problems on several massively parallel processing platforms [7].…”
Section: Related Work (mentioning)
confidence: 99%