Numerical Mathematics and Advanced Applications 2011 (2012)
DOI: 10.1007/978-3-642-33134-3_68

A Fast GPU-Accelerated Mixed-Precision Strategy for Fully Nonlinear Water Wave Computations

Abstract: We present performance results of a mixed-precision strategy designed to improve a recently developed, massively parallel GPU-accelerated tool for fast and scalable simulation of unsteady, fully nonlinear free-surface water waves over uneven depths (Engsig-Karup et al. 2011). The underlying wave model is based on a potential flow formulation, which requires efficient solution of a Laplace problem at large scales. We report recent results on a new mixed-precision strategy for efficient iterative high-order accur…
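The abstract is truncated before it describes the solver, so the following is only a generic sketch of the mixed-precision idea, not the authors' GPU implementation. It assumes a small 1D Laplace-type model problem (stencil [-1, 2, -1]) as a stand-in for the paper's Laplace problem, emulates single precision by rounding through `struct`, and uses hypothetical helper names (`f32`, `residual`, `solve_single`, `mixed_precision_solve`). The residual is accumulated in double precision while each correction is computed in single precision.

```python
import struct

def f32(x):
    """Round a double to the nearest IEEE single-precision value (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def residual(b, x):
    """Double-precision residual r = b - A x for the 1D stencil [-1, 2, -1]."""
    n = len(x)
    r = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r.append(b[i] - (2.0 * x[i] - left - right))
    return r

def solve_single(r):
    """Tridiagonal (Thomas) solve of A d = r with every intermediate rounded to single."""
    n = len(r)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = f32(-1.0 / 2.0)
    d[0] = f32(f32(r[0]) / 2.0)
    for i in range(1, n):
        denom = f32(2.0 + c[i - 1])
        c[i] = f32(-1.0 / denom)
        d[i] = f32(f32(f32(r[i]) + d[i - 1]) / denom)
    # Back substitution, still in emulated single precision.
    for i in range(n - 2, -1, -1):
        d[i] = f32(d[i] - f32(c[i] * d[i + 1]))
    return d

def mixed_precision_solve(b, tol=1e-10, max_iter=20):
    """Defect-correction loop: double-precision residual, single-precision correction."""
    x = [0.0] * len(b)
    scale = max(abs(v) for v in b) or 1.0
    history = []  # max-norm of the residual at each iteration
    for _ in range(max_iter):
        r = residual(b, x)
        history.append(max(abs(v) for v in r))
        if history[-1] <= tol * scale:
            break
        delta = solve_single(r)
        x = [xi + di for xi, di in zip(x, delta)]  # update kept in double
    return x, history

if __name__ == "__main__":
    x, history = mixed_precision_solve([1.0] * 50)
    print(len(history), history[-1])
```

In this sketch the low-precision solve alone cannot reach the tolerance, but repeating it on the double-precision residual recovers accuracy beyond single precision. On a GPU the same pattern would be motivated by the lower memory traffic and higher throughput of single-precision arithmetic.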

Cited by 9 publications (20 citation statements)
References 7 publications
“…Another reason to look for alternatives to GMRES is that we target platforms based on modern and emerging architectures and parallel computations on large distributed systems in ongoing work [10,26]. For high performance, this scope warrants minimization of communication patterns [43], for example, by using algorithms with minimal number of global inner products that require global communication [44,45] to secure good scalability on most general-purpose commodity architectures.…”
Section: On Properties Of Existing Iterative Strategies For High-orde…
Citation type: mentioning
confidence: 99%
“…It is notable that the memory reduction (Section 3.8) achieved by utilizing the PDC method comes at the price of reduced convergence rate compared with the time-constant LU left-preconditioned GMRES method for more strict tolerance levels. Further significant speedup can be achieved in parallel [10,26]. However, the PDC method is found to be increasingly less efficient compared with the GMRES method for the most strict tolerance levels.…”
Section: The Submerged Bar Test (2d)
Citation type: mentioning
confidence: 99%