2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018
DOI: 10.1109/ipdps.2018.00070

Rethinking large-scale Economic Modeling for Efficiency: Optimizations for GPU and Xeon Phi Clusters

Abstract: We propose a massively parallelized and optimized framework to solve high-dimensional dynamic stochastic economic models on modern GPU- and MIC-based clusters. First, we introduce a novel approach for adaptive sparse grid index compression alongside a surplus matrix reordering, which significantly reduces the global memory throughput of the compute kernels and maps randomly accessed data onto cache or fast shared memory. Second, we fully vectorize the compute kernels for AVX, AVX2 and AVX512 CPUs, respectively.…

Cited by 14 publications (8 citation statements)
References 21 publications
“…For the sake of brevity, we assume that the function value is zero on the domain's boundary. This is not a necessary condition, and it can be easily changed by augmenting the basis function (see, e.g., [47]). First, consider a one-dimensional domain, discretized with grid spacing $h_l = 2^{-l}$.…”
Section: 2
mentioning
confidence: 99%
“…When solving the system of equations at a given grid point, one needs to frequently interpolate from the function computed in the previous iteration step. These interpolation operations can account for up to 99% of the overall compute time for solving the system of equations [47]. These two impediments make it difficult to achieve an acceptable time-to-solution which can quickly reach the order of days on modern supercomputing facilities.…”
mentioning
confidence: 99%
“…example, [9]) to solve the nonlinear set of equations, and at the same time they must run as fast as possible to guarantee quick decision-making. Although a widely used dense-matrix data format exists (see, e.g., [9]), for which highly optimized algorithms for interpolation-related tasks are available, a new, general data compression method for adaptive sparse grids (ASGs) [20] needs to be introduced. This method will make it possible to work with 16 continuous dimensions, that is, with 59-dimensional ASGs simultaneously.…”
Section: Increasing
unclassified
“…This methodology 1 In physics and engineering, Gaussian processes regression (see, e.g., Williams and Rasmussen, 2006; Tripathy, Bilionis, and Gonzalez, 2016; Bilionis and Zabaras, 2012a; Bilionis, Zabaras, Konomi, and Lin, 2013; Chen, Zabaras, and Bilionis, 2015), radial basis functions (Park and Sandberg, 1991), or relevance vector machines (Bilionis and Zabaras, 2012b) are often used to build surrogate models. More recently, following the rapid developments in the theory of stochastic optimization and artificial intelligence as well as the advances in computer hardware leading to the widespread availability of graphic processing units (GPUs; see, e.g., Scheidegger, Mikushin, Kubler, and Schenk (2018); Aldrich, Fernández-Villaverde, Gallant, and Rubio-Ramírez (2011), and references therein), researchers have turned their attention towards deep neural networks (see, e.g., Tripathy and Bilionis, 2018a; Liu, Borovykh, Grzelak, and Oosterlee, 2019a).…”
Section: Introduction
mentioning
confidence: 99%