2015
DOI: 10.1155/2015/316012

A Performance Study of a Dual Xeon-Phi Cluster for the Forward Modelling of Gravitational Fields

Abstract: With at least 60 processing cores, the Xeon-Phi coprocessor is a truly multicore architecture, featuring an interconnection speed among cores of 240 GB/s, two levels of cache memory, a theoretical peak performance of 1.01 Tflops, and programming flexibility, all of which make the Xeon-Phi an excellent coprocessor for parallelizing applications that seek to reduce computational times. The objective of this work is to migrate a geophysical application designed to directly calculate the gravimetric tensor components …

Cited by 4 publications (6 citation statements)
References 18 publications
“…With respect to the computing architecture, it is well known that a deep acquaintance with the architecture leads to a very efficient parallel solution to the problem, which in turn becomes complicated because it requires low-level programming skills and a long development (coding) time. This is the case when using CUDA C for Graphics Processing Units (GPUs) or the Message Passing Interface (MPI) for distributed-memory architectures such as clusters or Xeon Phi coprocessors [48,52,53]. As one of the goals of this paper is to make this research accessible to the greatest possible part of the geophysical and related communities, the acceleration of the algorithms presented herein is based on shared-memory CPU-based architectures.…”
Section: Parallel Implementation of Forward Modelling
confidence: 99%
“…Thus, a deficient design of the parallel strategy is prone to yield poor performance. OpenMP has benefited several geophysical problems [48,52]. OpenMP also provides an implicit parallelism model that produces medium-granularity tasks for MT applications, offering a higher level of computational abstraction than MPI.…”
Section: Parallel Implementation of Forward Modelling
confidence: 99%
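The excerpt above contrasts OpenMP's implicit, loop-level parallelism with explicit message passing. The following sketch is not taken from the cited papers; it only illustrates, under assumed names (forward_model, point_mass_gz, obs, src, gz), how a single OpenMP pragma can distribute a forward-modelling loop over independent observation points among CPU threads.

```c
/* Minimal sketch, assuming independent observation points and a simplified
 * point-mass kernel standing in for a prism kernel; all names are hypothetical. */
#include <math.h>
#include <omp.h>

#define G 6.674e-11  /* gravitational constant, m^3 kg^-1 s^-2 */

/* Vertical attraction of a point mass at src (x, y, z, mass) on an observer. */
static double point_mass_gz(const double obs[3], const double src[4])
{
    double dx = src[0] - obs[0], dy = src[1] - obs[1], dz = src[2] - obs[2];
    double r  = sqrt(dx * dx + dy * dy + dz * dz);
    return G * src[3] * dz / (r * r * r);
}

void forward_model(int n_obs, int n_src,
                   const double obs[][3], const double src[][4], double gz[])
{
    /* Observation points are independent, so one parallel-for yields
     * medium-grained tasks with no explicit decomposition or messaging. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_obs; i++) {
        double sum = 0.0;
        for (int j = 0; j < n_src; j++)
            sum += point_mass_gz(obs[i], src[j]);
        gz[i] = sum;
    }
}
```

Compiled with an OpenMP flag (e.g. -fopenmp), the same source runs serially or in parallel, which is the higher level of abstraction the excerpt refers to.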
“…An advantage of using the presented CPML ABC is a drastic reduction in the number of memory arrays in the two-dimensional algorithm [4], so it can easily be implemented for GPU processing, as those cards possess a limited amount of memory. The CPML ABC for the FDTD method requires storing the values of the time derivatives in memory variables, which is implemented in two ways in this paper: first, by allocating the memory variables over the whole domain, and second, by allocating them only in the absorption region (see Figure 2) [19,20].…”
Section: International Journal of Antennas and Propagation
confidence: 99%
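As a rough illustration of the memory-footprint argument in the excerpt above, the sketch below (hypothetical names alloc_cpml_full and alloc_cpml_strip, not the authors' code) contrasts allocating a CPML memory variable over the whole 2-D grid with allocating it only inside an absorbing strip of thickness npml cells.

```c
/* A minimal sketch, assuming a 2-D nx-by-ny grid and an absorbing layer of
 * npml cells along one boundary; names and layout are hypothetical. */
#include <stdlib.h>

/* Strategy 1: memory variable stored at every grid cell (simplest indexing,
 * largest footprint -- a concern on memory-limited GPUs). */
double *alloc_cpml_full(int nx, int ny)
{
    return calloc((size_t)nx * ny, sizeof(double));
}

/* Strategy 2: memory variable stored only inside one absorbing strip of
 * thickness npml, shrinking that array from nx*ny to npml*ny values. */
double *alloc_cpml_strip(int ny, int npml)
{
    return calloc((size_t)npml * ny, sizeof(double));
}
```

In a full implementation one such strip would be kept per absorbing boundary, and the FDTD update would index into it only for cells lying inside the absorption region.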
“…To divide tasks in a balanced way, we follow the procedure described by Couder-Castañeda et al. [17] and Arroyo et al. [18]. Let p_n be the number of MPI processes and C_n the number of problems to solve; then we define the problem numbers with which a process p must start and finish as p_s and p_e, respectively.…”
Section: MPI Distributed Implementation
confidence: 99%
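The excerpt describes each MPI process deriving its own start and end problem indices p_s and p_e from the number of processes p_n and the number of problems C_n. The sketch below shows one common balanced block distribution (the exact formula used in [17,18] may differ); the value of C_n is arbitrary and chosen only for illustration.

```c
/* Sketch of a balanced block distribution of C_n problems over p_n MPI ranks;
 * the first (C_n % p_n) ranks take one extra problem, so loads differ by at
 * most one. Not the formula from the cited works, only a common variant. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, p_n;
    const int C_n = 10;                          /* hypothetical problem count */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p_n);

    int base = C_n / p_n, rem = C_n % p_n;
    int p_s  = rank * base + (rank < rem ? rank : rem);   /* first problem */
    int p_e  = p_s + base + (rank < rem ? 1 : 0) - 1;     /* last problem  */

    printf("rank %d solves problems %d..%d\n", rank, p_s, p_e);

    MPI_Finalize();
    return 0;
}
```

For example, with C_n = 10 and p_n = 4, ranks 0 and 1 receive problems 0-2 and 3-5 (three each), while ranks 2 and 3 receive 6-7 and 8-9 (two each).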