2009 International Conference on Parallel Processing 2009
DOI: 10.1109/icpp.2009.44

Mapping the FDTD Application to Many-Core Chip Architectures

Abstract: This paper reports a study of mapping the Finite Difference Time Domain (FDTD) application to the IBM Cyclops-64 (C64) many-core chip architecture [1]. C64 is chosen for this study as it represents the current trend in computer architecture to develop a class of many-core architectures with distinct features, e.g., a software-manageable on-chip memory hierarchy (vs. a hardware-managed data cache), high on-chip bandwidth, and fine-grain multithreading and synchronization, among others. Major results of our stud…

Cited by: 43 publications (28 citation statements)
References: 20 publications
“…In addition, other transformations such as tiling of stencil computations for multicore architectures have been addressed in [43], [25], [21], [34]. Recently, memory customization for stencils has been proposed in [36].…”
Section: Related Work (mentioning)
confidence: 99%
“…FDTD 2D: This kernel is the core computation in the widely used Finite Difference Time Domain method in Computational Electromagnetics [34]. Rician Denoise 2D: This application performs noise removal from MRI images and involves an iterative loop that performs a sequence of stencil operations.…”
Section: Jacobi 1/2/3d (mentioning)
confidence: 99%
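For readers unfamiliar with the kernel this statement refers to, a 2D FDTD time step can be sketched as three stencil sweeps over the field grids. This is a minimal illustration, not the code of the cited paper or of the citing work: the function name `fdtd_2d_step`, the field names `ex`, `ey`, `hz`, the boundary excitation `source`, and the update coefficients 0.5 and 0.7 are all assumptions chosen for illustration.

```python
def fdtd_2d_step(ex, ey, hz, source):
    """Advance a 2D FDTD grid by one time step, in place.

    ex, ey, hz are nx-by-ny lists of lists of floats (hypothetical layout).
    source is the value imposed on the first row of ey at this step.
    The coefficients 0.5 and 0.7 are illustrative, not physical constants.
    """
    nx = len(hz)
    ny = len(hz[0])

    # Boundary excitation: drive the first row of ey with the source value.
    for j in range(ny):
        ey[0][j] = source

    # Update ey from the difference of hz along the i direction.
    for i in range(1, nx):
        for j in range(ny):
            ey[i][j] -= 0.5 * (hz[i][j] - hz[i - 1][j])

    # Update ex from the difference of hz along the j direction.
    for i in range(nx):
        for j in range(1, ny):
            ex[i][j] -= 0.5 * (hz[i][j] - hz[i][j - 1])

    # Update hz from the neighbouring ex and ey values (interior points only).
    for i in range(nx - 1):
        for j in range(ny - 1):
            hz[i][j] -= 0.7 * (ex[i][j + 1] - ex[i][j]
                               + ey[i + 1][j] - ey[i][j])
```

Each sweep reads only nearest neighbours of the point it writes, which is why the kernel is a common target for the tiling and locality transformations that the citing papers discuss.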
“…Cyclops-64 has been described extensively in previous publications [10,16,5]. Cyclops-64 was chosen for our experiments because its large number of execution units allow excellent studies in scalability and parallelism for HPC programs.…”
Section: Many-core Architecture Used (mentioning)
confidence: 99%
“…In cache aware time skewing schemes, flat parallelization strategies are applied [11,12,18]. The cache sizes are known, so it is clear when it is better to parallelize the execution of the sub-tiles, forcing them into different caches, and when to leave them in the same cache for better data locality and process them sequentially with a single thread.…”
Section: Parallelism and Locality (mentioning)
confidence: 99%