2008
DOI: 10.1002/cpe.1335

Evaluating application mapping scenarios on the Cell/B.E.

Abstract: Applications running on multicore platforms are difficult to program, and even more difficult to optimize, mainly due to (1) the several layers where the optimizations occur and (2) the multitude of available resources to be exploited in parallel. Although low-level optimizations only target code running on individual cores, high-level optimizations (e.g. data- and task-parallelism) target the overall application performance. In this paper, we focus on the latter, by evaluating possible mapping scenarios…

Cited by 2 publications (4 citation statements)
References 13 publications

“…In this case, by executing the algorithm on the QS20 which allows to activate 16 SPE cores, it is achieved a speed‐up of almost 10 and 7 related to the execution of the algorithm on a single SPE core and a Pentium 4 processor. Varbanescu et al [11] performed an evaluation of different strategies for optimisation of applications for Cell BE architecture. Varbanescu et al [11] concluded that the more effective optimisation strategy is obtained by combining data‐ and task‐parallelism, in this case being obtained a speed‐up around 20 for a multimedia analysis application.…”
Section: Related Work (mentioning)
confidence: 99%
“…Varbanescu et al [11] performed an evaluation of different strategies for optimisation of applications for Cell BE architecture. Varbanescu et al [11] concluded that the more effective optimisation strategy is obtained by combining data‐ and task‐parallelism, in this case being obtained a speed‐up around 20 for a multimedia analysis application. An implementation optimised for the Cell BE architecture of grey‐level co‐occurrence matrices and Haralick texture features is presented in [12].…”
Section: Related Work (mentioning)
confidence: 99%
“…The main drawback of this method is the limited size of the LS (only 256 kbytes). However, in [20], the authors demonstrate that this approach is more efficient than creating as many threads as tasks in several orders of magnitude, and they implement an overlay-based technique to solve the LS limitation.…”
Section: Thread Creation (mentioning)
confidence: 99%
“…To optimize their applications, multicore programmers should design several data layouts of their applications, and choose the best mapping according to their target architectures [20]. In case of the Cell BE, this situation becomes much harder than conventional multicores due to the explicit control of DMA transfers and its heterogeneity.…”
Section: DMA Operations (mentioning)
confidence: 99%