2012
DOI: 10.1117/1.jei.21.2.021116

Combining high productivity and high performance in image processing using Single Assignment C on multi-core CPUs and many-core GPUs

Abstract: In this paper the challenge of parallelization development of industrial high performance inspection systems is addressed concerning a conventional parallelization approach versus an auto-parallelized technique. Therefore, we introduce the functional array processing language Single Assignment C (SaC), which relies on a hardware virtualization concept for automated, parallel machine code generation for multicore CPUs and GPUs. Additionally, software engineering aspects like programmability, productivity, underst…

Cited by 5 publications (3 citation statements)
References 13 publications
“…We conclude that, for the running example of all-pairs N-body simulation, SAC and its tool chain do exhibit the desired combination of high software engineering productivity and high execution performance. These findings are in line with several previous application studies [26][27][28].…”
Section: Discussion
Citation type: supporting
confidence: 94%
“…These findings are in line with several previous application studies [26][27][28].…”
Section: Discussion
Citation type: supporting
confidence: 93%
“…One data locality approach in data parallel language compilers is to start from an imperative language with loops, and fuse the successive loops over the input image into an expression tree in a single loop, to improve cache locality and on-chip register locality, e.g. [6,12]. For CPU or GPU scheduling, this expression tree can be duplicated to apply the same fused computation on image chunks in a data parallel fashion.…”
Section: Eliminating Intermediate Buffers With Compiler Optimisation
Citation type: mentioning
confidence: 99%
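
The loop fusion described in the last citation statement can be illustrated with a minimal Python sketch. It is not taken from the cited paper or from SaC: the pixel operation (scale by a gain, add an offset, clamp at 255) and all function names are hypothetical, chosen only to show two successive loops over an image being fused into one per-pixel expression and then applied to image chunks. Python threads are used here only to show the chunk decomposition; they are not expected to give a real speedup for this workload.

```python
from concurrent.futures import ThreadPoolExecutor


def unfused(image, gain, offset):
    """Two successive loops; the first one writes a full intermediate buffer."""
    scaled = [p * gain for p in image]             # loop 1: intermediate buffer
    return [min(255, s + offset) for s in scaled]  # loop 2: reads the buffer back


def fused_pixel(p, gain, offset):
    """The fused expression tree for one pixel: min(255, p * gain + offset)."""
    return min(255, p * gain + offset)


def fused(image, gain, offset):
    """A single loop over the image with no intermediate buffer."""
    return [fused_pixel(p, gain, offset) for p in image]


def fused_chunked(image, gain, offset, chunks=4):
    """Apply the same fused loop to contiguous image chunks in parallel."""
    size = max(1, -(-len(image) // chunks))  # ceiling division, at least 1
    parts = [image[i:i + size] for i in range(0, len(image), size)]
    with ThreadPoolExecutor() as pool:
        # pool.map keeps the chunk order, so the result can be concatenated.
        results = list(pool.map(lambda part: fused(part, gain, offset), parts))
    return [p for part in results for p in part]
```

All three functions return the same pixels for the same input; the fused variants simply avoid materialising the `scaled` buffer, which is the locality benefit the statement refers to.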