2012 International Conference on Embedded Computer Systems (SAMOS) 2012
DOI: 10.1109/samos.2012.6404154
Using OpenMP superscalar for parallelization of embedded and consumer applications

Abstract: In the past years, research and industry have introduced several parallel programming models to simplify the development of parallel applications. A popular class among these models is task-based programming models, which proclaim ease-of-use, portability, and high performance. A novel model in this class, OpenMP Superscalar, combines advanced features such as automated runtime dependency resolution, while maintaining simple pragma-based programming for C/C++. OpenMP Superscalar has proven to be effect…

Cited by 12 publications (13 citation statements)
References 10 publications
“…Our current implementation has rules capturing a variety of flavors of both Map and Reduce patterns, as reported in Table 1. Table 1 presents our results for multithreaded code from the Starbench suite [1], omitting two programs, bodytrack and h264dec, which follow a pattern (pipeline) that is out of the scope of our current analysis. We found all but one of the expected patterns, known by manual inspection and from the literature, and understand the reason for missing one map in kmeans, which will inform subsequent versions.…”
Section: Parallel Pattern Finding
confidence: 99%
“…However, our selective task replication heuristics are applicable for other task-parallel dataflow platforms. Nevertheless, the performance of OmpSs+Nanos is on par with the highly optimized commercial and open source implementation of OpenMP [5], [6] and it has successfully served as a pilot platform to push dataflow task parallelism to OpenMP 4.0. In the case of the distributed OmpSs+MPI model, it combines dataflow execution with the message passing model providing significant performance benefits.…”
Section: Task Replication
confidence: 99%
“…Listing 1 shows a simplified example of OmpSs programming, extracted from H.264 macroblock wavefront decoding [10]. The function decode() is called inside a nested loop, processing the elements of matrix X.…”
Section: Programming Model
confidence: 99%