2008
DOI: 10.1145/1353534.1346317

Feedback-driven threading

Abstract: Extracting high performance from the emerging Chip Multiprocessors (CMPs) requires that the application be divided into multiple threads. Each thread executes on a separate core, thereby increasing concurrency and improving performance. As the number of cores on a CMP continues to increase, the performance of some multi-threaded applications will benefit from the increased number of threads, whereas the performance of other multi-threaded applications will become limited by data-synchronization and off-chip ba…

Citations: cited by 41 publications (9 citation statements)
References: 21 publications
“…The single-core ECM model predicts lower and upper limits for the bandwidth pressure on all memory hierarchy levels. When the bandwidth capacity of one level is exhausted, performance starts to saturate [4]. On the Intel Sandy Bridge processor, where the only shared bandwidth resource is the main memory interface, this happens when the bandwidth pressure exceeds the practical limit as measured, e.g., by a suitable multi-threaded STREAM-like benchmark.…”
Section: The ECM Model: Multicore Scaling
mentioning
confidence: 99%
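
The saturation behavior described in this statement can be sketched as a simple capped-scaling calculation. This is only an illustration under stated assumptions, not the ECM model itself: it assumes performance scales linearly with core count until the combined bandwidth demand reaches a measured limit. The function names and parameters (`single_core_perf`, `single_core_bw`, `max_bw`) are hypothetical.

```python
import math


def saturated_performance(n_cores, single_core_perf, single_core_bw, max_bw):
    """Estimate performance on n_cores, assuming linear scaling that is
    capped once the shared bandwidth limit max_bw is reached.

    All parameters are hypothetical inputs: single_core_bw is the bandwidth
    one core draws, max_bw the practical limit of the shared resource
    (e.g. as measured by a STREAM-like benchmark).
    """
    # Performance at which the shared bandwidth would be fully used.
    perf_limit = single_core_perf * max_bw / single_core_bw
    return min(n_cores * single_core_perf, perf_limit)


def saturation_point(single_core_bw, max_bw):
    """Smallest core count whose combined bandwidth demand reaches max_bw."""
    return math.ceil(max_bw / single_core_bw)
```

For instance, if one core draws 10 units of bandwidth and the shared limit is 40, this sketch places saturation at four cores, and adding more cores beyond that yields no further gain.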
“…Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice-Boltzmann flow solver code. bandwidth limitation, which was previously addressed in a more phenomenological way [4]. Performance modeling is certainly not limited to the single node, and there are many examples of successful large-scale modeling efforts [5,6,7,8].…”
mentioning
confidence: 99%
“…This lack of scalability is related to different software and hardware issues, such as data-synchronization, concurrent shared-memory accesses, off-chip bus saturation, and functional unit saturation. [2][3][4][5] Accordingly, different dynamic concurrency throttling (DCT) strategies have been proposed to decrease the number of executing threads of a parallel application according to its available scalability, making better use of hardware resources and reducing costs. 3,4,[6][7][8][9][10] Therefore, when DCT is applied to a given application with limited scalability, the number of active threads will likely be less than the number of available cores in the system.…”
Section: Introduction
mentioning
confidence: 99%
“…[2][3][4][5] Accordingly, different dynamic concurrency throttling (DCT) strategies have been proposed to decrease the number of executing threads of a parallel application according to its available scalability, making better use of hardware resources and reducing costs. 3,4,[6][7][8][9][10] Therefore, when DCT is applied to a given application with limited scalability, the number of active threads will likely be less than the number of available cores in the system. In this scenario, the system will be underutilized, that is, cores and cache memories not being used by the application will be idle.…”
Section: Introduction
mentioning
confidence: 99%
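
The idea of dynamic concurrency throttling mentioned in these statements can be illustrated with a generic search over thread counts. This is a minimal sketch, not the method of the indexed paper or of any cited DCT strategy. It assumes a caller-supplied, hypothetical `measure_time(n)` function that returns the run time of a representative piece of work with `n` threads, and it stops adding threads once the improvement falls below a hypothetical `tolerance`.

```python
def choose_thread_count(measure_time, max_threads, tolerance=0.05):
    """Pick a thread count by raising it until run time stops improving.

    measure_time: hypothetical callable, n -> measured run time with n threads.
    max_threads:  upper bound, typically the number of available cores.
    tolerance:    minimum relative improvement required to keep adding threads.
    """
    best_n = 1
    best_time = measure_time(1)
    for n in range(2, max_threads + 1):
        t = measure_time(n)
        if t < best_time * (1.0 - tolerance):
            # Adding a thread still helps noticeably; keep going.
            best_n, best_time = n, t
        else:
            # Scaling has flattened (or reversed); stop at the previous count.
            break
    return best_n
```

When the chosen count is below `max_threads`, the remaining cores are left idle, which is the underutilization scenario the statement above describes.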