2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA) 2014
DOI: 10.1109/hpca.2014.6835937

Improving GPGPU resource utilization through alternative thread block scheduling

Cited by 153 publications
(65 citation statements)
References 18 publications
“…Eventually, the threads' working sets overflow the L2 cache, degrading performance. This effect has been shown in simulation, but simple analytic models do not take it into account [29], [34]. Numerous other non-obvious scaling results exist, but we do not detail them due to space constraints.…”
Section: High-level GPGPU Model (mentioning)
confidence: 93%
“…Prior research work [5,8], however, has shown that executing the maximum possible number of thread blocks on a streaming multiprocessor is not always the optimal choice from the performance perspective due to inefficient utilization of streaming multiprocessor resources. Indeed, when the thread block scheduler always assigns the maximum thread blocks to a streaming multiprocessor, it might cause a higher number of memory and interconnection network stalls.…”
Section: Proposed Warp Scheduler (mentioning)
confidence: 99%
“…Previous works [5,8] pointed out the drawback of the existing thread block schedulers that maximizing the number of thread blocks assigned to a streaming multiprocessor is not always effective -i.e. increasing the number of thread blocks does not necessarily improve performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…Besides replacement policy, thread scheduling is considered as another critical approach for improving the cache usage efficiency [13,14,15,16]. Through the rearrangement of thread execution order, the re-reference interval of the data of each thread can be dynamically reconfigured, and the cache quota can also adapt to the needs of the running threads.…”
Section: Introduction (mentioning)
confidence: 99%