Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2014
DOI: 10.1145/2555243.2555273

Lock contention aware thread migrations

Abstract: On a cache-coherent multicore multiprocessor system, the performance of a multithreaded application with high lock contention is very sensitive to the distribution of application threads across multiple processors. This is because the distribution of threads impacts the frequency of lock transfers between processors, which in turn impacts the frequency of last-level cache (LLC) misses that lie on the critical path of execution. Inappropriate distribution of threads across processors increases LLC misses in the…

Cited by 3 publications (1 citation statement)
References 2 publications
“…The communication is a common performance bottleneck for fine-grained parallel applications (Yoo et al, 2013b). The authors in (Pusukuri et al, 2014) discuss the techniques used for improving the network performance by reducing lock contention and overlapping communications. In (Jagtap et al, 2012) are analyzed the performance of the ROSS simulation framework on different platforms and the multi-threaded implementation is compared with the MPI-based version.…”
Section: Literature Review (mentioning)
confidence: 99%