2013
DOI: 10.1145/2400682.2400686

Fast asymmetric thread synchronization

Abstract: For most multi-threaded applications, data structures must be shared between threads. Ensuring thread safety on these data structures incurs overhead in the form of locking and other synchronization mechanisms. Where data is shared among multiple threads, these costs are unavoidable. However, a common access pattern is that data is accessed primarily by one dominant thread, and only very rarely by the other, non-dominant threads. Previous research has proposed biased locks, which are optimized for a single dominant…
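
As a rough illustration of the access pattern the abstract describes, the sketch below shows one well-known way to build an asymmetric (biased) lock: the dominant thread takes a fast path with no atomic read-modify-write, while the rare non-dominant threads go through an ordinary mutex and a Dekker-style flag handshake. The class and member names are invented for this example; it is not the paper's algorithm.

#include <atomic>
#include <mutex>

// Illustrative asymmetric lock (not the paper's algorithm): the dominant
// thread pays only a store, a full fence and a load on its fast path, while
// the rarely-running non-dominant threads serialize through a mutex and a
// Dekker-style flag handshake.
class AsymmetricLock {
    std::atomic<bool> dominant_wants_{false};  // set while the dominant thread wants/holds the lock
    std::atomic<bool> others_hold_{false};     // set while a non-dominant thread holds the lock
    std::mutex slow_mutex_;                    // serializes the non-dominant threads

public:
    // Fast path: called only by the single dominant thread.
    void lock_dominant() {
        for (;;) {
            dominant_wants_.store(true, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);  // order the store before the load
            if (!others_hold_.load(std::memory_order_acquire))
                return;                                           // uncontended: no atomic RMW executed
            // Rare contention: back off so the non-dominant holder can finish, then retry.
            dominant_wants_.store(false, std::memory_order_relaxed);
            while (others_hold_.load(std::memory_order_acquire)) { /* spin */ }
        }
    }
    void unlock_dominant() {
        dominant_wants_.store(false, std::memory_order_release);
    }

    // Slow path: called by any non-dominant thread.
    void lock_nondominant() {
        slow_mutex_.lock();
        others_hold_.store(true, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);      // order the store before the load
        while (dominant_wants_.load(std::memory_order_acquire)) { /* wait for the dominant thread to back off */ }
    }
    void unlock_nondominant() {
        others_hold_.store(false, std::memory_order_release);
        slow_mutex_.unlock();
    }
};

The point of the asymmetry is that the common case, the dominant thread, avoids the atomic read-modify-write a conventional lock would need; the cost is shifted onto the rare slow path, which is exactly the trade-off biased locks make.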

Cited by 8 publications (7 citation statements)
References 20 publications
“…An alternative is to pass a token (usually an integer) the server can use to decide what to execute [3]. This avoids function pointers and thus enables the compiler to optimize away the function call for every request, but this did not show performance benefits in our experiments because the other synchronization overheads on the tested processors dominate the overhead of a function call.…”
Section: Methods
confidence: 99%
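
The token-passing variant described in the statement above might look roughly like the sketch below: clients publish a small integer opcode into a per-thread mailbox and a dedicated server thread dispatches on it with a switch, instead of calling through a client-supplied function pointer. All names, the mailbox layout, and the example stack operations are invented for illustration and are not taken from the cited papers.

#include <atomic>

// Illustrative opcode-based delegation (names and layout invented):
// clients post an integer token describing the critical section to run,
// and a single server thread dispatches on it, so no indirect call is made.
enum Opcode : int { OP_NONE = 0, OP_PUSH, OP_POP, OP_STOP };

struct Request {
    std::atomic<int> opcode{OP_NONE};  // written by the client, reset by the server when done
    int arg = 0;                       // request argument
    int result = 0;                    // filled in by the server
};

constexpr int kClients = 4;
Request mailbox[kClients];             // one slot per client thread

// Shared data structure, touched only by the server thread.
int stack_data[1024];
int stack_top = 0;

void server_loop() {
    for (;;) {
        for (auto& r : mailbox) {
            int op = r.opcode.load(std::memory_order_acquire);
            if (op == OP_NONE) continue;
            if (op == OP_STOP) return;
            switch (op) {  // dispatch on the token: each CS body can be inlined here
                case OP_PUSH: if (stack_top < 1024) stack_data[stack_top++] = r.arg; break;
                case OP_POP:  r.result = (stack_top > 0) ? stack_data[--stack_top] : -1; break;
            }
            r.opcode.store(OP_NONE, std::memory_order_release);  // completion signal
        }
    }
}

// Client side: publish the request, then spin until the server has serviced it.
int call(int tid, int op, int arg = 0) {
    mailbox[tid].arg = arg;
    mailbox[tid].opcode.store(op, std::memory_order_release);
    while (mailbox[tid].opcode.load(std::memory_order_acquire) != OP_NONE) { /* spin */ }
    return mailbox[tid].result;
}

With a function-pointer interface the server would make an indirect call per request; the switch lets the compiler inline each critical-section body, which is the optimization the statement refers to, even though its measured benefit was small in that work.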
“…Remote Core Locking (RCL) [10] is an efficient implementation of the server approach over shared memory. Cleary et al. [3] take a similar approach, but apply it to asymmetric synchronization, where one thread executes the CS much more often than the others. Suleman et al. [19] propose delegation over dedicated hardware and evaluate how much chip real estate should be used for the server core.…”
Section: Related Work
confidence: 99%
“…We assume that shared data is accessed only inside CSes, which holds for the concurrent objects we evaluate. A more conservative use of memory fences would be necessary when this is not the case [9]. To obtain the best possible performance, we augment all of the implementations with a simple interface that allows a thread to send a unique opcode of the CS to the servicing thread, rather than a function pointer.…”
Section: Methodology and Setup
confidence: 99%
“…If contention is high, this results in a substantial performance increase over classic locks. We can identify two approaches that exploit this idea: the client-server approach [9,17], and the combiner approach [10,11,13,24].…”
Section: Critical Sections Over CC Shared Memory
confidence: 99%
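
For contrast with the client-server scheme sketched earlier, a minimal sketch of the combiner idea follows: threads publish their requests in per-thread slots, and whichever thread manages to grab the lock temporarily acts as the combiner, executing everyone's pending requests before releasing it. The slot layout, the fixed thread count, and the use of std::function are assumptions made for this illustration, not details of the cited combiner papers.

#include <atomic>
#include <functional>
#include <mutex>

// Illustrative flat-combining style lock (details invented for this sketch):
// the thread that currently holds the lock executes the published requests
// of all waiting threads, keeping the shared data in one core's cache.
class Combiner {
    static constexpr int kMaxThreads = 64;
    struct Slot {
        std::atomic<bool> pending{false};   // request published and not yet executed
        std::function<void()> op;           // critical-section body to run
    };
    Slot slots_[kMaxThreads];
    std::mutex lock_;

public:
    // Called by thread `tid` instead of lock(); op(); unlock().
    void submit(int tid, std::function<void()> op) {
        slots_[tid].op = std::move(op);
        slots_[tid].pending.store(true, std::memory_order_release);

        while (slots_[tid].pending.load(std::memory_order_acquire)) {
            if (lock_.try_lock()) {
                // We became the combiner: drain every published request
                // (including our own), then hand the lock back.
                for (auto& s : slots_) {
                    if (s.pending.load(std::memory_order_acquire)) {
                        s.op();
                        s.pending.store(false, std::memory_order_release);
                    }
                }
                lock_.unlock();
            }
            // Otherwise keep spinning; a real implementation would back off or park.
        }
    }
};

The client-server approach dedicates one thread permanently to the role the combiner plays here; the combiner approach lets that role migrate to whichever client happens to win the lock.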
“…Works on thread synchronization are commonly found in system-level implementations, where the operating system or special firmware coordinates and synchronizes thread execution to ensure overall application functionality and performance [6] [7] [8].…”
Section: Related Work
confidence: 99%