Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques 2012
DOI: 10.1145/2370816.2370838

Enhancing performance optimization of multicore chips and multichip nodes with data structure metrics

Citations: cited by 30 publications (18 citation statements)
References: 11 publications
“…Extrae uses binutils [22] to obtain human-readable source code references for the memory accesses. Dynamically allocated variables are identified by their allocation call-stack (footnote 4) while static variables are referenced by their given name.…”
Section: Trace (mentioning)
confidence: 99%
“…Regarding the PEBS hardware infrastructure, the metrics associated to the memory samples depend on the processor fam- [footnote 4: The call-stack is captured using the backtrace() call from glibc.]…”
Section: Trace (mentioning)
confidence: 99%
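
The statements above describe identifying dynamically allocated variables by the call-stack at their allocation site. The following is a minimal Python sketch of that idea, not the cited tool's implementation: it substitutes Python's `traceback.extract_stack()` for glibc's `backtrace()`, and the `AllocationRegistry` class, its simulated addresses, and the `allocation_site` helper are all hypothetical names introduced for illustration.

```python
import traceback


def allocation_site(skip=2, depth=4):
    """Return a hashable summary of the caller's call-stack.

    Stands in for glibc backtrace(): `skip` drops the innermost frames
    (this helper and the allocator wrapper), `depth` limits how many
    of the remaining frames are kept.
    """
    stack = traceback.extract_stack()
    frames = stack[:-skip] if skip else stack
    return tuple((f.filename, f.lineno, f.name) for f in frames[-depth:])


class AllocationRegistry:
    """Hypothetical registry mapping address ranges to allocation call-stacks."""

    def __init__(self, base=0x1000):
        self._ranges = []   # list of (start, end, site)
        self._next = base   # simulated bump allocator, not real memory

    def alloc(self, size):
        # Record where the allocation was requested from.
        site = allocation_site(skip=2)
        start = self._next
        self._next += size
        self._ranges.append((start, start + size, site))
        return start

    def resolve(self, address):
        """Map a sampled address back to the call-stack that allocated it."""
        for start, end, site in self._ranges:
            if start <= address < end:
                return site
        return None
```

A sampled address can then be attributed to the allocating function by inspecting the innermost frame of the returned tuple, for example `registry.resolve(addr)[-1][2]` for the function name. A linear scan over ranges keeps the sketch short; an interval tree or sorted list would be the usual choice for many allocations.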
“…SLO uses a modified GCC compiler [23] to instrument every memory access instruction to collect data reuse information. Another compiler-based tool is MACPO [24]. MACPO uses profile feedback from UT's PerfExpert [25] tool to instrument memory accesses in problematic code regions using the LLVM compiler infrastructure [26].…”
Section: Related Work (mentioning)
confidence: 99%
“…Simulation-based data-centric tools, such as CPROF [23], MemSpy [27], ThreadSpotter [33], SLO [6,5], and MACPO [32], instrument some or all memory accesses and compute the approximate memory hierarchy response with a cache simulator. There are two drawbacks to simulation.…”
Section: Introduction (mentioning)
confidence: 99%
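
The simulation-based approach mentioned above, feeding instrumented memory accesses to a cache simulator and attributing the result to data objects, can be illustrated with a small Python sketch. This is an assumption-laden toy, not the design of any of the cited tools: it models a single direct-mapped cache, and `DirectMappedCache`, `misses_per_object`, and the `(object_name, address)` trace format are hypothetical.

```python
from collections import Counter


class DirectMappedCache:
    """Toy direct-mapped cache: one tag per line, no write policy modeled."""

    def __init__(self, num_lines=64, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines

    def access(self, address):
        """Simulate one access; return True on a hit, False on a miss."""
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        hit = self.tags[index] == tag
        if not hit:
            self.tags[index] = tag  # fill the line, evicting the old block
        return hit


def misses_per_object(trace, cache):
    """Count simulated misses per data object.

    `trace` is an iterable of (object_name, address) pairs, as an
    instrumented run might emit them.
    """
    misses = Counter()
    for name, address in trace:
        if not cache.access(address):
            misses[name] += 1
    return misses
```

For example, `misses_per_object([("a", 0), ("a", 8), ("b", 128), ("a", 0)], DirectMappedCache(num_lines=2))` attributes the conflict between `a` and `b` on the same cache line to both objects. Real simulators model associativity, multiple levels, and replacement policy, which this sketch leaves out.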