2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA) 2014
DOI: 10.1109/isca.2014.6853215

HELIX-RC: An architecture-compiler co-design for automatic parallelization of irregular programs

Abstract: Data dependences in sequential programs limit parallelization because extracted threads cannot run independently. Although thread-level speculation can avoid the need for precise dependence analysis, communication overheads required to synchronize actual dependences counteract the benefits of parallelization. To address these challenges, we propose a lightweight architectural enhancement co-designed with a parallelizing compiler, which together can decouple communication from thread execution. Simulations of t…

Citations: cited by 21 publications (38 citation statements)
References: 44 publications
“…Campanoni et al [2012] extract parallelism by running iterations of a loop on separate threads, fulfilling loop-carried dependences using signals between threads. They also propose an architectural improvement to make their approach more feasible [Campanoni et al 2014]. Our approach tries to detect parallel patterns in sequential code and suggests a solution without any special architectural requirements other than those commonly satisfied.…”
Section: Related Work (mentioning)
confidence: 99%
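
The idea described in the statement above (loop iterations spread across threads, with loop-carried dependences satisfied by signals between threads) can be illustrated with a minimal sketch. This is not the HELIX or HELIX-RC implementation: the function name `run_iterations_with_signals`, the round-robin assignment of iterations, and the example recurrence are all assumptions chosen for illustration, and Python threads are used only to show the wait/signal ordering, not to obtain real speedup.

```python
import threading


def run_iterations_with_signals(data, num_threads=4):
    """Hypothetical sketch: run loop iterations on several threads and
    order the loop-carried part with per-iteration signals."""
    n = len(data)
    # done[i] is set once iteration i has finished its sequential segment.
    done = [threading.Event() for _ in range(n)]
    state = {"acc": 0}   # value carried from one iteration to the next
    trace = []           # order in which sequential segments executed

    def worker(tid):
        # Round-robin assignment (an assumption): thread tid runs
        # iterations tid, tid + num_threads, tid + 2 * num_threads, ...
        for i in range(tid, n, num_threads):
            # Independent part of the iteration: may overlap with others.
            local = data[i] * data[i]
            # Sequential segment: wait for the predecessor's signal.
            if i > 0:
                done[i - 1].wait()
            state["acc"] = state["acc"] * 2 + local  # order-sensitive update
            trace.append(i)
            # Signal the successor iteration that the dependence is met.
            done[i].set()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["acc"], trace
```

Because each sequential segment waits for its predecessor, the carried value matches what a plain sequential loop would compute, while the independent part of each iteration is free to overlap with other iterations.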
“…We used the second version of the HELIX compiler, HCCv2 [5], which is based on the ILDJIT compilation framework [2]. The sequential programs used as baseline were the unmodified versions of benchmarks, optimized (O3) and compiled by ILDJIT with LLVM 3.4.1 as the back end.…”
Section: Methods (mentioning)
confidence: 99%
“…Contrary to common assumptions, there is considerable latent TLP even in non-numerical sequentially designed programs, e.g., between the iterations of a loop [5,3,4,20]. When such iterations can run unfettered on multiple cores in a modern processor, performance adjusts with the number of cores.…”
Section: Helix-up (mentioning)
confidence: 99%
“…HELIX-RC [3] proposes a ring-cache architecture to communicate register dependences. XLOOPS is potentially more elegant as it avoids requiring ISA extensions to specify the dependence communication unlike previous proposals.…”
Section: Related Work (mentioning)
confidence: 99%
“…Previous speculative parallelization techniques show promise but demand dramatic changes in the microarchitecture, compiler, and/or ISA. HELIX-RC [3] takes an alternative approach of decoupling memory dependence communication without employing speculation but relies on an aggressive parallelizing compiler. The XLOOPS ISA could be extended to include instructions for lane synchronization to benefit compiler optimizations as in HELIX-RC.…”
Section: Related Work (mentioning)
confidence: 99%