2010 International Conference on Field Programmable Logic and Applications 2010
DOI: 10.1109/fpl.2010.17

Parallelizing Simulated Annealing-Based Placement Using GPGPU

Abstract: Simulated annealing has become the de facto standard for FPGA placement engines since it provides high-quality solutions and is robust under a wide range of objective functions. However, this method will soon become prohibitive due to its sequential nature and because the performance of single-core processors has stagnated. General-purpose computing on graphics processing units (GPGPU) offers a promising solution to improve runtime with only commodity hardware. In this work, we develop a highly parallel approach t…

Cited by 28 publications (25 citation statements)
References 10 publications
“…• The first group of algorithms allows all processors to work within the same region (often the entire grid), but restricts swaps that are being evaluated in parallel, such as being from independent sets [6], [31], [32], [33]. Some algorithms employ a speculative move proposal to further accelerate the algorithm [6], and a dependency checker, executed serially, is used to ensure that no hard conflict has occurred and that soft conflicts are resolved with recalculation.…”
Section: Previous Work (mentioning)
confidence: 99%
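
The serial dependency check described in the statement above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the cited works: here a "hard conflict" is assumed to mean that a swap reuses a block already moved by an earlier swap in the same batch, and a "soft conflict" is assumed to mean that a swap shares a net with an earlier swap, so its cost would need recalculation. The names `check_dependencies` and `nets_of` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Swap = Tuple[str, str]  # the two blocks exchanged by one proposed move


def check_dependencies(
    swaps: List[Swap], nets_of: Dict[str, Set[str]]
) -> Tuple[List[Swap], List[Swap], List[Swap]]:
    """Serially classify a batch of proposed swaps.

    Returns (clean, soft, hard):
      clean: no overlap with earlier kept swaps
      soft:  shares a net with an earlier kept swap (assumed: recalculate cost)
      hard:  reuses a block of an earlier kept swap (assumed: discard)
    """
    touched_blocks: Set[str] = set()
    touched_nets: Set[str] = set()
    clean: List[Swap] = []
    soft: List[Swap] = []
    hard: List[Swap] = []
    for a, b in swaps:
        if a in touched_blocks or b in touched_blocks:
            # Discarded swaps do not mark their blocks or nets as touched.
            hard.append((a, b))
            continue
        nets = nets_of.get(a, set()) | nets_of.get(b, set())
        if nets & touched_nets:
            soft.append((a, b))
        else:
            clean.append((a, b))
        touched_blocks.update((a, b))
        touched_nets |= nets
    return clean, soft, hard


if __name__ == "__main__":
    # Hypothetical netlist: block name -> nets attached to it.
    nets_of = {"A": {"n1"}, "B": {"n2"}, "C": {"n1"},
               "D": {"n3"}, "E": {"n4"}, "F": {"n5"}}
    batch = [("A", "B"), ("C", "D"), ("A", "E"), ("E", "F")]
    print(check_dependencies(batch, nets_of))
```

In this toy batch, the swap of C and D shares net n1 with the earlier swap of A and B, and the swap of A and E reuses block A. A real placer would evaluate the swap costs in parallel and run only a check of this kind serially.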
“…While the algorithmic alterations made in [2] improved run-time by approximately 46% compared to [6], the maximum speed-up failed to scale for more than 16 threads. The main reason existing parallel placement annealers fail to scale is that the amount of sequential work in the approaches, including synchronization and communication, scales along with the problem size [1], [6], [2], causing the sequential runtime to quickly dominate the overall algorithm run-time as the number of threads increases due to Amdahl's Law. The second key issue with existing parallel annealing techniques is a significant reduction in quality-of-result as the amount of parallelism increases.…”
Section: Introduction (mentioning)
confidence: 98%
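
The Amdahl's Law argument in the statement above can be illustrated with a short sketch. The serial fraction used here is a hypothetical value chosen for illustration, not a figure taken from the cited works, and the function name `amdahl_speedup` is likewise an assumption.

```python
def amdahl_speedup(serial_fraction: float, threads: int) -> float:
    """Ideal speedup under Amdahl's Law: 1 / (s + (1 - s) / n)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    s = 0.05  # hypothetical: 5% of the run time is sequential
    for n in (1, 4, 16, 64, 256):
        print(f"{n:4d} threads -> speedup {amdahl_speedup(s, n):.2f}")
    # As n grows, the speedup approaches the bound 1 / s.
```

With a fixed serial fraction the speedup is bounded by 1 / s. If the sequential share grows with problem size, as the statement argues, the bound drops further as designs get larger.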
“…The most successful parallel annealing approaches to-date suffer from two main issues. The first issue is that of run-time scalability, where the run-time speedups of existing parallel annealers [1], [6], [2] do not scale beyond a small number of threads or processing cores. For example, in [6], the maximum speed-up obtained over a single thread was 21x, with a maximum speed-up of 17.5x over VPR [3]'s annealer.…”
Section: Introduction (mentioning)
confidence: 99%
“…While parallelization techniques [2], [3], [4] are ultimately required to dramatically reduce runtime and therefore bring appreciably change to user experience, it is recognized that they are often limited by Amdahl's law. This is exactly the case as in routing [5], [6], [7], which is well known to occupy a significant chunk of compilation time.…”
Section: Introduction (mentioning)
confidence: 99%