2014 43rd International Conference on Parallel Processing
DOI: 10.1109/icpp.2014.28
CASTA: CUDA-Accelerated Static Timing Analysis for VLSI Designs

Cited by 4 publications (3 citation statements); references 18 publications.
“…Due to the threading overhead and irregular computational patterns of STA, performance of CPU-based multi-threading usually saturates at around 8-16 threads [4,8]. To break the performance bottleneck, GPU acceleration for timing analysis is further explored [8,17]. Wang et al. [17] proposed acceleration techniques for the look-up table interpolation when computing the cell delays during the timing propagation, while the other steps like net delay and levelization are still on CPU.…”
Section: Parallel Static Timing Analysis Engines
confidence: 99%
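The look-up-table interpolation mentioned above is the standard Liberty-style cell-delay computation: the table is indexed by input slew and output load, and a query point between grid entries is interpolated bilinearly. A minimal sketch of that step is below; the function and parameter names are illustrative assumptions, not the API of CASTA or of Wang et al.'s implementation.

```python
import bisect

def lut_delay(slews, loads, table, slew, load):
    """Bilinearly interpolate a cell-delay LUT.

    slews, loads: sorted axis values of the table.
    table[i][j]:  delay at (slews[i], loads[j]).
    Points outside the grid are extrapolated from the edge cell,
    as timing engines commonly do.
    """
    def bracket(axis, x):
        # Index of the lower grid point, clamped so i+1 stays in range,
        # and the normalized position t in [i, i+1].
        i = min(max(bisect.bisect_right(axis, x) - 1, 0), len(axis) - 2)
        t = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, t

    i, u = bracket(slews, slew)
    j, v = bracket(loads, load)
    return ((1 - u) * (1 - v) * table[i][j]
            + (1 - u) * v       * table[i][j + 1]
            + u       * (1 - v) * table[i + 1][j]
            + u       * v       * table[i + 1][j + 1])
```

On a GPU, one thread can evaluate this per timing arc, which is why the interpolation step parallelizes well while graph-dependent steps such as levelization were left on the CPU.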
“…To break the performance bottleneck, GPU acceleration for timing analysis is further explored [8,17]. Wang et al. [17] proposed acceleration techniques for the look-up table interpolation when computing the cell delays during the timing propagation, while the other steps like net delay and levelization are still on CPU. They demonstrated more than 10× speedup on the kernel propagation time over their CPU implementation.…”
Section: Parallel Static Timing Analysis Engines
confidence: 99%
“…It needs to be noted that the computation patterns of slews and load capacitance are similar to that of the forward propagation. Recent works have investigated LUT-based cell delay computation and timing propagation on both ASIC and FPGA [24,25], while net delays are not considered and levelization is done on CPU. Our study on the state-of-the-art open-source timing engine OpenTimer [26] reveals that these two accelerated steps are not actually the runtime bottleneck [23], as shown in Figure 4.…”
Section: Static Timing Analysis
confidence: 99%
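The levelization step that both citation statements mention is the grouping of the timing graph's nodes into topological levels: all nodes in one level have no dependencies on each other, so their arrival times can be propagated in parallel (e.g., one GPU kernel launch per level). A minimal CPU-side sketch using Kahn-style topological ordering is below; this illustrates the general technique, not the specific levelization code of CASTA or OpenTimer.

```python
from collections import deque

def levelize(num_nodes, edges):
    """Partition a DAG's nodes (0..num_nodes-1) into topological levels.

    edges: list of (u, v) pairs meaning u must be processed before v.
    Returns a list of levels; every node in level k depends only on
    nodes in levels < k, so each level is safe to process in parallel.
    """
    indegree = [0] * num_nodes
    successors = [[] for _ in range(num_nodes)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Level 0: all primary inputs (nodes with no predecessors).
    frontier = [n for n in range(num_nodes) if indegree[n] == 0]
    levels = []
    while frontier:
        levels.append(frontier)
        nxt = []
        for u in frontier:
            for v in successors[u]:
                indegree[v] -= 1
                if indegree[v] == 0:  # all predecessors done
                    nxt.append(v)
        frontier = nxt
    return levels
```

Because this traversal is inherently sequential across levels and irregular in memory access, it is a natural candidate to leave on the CPU, which matches the division of work the cited statements describe.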