CASTA: CUDA-Accelerated Static Timing Analysis for VLSI Designs

Wang, Hunta H.-W.; Lin, Louis Y.-Z.; Huang, Rongqin; Wen, Charles H.-P.

doi:10.1109/icpp.2014.28

Cited by 4 publications

(3 citation statements)

References 18 publications

“…Due to the threading overhead and irregular computational patterns of STA, performance of CPU-based multi-threading usually saturates at around 8-16 threads [4,8]. To break the performance bottleneck, GPU acceleration for timing analysis is further explored [8,17]. Wang et al [17] proposed acceleration techniques for the look-up table interpolation when computing the cell delays during the timing propagation, while the other steps like net delay and levelization are still on CPU.…”

Section: Parallel Static Timing Analysis Engines (mentioning)

confidence: 99%

“…To break the performance bottleneck, GPU acceleration for timing analysis is further explored [8,17]. Wang et al [17] proposed acceleration techniques for the look-up table interpolation when computing the cell delays during the timing propagation, while the other steps like net delay and levelization are still on CPU. They demonstrated more than 10× speedup on the kernel propagation time over their CPU implementation.…”

Section: Parallel Static Timing Analysis Engines (mentioning)

confidence: 99%

GPU-accelerated static timing analysis. Guo, Huang, Lin. 2020. Proceedings of the 39th International Conference on Computer-Aided Design.

“…It needs to be noted that the computation patterns of slews and load capacitance are similar to that of the forward propagation. Recent works have investigated LUT-based cell delay computation and timing propagation on both ASIC and FPGA [24,25], while net delays are not considered and levelization is done on CPU. Our study on the state-of-the-art open-source timing engine OpenTimer [26] reveals that these two accelerated steps are not actually the runtime bottleneck [23], as shown in Figure 4.…”

Section: Static Timing Analysis (mentioning)

confidence: 99%