2006 IEEE International Symposium on Circuits and Systems
DOI: 10.1109/iscas.2006.1692603

Triangular systolic array with reduced latency for QR-decomposition of complex matrices

Cited by 66 publications (40 citation statements)
References 1 publication
“…Our proposed QR decomposition hardware was implemented using TSMC 0.25 µm technology and the experimental results are compared with the TACR/TSA based hardware [9] and low complexity QR decomposition architecture [12]. Table 1 with a 3-iteration step lookahead unit, the proposed SSL-CORDIC based architecture needs 3 cycles with 19.60 nsec clock (total 58.80 nsec) to finish QR decomposition, however, 3 cycles with 27.10 nsec clock (total 81.30 nsec) is spent in the Compact CORDIC based one [12].…”
Section: Numerical Results (mentioning)
confidence: 99%
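
The totals quoted in the statement above follow from multiplying the cycle count by the clock period. The short check below reproduces that arithmetic; the function and variable names are illustrative assumptions, and the relative saving it computes is derived here rather than stated in the quoted text.

```python
def total_latency_ns(cycles: int, clock_ns: float) -> float:
    """Total latency as cycle count times clock period (names are hypothetical)."""
    return cycles * clock_ns

# Figures taken from the quoted citation statement.
ssl_cordic_ns = total_latency_ns(3, 19.60)      # SSL-CORDIC based architecture
compact_cordic_ns = total_latency_ns(3, 27.10)  # Compact CORDIC based architecture [12]

# Relative saving (derived, not quoted in the source).
saving = 1.0 - ssl_cordic_ns / compact_cordic_ns

print(f"SSL-CORDIC:     {ssl_cordic_ns:.2f} ns")
print(f"Compact CORDIC: {compact_cordic_ns:.2f} ns")
print(f"Saving:         {saving:.1%}")
```

Because both designs need the same number of cycles, the difference comes entirely from the shorter clock period.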
“…Triangular systolic arrays (TSA) [9] are frequently used for matrix inversion with the properties of inherent parallelism and pipelining.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [9], triangular systolic array (TSA) and three angle complex rotation (TACR) algorithm are used together to implement QR decomposition hardware. The TACR-based approach significantly reduces the QR decomposition time compared to the conventional Givens rotation-based architecture, however, it still needs large number of clock cycles to fill up the TSA-based architecture with inputs.…”
Section: Introduction (mentioning)
confidence: 99%
“…A low complexity complex Givens rotation approach, which we will refer to LC-CGR in the following [10], is also proposed by separating complex elements into real and imaginary parts. As a result, the delay can be reduced to almost half in case of 2 × 2 matrix compared to TSA/TACR approach [9]. However, as the matrix size increases, the latency will become much larger than TSA/TACR.…”
Section: Introduction (mentioning)
confidence: 99%
“…Architectures which employ the Gram-Schmidt [5] and conventional Givens rotations (CGR) [6] algorithms are disadvantaged as they require high-complexity square-root operations. Whilst the shift-and-add processing nature of CORDIC-based matrix inversion [7] offers low-complexity hardware implementation, its inherent latency can preclude it from high-performance applications [8]. Squared Givens rotations (SGR) [9] offer square-root free processing and a number of SGR-based matrix inversion architectures have been proposed [10], [11].…”
Section: Introduction (mentioning)
confidence: 99%
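
The square-root cost attributed to conventional Givens rotations in the statement above can be seen in a small software sketch. The code below is a plain textbook complex Givens triangularization written for illustration only; it is not the TACR, SGR, or CORDIC scheme of any cited paper, and the function name `givens_triangularize` is a hypothetical choice.

```python
import math

def givens_triangularize(A):
    """Reduce a complex matrix to upper-triangular R with Givens rotations.

    Illustrative sketch: Q is not accumulated, and each rotation needs
    one square root to normalize the pivot pair.
    """
    R = [[complex(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    for j in range(n):                    # column being cleared
        for i in range(m - 1, j, -1):     # zero R[i][j] against pivot R[j][j]
            a, b = R[j][j], R[i][j]
            r = math.sqrt(abs(a) ** 2 + abs(b) ** 2)  # the square-root operation
            if r == 0.0:
                continue                  # nothing to rotate
            c, s = a / r, b / r           # complex "cosine" and "sine"
            for k in range(j, n):         # apply [[c*, s*], [-s, c]] to rows j and i
                top, bot = R[j][k], R[i][k]
                R[j][k] = c.conjugate() * top + s.conjugate() * bot
                R[i][k] = -s * top + c * bot
    return R

A = [[1 + 1j, 2], [3, 4 - 1j]]
R = givens_triangularize(A)
for row in R:
    print(row)
```

Each rotation is unitary, so the sum of squared magnitudes of the entries is preserved while the sub-diagonal entries are driven to zero. In this sketch the rotated pivots become real and non-negative, but the last diagonal entry of a square matrix may remain complex.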