Systolic-array based regularized QR-decomposition for IEEE 802.11n compliant soft-MMSE detection

Senning, Christian; Staudacher, A.; Burg, Andreas

doi:10.1109/icm.2010.5696169

Cited by 13 publications

(7 citation statements)

References 7 publications


“…The comparison of implementation results of our works ([2] and the proposed architectures) and other works ([3], [5], [6], [7]) is summarized in Table I. From this table, the throughput and normalized throughput of the proposed architecture are much higher than those of other MMSE-SQRD works.…”

Section: Implementation Results (mentioning)

confidence: 90%

“…11n standard requirement in MMSE-SQRD works. Although the gate efficiency of the proposed architecture is at approximately the same level as that of previous SQRD [3] and MMSE-QRD [7] works, the proposed architecture is more useful than previous works because of the good BER performance of MMSE-SQRD [1]. Therefore, the proposed architecture is suitable for high-speed MIMO WLAN systems.…”

Section: Implementation Results (mentioning)

confidence: 95%

“…However, previous high-speed QRD/SQRD architectures cannot be applied to the MMSE detection algorithm because the MMSE algorithm uses an augmented channel matrix that takes the noise variance into account. Although MMSE-QRD and MMSE-SQRD architectures have been previously proposed in [5], [6], [7], these architectures cannot be used in high-throughput MIMO systems due to their low operation speed, which cannot satisfy new wireless LAN standards such as IEEE 802.11n.…”

Section: Introduction (mentioning)

confidence: 99%
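The role of the augmented channel matrix mentioned in the statement above can be illustrated with a small NumPy sketch. It is not taken from any of the cited papers; it assumes unit-energy transmit symbols and a noise standard deviation `sigma`, and the function names are hypothetical. Under those assumptions, a QR decomposition of the channel matrix stacked on top of `sigma * I` yields the same estimate as the direct MMSE formula.

```python
import numpy as np

def mmse_via_augmented_qr(H, y, sigma):
    """MMSE estimate via QR of the augmented matrix [H; sigma*I] (illustrative)."""
    n_tx = H.shape[1]
    # Augmented channel matrix and zero-padded receive vector (assumed model).
    H_aug = np.vstack([H, sigma * np.eye(n_tx)])
    y_aug = np.concatenate([y, np.zeros(n_tx, dtype=complex)])
    Q, R = np.linalg.qr(H_aug)          # reduced QR: Q is (n_rx + n_tx) x n_tx
    # Least-squares solution of the augmented system: R s = Q^H y_aug.
    return np.linalg.solve(R, Q.conj().T @ y_aug)

def mmse_direct(H, y, sigma):
    """Reference: s = (H^H H + sigma^2 I)^{-1} H^H y."""
    n_tx = H.shape[1]
    G = H.conj().T @ H + sigma**2 * np.eye(n_tx)
    return np.linalg.solve(G, H.conj().T @ y)

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
s_qr = mmse_via_augmented_qr(H, y, sigma=0.3)
s_ref = mmse_direct(H, y, sigma=0.3)
```

Both routes minimise the same regularized least-squares cost, which is why an architecture that only factorises the plain channel matrix does not produce the MMSE result.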


Sorted QR decomposition for high-speed MMSE MIMO detection based wireless communication systems

Yuya, Nagao, Kurosaki, et al. 2012

2012 IEEE International Symposium on Circuits and Systems


In this paper, we propose a hardware architecture for high-speed sorted QR decomposition for a 4x4 MMSE MIMO decoder. QR decomposition (QRD) is commonly used in many MIMO detection algorithms. In particular, sorted QR decomposition (SQRD) is an advanced algorithm that improves MIMO detection performance. The proposed architecture can decompose an augmented channel matrix for MMSE detection by using the modified Gram-Schmidt algorithm with pipelining and resource sharing. This architecture can be applied in a high-throughput MIMO-OFDM system such as IEEE 802.11n, which supports data throughput of up to 600 Mbps. We implement the proposed architecture with 334k gates in 90 nm CMOS technology. The proposed design can achieve a high performance of up to 50.0 million 4x4 SQRD operations per second with a maximum operating frequency of 300 MHz.
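As a rough software illustration of the algorithm named in this abstract, the following NumPy sketch applies modified Gram-Schmidt with column sorting to an augmented channel matrix. It is a floating-point reference model under stated assumptions, not the pipelined hardware architecture of the paper: the function name is hypothetical, the sorting rule (process the remaining column with the smallest norm first) is the commonly used SQRD heuristic, and `sigma > 0` is assumed so that the augmented matrix has full column rank.

```python
import numpy as np

def sorted_mgs_qrd(H, sigma):
    """Sorted QRD of [H; sigma*I] via modified Gram-Schmidt (illustrative)."""
    n_tx = H.shape[1]
    Q = np.vstack([H, sigma * np.eye(n_tx)]).astype(complex)
    R = np.zeros((n_tx, n_tx), dtype=complex)
    perm = np.arange(n_tx)
    for i in range(n_tx):
        # Sorting step: move the remaining column with the smallest norm to position i.
        norms = np.sum(np.abs(Q[:, i:]) ** 2, axis=0)
        k = i + int(np.argmin(norms))
        Q[:, [i, k]] = Q[:, [k, i]]
        R[:, [i, k]] = R[:, [k, i]]
        perm[[i, k]] = perm[[k, i]]
        # Normalise column i, then remove its component from the later columns.
        R[i, i] = np.linalg.norm(Q[:, i])
        Q[:, i] /= R[i, i]
        for j in range(i + 1, n_tx):
            R[i, j] = np.vdot(Q[:, i], Q[:, j])   # q_i^H q_j
            Q[:, j] -= R[i, j] * Q[:, i]
    return Q, R, perm

rng = np.random.default_rng(1)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, R, perm = sorted_mgs_qrd(H, sigma=0.1)
H_aug = np.vstack([H, 0.1 * np.eye(4)])
```

The returned factors satisfy `H_aug[:, perm] = Q @ R`, so a detector has to undo the permutation after back-substitution. As a side note on the reported figures, 300 MHz divided by 50.0 million decompositions per second corresponds to one 4x4 SQRD every 6 clock cycles.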


“…QR decomposition today for instance is applied in multiple-input/multiple-output (MIMO) channel detection [5][6][7]. Suitable CORDIC implementations were investigated and optimized in a 40 nm CMOS technology [4].…”

Section: QRD Decomposition Accelerators (mentioning)

confidence: 99%

Highly-Flexible and Optimized, Area: and Energy-Efficient Matrix Decomposition Accelerators based on Givens Algorithms and CORDIC Rotations

Vishnoi, Noll 2017

J Electr Electron Syst


Introduction

Matrix-decomposition and -factorization techniques are powerful tools of linear algebra and have applications in topics that span sciences. Specifically, QR, Eigen Value (EVD) and Singular Value (SVD) decomposition are applied in sophisticated digital signal processing for image compression, reconstruction and restoration as well as in decoding, space and noise filtering, parameter estimation and source separation in communications, to name just a few. Interesting and very successful research on systolic architectures for real-time implementations of matrix-decomposition algorithms has been conducted in the 80s and 90s of the last century [1,2]. It was found that in comparison to e.g., Householder-based approaches, Givens- and Jacobi-type algorithms feature a high degree of inherent parallelism and can be implemented by mapping the required rotations onto hardware-efficient Coordinate Rotate Digital Computer (CORDIC) operations [3]. However, during those times VLSI-CMOS technology allowed only for implementation of a single CORDIC processor per chip. Today's very-deep submicron CMOS technologies allow for the realization of high-throughput, low-energy CORDIC macros on a fraction of a square millimeter of silicon area and microwatts of power dissipation [4]. By this, the implementation of high-performance matrix-decomposition modules to be used as number crunching SoC processor sub-macros has become feasible.

In this work, new architectures and a new methodology for the quantitative optimization of matrix-decomposition-processor SoC sub-macro implementations for the use in challenging application domains featuring a wide variety of requirements and specifications are elaborated. An attractive target architecture consists of a flexible software-controlled Application Specific Instruction Processor (ASIP) executing the matrix-decomposition algorithms and being accelerated by a large farm of dedicated CORDIC blocks. The architecture template should allow for the decomposition of real- as well as complex-valued matrices. Floating-point capability can be achieved by operating the individual CORDIC blocks in block exponent arithmetic. In order to achieve maximum area and especially energy efficiency, the optimization has to be performed concurrently on all levels of CMOS design, from the algorithmic level down to the physical implementation level. Only by that, the interactions between decisions on the different design levels being imposed by the features of today's very-deep submicron CMOS technologies can be properly considered. A key element of this approach is the elaboration of quite accurate, parameterized algebraic cost models for area, throughput, latency and energy of CORDIC macros.

QRD Decomposition Accelerators

QR decomposition today for instance is applied in multiple-input/multiple-output (MIMO) channel detection [5][6][7]. Suitable CORDIC implementations were investigated and optimized in a 40 nm CMOS technology [4]. Based on the costs of the elementary arithmetic CORDIC sub-functions, a MATLAB-ba...
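The mapping of Givens rotations onto CORDIC operations described in this abstract can be sketched in a few lines of Python. This is a floating-point emulation for a real-valued matrix only, with hypothetical function names. It is not the architecture of the cited work: a hardware CORDIC would use fixed-point shift-and-add stages and would apply the same micro-rotations to the remaining row elements, whereas this sketch, as a simplifying assumption, takes only the angle from the CORDIC vectoring step and applies it with ordinary cosine and sine.

```python
import math
import numpy as np

def cordic_vectoring(x, y, iterations=24):
    """Drive y to zero with shift-add style micro-rotations; return (magnitude, angle)."""
    offset = 0.0
    if x < 0:                        # pre-rotate by pi so the iteration converges
        x, y, offset = -x, -y, math.pi
    angle, gain = 0.0, 1.0
    for i in range(iterations):
        t = 2.0 ** -i                # a right shift by i bits in hardware
        d = -1.0 if y > 0 else 1.0   # always rotate toward y = 0
        x, y = x - d * y * t, y + d * x * t
        angle -= d * math.atan(t)
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, angle + offset

def givens_qr(A, iterations=24):
    """QR of a real matrix by Givens rotations with CORDIC-derived angles."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):
            _, theta = cordic_vectoring(R[j, j], R[i, j], iterations)
            c, s = math.cos(theta), math.sin(theta)
            G = np.array([[c, s], [-s, c]])      # rotates (x, y) onto (r, 0)
            R[[j, i], :] = G @ R[[j, i], :]
            Q[:, [j, i]] = Q[:, [j, i]] @ G.T
    return Q, R

mag, ang = cordic_vectoring(3.0, 4.0)
A = np.random.default_rng(2).standard_normal((4, 3))
Q, R = givens_qr(A)
```

Each rotation touches only two rows, which is the property that makes Givens-type algorithms attractive for parallel and systolic implementations.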


“…This important property about has been used [11], [35], [36] to develop fast algorithms to detect layered Alamouti STBC signals; but no corresponding hardware architectures have been proposed. The architectures given in [23]-[29], [37]-[41] can be applied to compute the QRD of ; but, they do not exploit the Alamouti structure in the and therefore do not perform efficient computations.…”

mentioning

confidence: 99%

Block-Wise QR-Decomposition for the Layered and Hybrid Alamouti STBC MIMO Systems: Algorithms and Hardware Architectures

Liu, Chiu, Liu, et al. 2014

IEEE Trans. Signal Process.


Unlike the channel matrix in the spatial division multiplexing (SDM) multiple-input multiple-output (MIMO) communication system, the equivalent channel matrix in the layered Alamouti space-time block coding (STBC) MIMO system comprises 2-by-2 Alamouti sub-blocks. One novel property, found by Sayed et al., of the QR-decomposition (QRD) of this equivalent channel matrix is that the produced Q- and R-matrices are also matrices with Alamouti sub-blocks. Taking advantage of this property, we propose a new block-wise complex Givens rotation (BCGR) based algorithm and a triangular systolic array (TSA) to compute the QRD of the equivalent channel matrix in an Alamouti block-by-block manner. Implementation results reveal that our new TSA can compute QRDs of 4-by-4 equivalent channel matrices faster than any architecture that has been developed for the SDM MIMO system. This property of fast QRD makes our TSA very attractive for the layered Alamouti STBC MIMO system combined with orthogonal frequency division multiplexing. Our new BCGR based approach can also be applied to the hybrid Alamouti STBC MIMO system, which is also a system with an equivalent channel matrix consisting of Alamouti sub-blocks.

Index Terms: Alamouti space-time block coding, CORDIC module, Givens rotation, multiple input multiple output system, QR-decomposition, triangular systolic array.
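The structural property stated in this abstract can be checked numerically with a short NumPy sketch. It is only a verification aid, not the BCGR algorithm or the systolic array of the paper. The sub-block convention `[[a, b], [-conj(b), conj(a)]]`, the helper names, and the normalisation of the QR factors to a positive real diagonal of R are all assumptions made here for illustration.

```python
import numpy as np

def alamouti_block(a, b):
    """2x2 Alamouti sub-block in the convention assumed for this sketch."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])

def is_alamouti(B, tol=1e-9):
    """True if the 2x2 block B has the assumed Alamouti structure."""
    return (abs(B[1, 1] - np.conj(B[0, 0])) < tol
            and abs(B[1, 0] + np.conj(B[0, 1])) < tol)

def qr_positive_diag(H):
    """QR decomposition normalised so that R has a positive real diagonal."""
    Q, R = np.linalg.qr(H)
    phase = np.diag(R) / np.abs(np.diag(R))
    return Q * phase, np.conj(phase)[:, None] * R

def all_blocks_alamouti(M):
    """Check every 2x2 sub-block of an even-sized matrix."""
    n_r, n_c = M.shape
    return all(is_alamouti(M[i:i + 2, j:j + 2])
               for i in range(0, n_r, 2) for j in range(0, n_c, 2))

rng = np.random.default_rng(3)
c = rng.standard_normal(8) + 1j * rng.standard_normal(8)
# 4x4 equivalent channel matrix built from four random Alamouti sub-blocks.
H = np.block([[alamouti_block(c[0], c[1]), alamouti_block(c[2], c[3])],
              [alamouti_block(c[4], c[5]), alamouti_block(c[6], c[7])]])
Q, R = qr_positive_diag(H)
structure_preserved = all_blocks_alamouti(Q) and all_blocks_alamouti(R)
```

With this normalisation the diagonal sub-blocks of R come out as real multiples of the identity, which is consistent with working on whole 2-by-2 blocks instead of individual entries.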
