where H is the Nr x Nt flat fading channel matrix, s is the Nt x 1 transmitted symbol vector, x the Nr x 1 received

II. SYSTEM MODEL

We consider fast Rayleigh fading channels. The MIMO system is described by

Nt log2(M). Among many reduced-complexity algorithms, sphere decoding based algorithms are the strongest candidates for near-capacity performance [7] [9] [14]. The principle of the sphere decoding algorithm (SDA) [2] is to replace an exhaustive search over the lattice formed by all possible received symbol vectors with a search restricted to a small sphere, so that the ML or near-ML solution can be found efficiently. It is well known that for uncoded data, SDA can achieve optimal or near-optimal performance. In iterative detection with channel decoding, however, the standard hard decision SDA is not directly applicable, because it does not produce soft output. To produce reliable soft output, the list sphere decoding (LSD) algorithm in [7] applies the max-log approximation to MAP detection by selecting the best candidate vector found by the sphere decoder. To ensure the reliability of the selected vector, the sphere decoder must provide enough candidate vectors, which requires a relatively large search radius. The subsequent soft detection must also compare the likelihood metrics of a long list of candidates. Consequently, high computational complexity and/or large memory is required. The approach in [14] applies a modified hard decision SDA after each iteration to find the exact best candidate. However, its overall complexity is shown to be similar to that of LSD. Moreover, the max-log approximation in general incurs a performance loss.

In this paper, we show that, with careful selection, combining the likelihood metrics of multiple candidate vectors, instead of selecting only one as in the max-log approximation, can be more effective. An important feature is that a very short list can be sufficient for combining. As a result, a smaller search radius can be used for the hard decision SDA.
The subsequent soft output detection also benefits from the much shorter list. Overall, both a performance improvement and a complexity reduction can be achieved.
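
To make the contrast between selecting one candidate and combining several concrete, the following minimal sketch computes a bit log-likelihood ratio (LLR) from a short candidate list in two ways: with the max-log approximation, which keeps only the best metric for each bit value, and with a log-sum-exp over all listed metrics. This is a generic illustration and not necessarily the combining rule developed in this paper. The function names, the candidate list, and the metric values are hypothetical, and each metric is assumed to be a log-likelihood of the form -||x - Hs||^2 / sigma^2 up to a constant.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def bit_llr(candidates, bit_index, combine):
    """LLR of one bit computed from a list of candidate vectors.

    candidates: list of (bits, metric) pairs; bits is a tuple of 0/1 and
        metric is the candidate's log-likelihood (assumed form, see above).
    combine: `max` gives the max-log approximation; `log_sum_exp`
        combines the metrics of all candidates in the list.
    """
    ones = [m for bits, m in candidates if bits[bit_index] == 1]
    zeros = [m for bits, m in candidates if bits[bit_index] == 0]
    if not ones or not zeros:
        # A short list may contain no candidate for one bit value.
        raise ValueError("no candidate in the list for one of the bit values")
    return combine(ones) - combine(zeros)


# Hypothetical short list of four candidates (illustrative numbers only).
candidates = [
    ((1, 0), -1.0),
    ((1, 1), -1.2),
    ((0, 0), -1.1),
    ((0, 1), -4.0),
]

llr_max_log = bit_llr(candidates, 0, max)
llr_combined = bit_llr(candidates, 0, log_sum_exp)
```

For this made-up list the max-log LLR of the first bit is about 0.1, while the combined LLR is about 0.64. The second-best candidate with bit value 1 carries almost as much likelihood as the best one, and the max-log approximation discards it.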