Lossy coded videos at low bit rates suffer from annoying compression artifacts, which are caused by coarse quantization of transform coefficients or by motion compensation from distorted frames. In this paper, we propose a compression artifact reduction approach that exploits both spatial and temporal correlation to form multi-hypothesis predictions from spatio-temporally similar blocks. For each transform block, three predictions and their respective reliabilities are estimated. The first prediction is obtained by directly inverse-quantizing the transform coefficients, and its reliability is determined by the variance of the quantization noise. The second prediction is derived by representing each transform block with a temporal auto-regressive (TAR) model along its motion trajectory, and its reliability is estimated from the local prediction errors of the TAR model. The third prediction infers the original coefficients from similar blocks in non-local regions, and its reliability is estimated from the distribution of coefficients in these similar blocks. Finally, all predictions are adaptively fused according to their reliabilities to restore high-quality videos. Experimental results show that the proposed method effectively reduces most compression artifacts and improves both the subjective and the objective quality of block-transform coded videos.
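To make the fusion step concrete, the sketch below illustrates one plausible way to combine the three per-block predictions according to their reliabilities, assuming (as is not specified in the abstract) that reliability is modeled as the inverse of the estimated error variance and that the fusion is a weighted average; the function and variable names are illustrative, not the paper's actual implementation.

```python
import numpy as np

def fuse_predictions(preds, error_vars, eps=1e-8):
    """Fuse per-block predictions by inverse-error-variance weighting.

    preds      : list of arrays, each one prediction of the transform block
                 (e.g., dequantized, TAR-based, non-local).
    error_vars : list of scalars, estimated error variance of each prediction
                 (lower variance -> higher reliability -> larger weight).
    """
    preds = np.stack(preds)                          # shape (K, H, W)
    weights = 1.0 / (np.asarray(error_vars) + eps)   # reliability ~ 1 / variance (assumption)
    weights /= weights.sum()                         # normalize weights to sum to 1
    return np.tensordot(weights, preds, axes=1)      # reliability-weighted average block

# Illustrative usage with synthetic 8x8 transform blocks
rng = np.random.default_rng(0)
block = rng.standard_normal((8, 8))
p_dequant  = block + 0.30 * rng.standard_normal((8, 8))  # coarsely quantized coefficients
p_tar      = block + 0.15 * rng.standard_normal((8, 8))  # temporal AR prediction
p_nonlocal = block + 0.20 * rng.standard_normal((8, 8))  # non-local similar-block prediction
fused = fuse_predictions([p_dequant, p_tar, p_nonlocal],
                         [0.30**2, 0.15**2, 0.20**2])
```

Under this inverse-variance assumption, the fused block gives more weight to whichever hypothesis is estimated to be most reliable for that block, which matches the adaptive, per-block nature of the fusion described above.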