In recent years, great progress has been made in recognizing human activities in complete image sequences. However, predicting human activity earlier in a video is still a challenging task. In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-aware motion enhancement (SME) is proposed for video activity prediction. First, a boundary-prior based motion segmentation method is introduced to use shortest geodesic distance in an undirected weighted graph. Next, a dynamic contrast segmentation strategy is proposed to segment the moving object in a complex environment. Then, the SME is constructed to enhance the moving object by suppressing irrelevant background in each frame. Moreover, an effective long-range attention mechanism is designed to further deal with the long-term dependency of complex non-periodic activities by automatically focusing more on the semantic critical frames instead of processing all sampled frames equally. Thus, the learned weights can highlight the discriminative frames and reduce the temporal redundancy. Finally, we evaluate our framework on the UT-Interaction and sub-JHMDB datasets. The experimental results show that WLSTM with SME statistically outperforms a number of state-of-the-art methods on both datasets.