Feature representation is important for human action recognition. Recently, Wang et al. [25] proposed dense trajectory (DT) based features for action video representation and achieved state-of-the-art performance on several action datasets. In this paper, we improve the DT method in two ways. First, we introduce a motion boundary based dense sampling strategy, which greatly reduces the number of valid trajectories while preserving their discriminative power. Second, we develop a set of new descriptors that capture the spatio-temporal context of motion trajectories. To evaluate the proposed methods, we conduct extensive experiments on three benchmarks: KTH, YouTube and HMDB51. The results show that our sampling strategy significantly reduces the computational cost of point tracking without degrading performance. Moreover, by utilizing our spatio-temporal context descriptors, we outperform state-of-the-art methods.