Efficient and accurate key frame extraction from massive video collections remains a challenge, especially in surveillance applications. To overcome the inaccuracy of traditional key frame extraction, this paper proposes a method that extracts key frames by analyzing the scale and direction of the motion trajectory on spatiotemporal slices (MTSS). The proposed method uses both local and global changes in the motion state of objects as the metric for extracting key frames, and captures these changes with a strategy that combines coarse extraction of spatiotemporal slices with partial fine re-extraction. First, coarse extraction of spatiotemporal slices is used to detect motion segments. Then, partial fine re-extraction is performed on the motion segments to obtain the MTSS. Finally, the frames at the inflexions of the scale and direction of the MTSS are extracted as key frames. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods in accuracy on various videos while requiring comparable or even less computation time.
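
The three stages summarized above (coarse slice extraction, fine re-extraction on motion segments, and key frame selection at inflexions) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the frame-difference test for motion, the thresholded foreground, and the reading of "scale" as the foreground width on the slice are all assumptions made for illustration, and the input is assumed to be a grayscale clip stored as a NumPy array of shape (frames, height, width) with a static dark background.

```python
import numpy as np

def coarse_motion_segments(video, step=4, threshold=10.0):
    """Coarse pass (assumed): sample every `step`-th row as a horizontal
    spatiotemporal slice and flag frames whose slices change over time.
    Returns half-open (start, end) frame ranges."""
    coarse = video[:, ::step, :].astype(np.float64)
    diff = np.abs(np.diff(coarse, axis=0))           # change between frames
    moving = np.concatenate([[False], diff.max(axis=(1, 2)) > threshold])
    padded = np.concatenate([[False], moving, [False]])
    edges = np.flatnonzero(padded[1:] != padded[:-1])  # run boundaries
    return list(zip(edges[::2].tolist(), edges[1::2].tolist()))

def fine_trajectory(video, start, end, threshold=10.0):
    """Fine pass (assumed): inside one motion segment, merge the slices of
    all rows and measure the trajectory position and width per frame."""
    composite = video[start:end].max(axis=1)          # shape (frames, width)
    fg = composite > threshold                        # foreground columns
    present = fg.any(axis=1)                          # frames with an object
    fg = fg[present]
    frames = start + np.flatnonzero(present)
    cols = np.arange(video.shape[2])
    width = fg.sum(axis=1)                            # "scale" proxy
    position = (fg * cols).sum(axis=1) / width        # centroid column
    return frames, position, width

def inflexions(series):
    """Indices where the trend of `series` reverses (flat steps ignored)."""
    v = np.diff(series)
    nz = np.flatnonzero(v)
    s = np.sign(v[nz])
    change = np.flatnonzero(s[:-1] != s[1:])
    return nz[change + 1].tolist()

def key_frames(video, step=4, threshold=10.0):
    """Collect frames at inflexions of trajectory direction and scale."""
    keys = set()
    for start, end in coarse_motion_segments(video, step, threshold):
        frames, position, width = fine_trajectory(video, start, end, threshold)
        for series in (position, width):
            keys.update(int(frames[i]) for i in inflexions(series))
    return sorted(keys)

# Synthetic clip: a one-pixel object enters at frame 5, moves right,
# turns around at frame 14, moves left, and leaves after frame 24.
video = np.zeros((30, 8, 32), dtype=np.uint8)
for t in range(20):
    x = 2 + 2 * t if t <= 9 else 20 - 2 * (t - 9)
    video[5 + t, 4, x] = 255

print(coarse_motion_segments(video))  # [(5, 26)]
print(key_frames(video))              # [14]
```

In this toy clip the coarse pass inspects only two of the eight rows to locate the motion segment, and the fine pass touches only the frames inside that segment, which is the intuition behind pairing a cheap coarse scan with a partial fine re-extraction. The only key frame reported is the one where the object reverses direction; the width series is constant, so it contributes no scale inflexions.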