Traditional human behavior recognition algorithms tend to ignore the spatial constraints among feature blocks, which leads to poor recognition results and a low correct rate. We therefore propose a human motion target recognition algorithm based on a convolutional neural network (CNN) and global constraint block matching. First, key frames are extracted from the human motion video. Second, the local and global features of the key frames are analyzed, and a CNN is used to fuse them. Third, feature blocks are formed from the fused features, and the closest matching feature block is found for each one. Using the definition of the spatial constraint, the spatial data of the human motion in the vertical direction are then considered, the spatial constraint weight is calculated, and the matching is completed. Finally, the matching-block score is combined with the spatial constraint weight, and human motion targets are recognized from the cumulative score. Experimental results show that the proposed algorithm achieves a key frame extraction accuracy of more than 90%, low time consumption in feature fusion, and a feature block matching accuracy of more than 80%. The F-measure of human behavior recognition is 0.95 on average, and the overall recognition performance is good.

INDEX TERMS Convolutional neural network, global constraint block, feature block, spatial constraints, matching, human motion target, recognition.
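
As a rough illustration of the final scoring step, the following minimal Python sketch accumulates, for each candidate action, the match score of the closest feature block multiplied by a spatial constraint weight on the vertical offset. The Gaussian form of the weight, the exponential similarity, the parameter `sigma`, the toy data, and all function names are assumptions made for illustration only; the abstract does not specify these details, and the sketch is not the authors' implementation.

```python
import math

# Each block is (feature_vector, y), where y is an assumed normalized
# vertical position of the block in the frame (0 = top, 1 = bottom).

def spatial_weight(y_query, y_template, sigma=0.2):
    """Assumed Gaussian weight on the vertical offset between two blocks."""
    dy = y_query - y_template
    return math.exp(-(dy * dy) / (2.0 * sigma * sigma))

def match_score(f_query, f_template):
    """Assumed similarity that decays with Euclidean feature distance."""
    return math.exp(-math.dist(f_query, f_template))

def cumulative_score(query_blocks, template_blocks, sigma=0.2):
    """Sum over query blocks of (closest-match score) x (spatial weight)."""
    total = 0.0
    for f_q, y_q in query_blocks:
        # Closest template block in feature space.
        f_t, y_t = min(template_blocks, key=lambda b: math.dist(f_q, b[0]))
        total += match_score(f_q, f_t) * spatial_weight(y_q, y_t, sigma)
    return total

def recognize(query_blocks, templates, sigma=0.2):
    """Return the label whose template blocks give the highest cumulative score."""
    return max(
        templates,
        key=lambda label: cumulative_score(query_blocks, templates[label], sigma),
    )

# Hypothetical toy data: two actions with similar features at different heights.
templates = {
    "wave": [((1.0, 0.0), 0.2), ((0.0, 1.0), 0.5)],
    "kick": [((1.0, 0.0), 0.8), ((0.0, 1.0), 0.5)],
}
query = [((0.9, 0.1), 0.25), ((0.1, 0.9), 0.5)]
print(recognize(query, templates))
```

In this toy case the two templates contain identical feature vectors, so feature matching alone cannot separate them; only the vertical spatial weight distinguishes the actions, which is the role the spatial constraint plays in the description above.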