Visual-based human action recognition is currently one of the most active research topics in computer vision. The feature representation directly has a crucial impact on the performance of the recognition. Feature representation based on bag-of-words is popular in current research, but the spatial and temporal relationship among these features is usually discarded. In order to solve this issue, a novel feature representation based on normalized interest points is proposed and utilized to recognize the human actions. The novel representation is called super-interest point. The novelty of the proposed feature is that the spatial-temporal correlation between the interest points and human body can be directly added to the representation without considering scale and location variance of the points by introducing normalized points clustering. The novelty concerns three tasks. First, to solve the diversity of human location and scale, interest points are normalized based on the normalization of the human region. Second, to obtain the spatial-temporal correlation among the interest points, the normalized points with similar spatial and temporal distance are constructed to a superinterest point by using three-dimensional clustering algorithm. Finally, by describing the appearance characteristic of the super-interest points and location relationship among the super-interest points, a new feature representation is gained. The proposed representation formation sets up the relationship among local features and human¯gure. Experiments on Weizmann, KTH, and UCF sports dataset demonstrate that the proposed feature is e®ective for human action recognition.