The goal of this work is to detect the object that people in a video are attending to (the human-concerned object). Event detection in video is driven by security concerns and has clear economic and social value, and detecting the human-concerned object is very helpful for it: in emergencies or other special events, people tend to focus on a specific object. To find that object, we need to locate the human body and face, estimate the sight direction, and determine what the person is looking at. First, we divide the video into clips of equal scale and then split or merge the clips according to their weights. This video segmentation yields a set of candidate regions, and the question becomes which region the people in the video are concerned with. To answer it, we detect humans and faces. We use HOG features with an SVM classifier to detect humans in different poses, and on this basis perform face detection and tracking. We represent the sight direction by the yaw and pitch angles and determine the human-concerned region according to the weight of sight. For complex scenes, we apply a clarity judgment to discard irrelevant background and highlight the foreground. Experiments show that our method is applicable not only to simple scenes but also to scenes with complex backgrounds, and that it produces clear and accurate results.
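
As a rough illustration of the last stage of this pipeline, the sketch below shows one way a yaw/pitch pair could be turned into an image-plane sight direction and used to weight candidate regions. It is not the method of this paper: the sign conventions (positive yaw looks toward image right, positive pitch looks up), the cosine-based weight, and the function names `gaze_direction`, `sight_weights`, and `concerned_region` are all assumptions made for the example.

```python
import math


def gaze_direction(yaw_deg, pitch_deg):
    """Project a head pose onto the image plane as a 2D direction (dx, dy).

    Assumed convention (not from the paper): positive yaw looks toward image
    right, positive pitch looks up. Image y grows downward, hence the minus.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    dx = math.sin(yaw) * math.cos(pitch)
    dy = -math.sin(pitch)
    return dx, dy


def sight_weights(face_center, direction, region_centroids):
    """Weight each region by how well it aligns with the sight direction.

    The weight is the cosine between the sight direction and the vector from
    the face to the region centroid, clipped at zero (a hypothetical choice).
    """
    dx, dy = direction
    d_norm = math.hypot(dx, dy)
    weights = {}
    for name, (cx, cy) in region_centroids.items():
        vx, vy = cx - face_center[0], cy - face_center[1]
        v_norm = math.hypot(vx, vy)
        if d_norm == 0 or v_norm == 0:
            # Looking straight at the camera, or region centered on the face.
            weights[name] = 0.0
            continue
        cos_sim = (dx * vx + dy * vy) / (d_norm * v_norm)
        weights[name] = max(0.0, cos_sim)
    return weights


def concerned_region(face_center, yaw_deg, pitch_deg, region_centroids):
    """Return the name of the region with the largest sight weight."""
    direction = gaze_direction(yaw_deg, pitch_deg)
    weights = sight_weights(face_center, direction, region_centroids)
    return max(weights, key=weights.get)


# Hypothetical region centroids in pixel coordinates.
regions = {"left": (20, 100), "right": (200, 100), "up": (100, 10)}
best = concerned_region((100, 100), 45.0, 0.0, regions)
```

With several people in the scene, the per-person weights could be summed per region before taking the maximum; that aggregation is likewise an assumption rather than something stated above.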