Video is one of the most useful and important forms of multimedia data and is used in many applications. Despite this importance, video indexing and retrieval remains a challenging task. To reduce the amount of data and keep only relevant frames, keyframe extraction is a necessary step in a content-based video retrieval (CBVR) system. In this paper, a keyframe extraction method based on face image quality is proposed for video surveillance systems. The data is first reduced by rejecting frames that contain no faces. The remaining face images are then clustered by identity, and a set of candidate frames is selected for further processing. Face quality is assessed using four metrics: pose estimation, sharpness, brightness, and resolution. The frame with the best face quality is taken as the keyframe. Experiments were carried out on several datasets to demonstrate the efficiency of the proposed method compared with state-of-the-art approaches.
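The selection logic described above can be illustrated with a minimal sketch. It is not the implementation used in this paper, and it rests on several assumptions: face detection and identity clustering are taken to have been done upstream and are represented here only by a precomputed `identity` label (`None` when no face was found); the four metrics are assumed to be already normalized to [0, 1] with higher values being better; and they are combined with equal weights, which is an illustrative choice and not a rule stated in the text. The names `FaceFrame`, `quality`, and `select_keyframes` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class FaceFrame:
    """One video frame with precomputed face information (hypothetical structure)."""
    frame_index: int
    identity: Optional[str]  # None when no face was detected in the frame
    pose: float              # each score assumed normalized to [0, 1], higher is better
    sharpness: float
    brightness: float
    resolution: float


def quality(frame: FaceFrame,
            weights: Tuple[float, float, float, float] = (0.25, 0.25, 0.25, 0.25)) -> float:
    """Combine the four metrics into one score (equal weights are an assumption)."""
    w_pose, w_sharp, w_bright, w_res = weights
    return (w_pose * frame.pose
            + w_sharp * frame.sharpness
            + w_bright * frame.brightness
            + w_res * frame.resolution)


def select_keyframes(frames: Iterable[FaceFrame]) -> Dict[str, int]:
    """Return, for each identity, the index of the frame with the best face quality."""
    clusters = defaultdict(list)
    for frame in frames:
        if frame.identity is None:
            continue  # reject frames without faces
        clusters[frame.identity].append(frame)  # group face images by identity
    # Within each identity group, keep the frame with the highest quality score.
    return {identity: max(members, key=quality).frame_index
            for identity, members in clusters.items()}
```

Calling `select_keyframes` on a list of `FaceFrame` records returns a dictionary that maps each identity label to the index of its selected keyframe. A different weighting of the four metrics can be tried by wrapping `quality` with other `weights` values.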