Identifying the key content of a video is essential for many security applications, such as motion/action detection, person re-identification, and recognition. Moreover, summarizing the key information in Child Sexual Exploitation Materials, especially videos, which mainly contain distinctive scenes that include people’s faces, is crucial to speed up investigations by Law Enforcement Agencies. In this paper, we present a video summarization strategy that combines perceptual hashing and face detection algorithms to keep the most relevant frames of a video, namely those containing faces that may correspond to victims or offenders. Due to legal constraints on access to Child Sexual Abuse datasets, we evaluated the performance of the proposed strategy on the detection of adult pornography content using the NDPI-800 dataset. We also assessed the capability of our strategy to create video summaries that preserve frames with distinctive faces from the original video, using ten additional manually labeled short videos. Results showed that our approach can detect pornography content with an accuracy of 84.15% at a speed of 8.05 ms/frame, which makes it suitable for real-time applications.
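To make the idea of combining perceptual hashing with face detection concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes an average hash (aHash) as the perceptual hash, an arbitrary Hamming-distance threshold `min_distance`, grayscale frames given as nested lists, and a hypothetical caller-supplied `has_face` predicate standing in for whatever face detector is used; none of these choices is stated above.

```python
from typing import Callable, List, Sequence

# Assumption: a frame is a grayscale image given as rows of 0-255 intensities.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame, size: int = 8) -> int:
    """Return a size*size-bit average hash (aHash), one simple perceptual hash."""
    height, width = len(frame), len(frame[0])
    # Downsample onto a size x size grid by nearest-neighbour sampling.
    small = [
        frame[(r * height) // size][(c * width) // size]
        for r in range(size)
        for c in range(size)
    ]
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        # Each bit records whether a sampled pixel is brighter than the mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def summarize(
    frames: Sequence[Frame],
    has_face: Callable[[Frame], bool],  # hypothetical face-detector wrapper
    min_distance: int = 10,             # illustrative threshold, not from the paper
) -> List[int]:
    """Return indices of frames that are visually novel and contain a face."""
    kept_indices: List[int] = []
    kept_hashes: List[int] = []
    for index, frame in enumerate(frames):
        frame_hash = average_hash(frame)
        # Skip frames that are near-duplicates of a frame already kept.
        if any(hamming(frame_hash, kept) < min_distance for kept in kept_hashes):
            continue
        # Run the (typically more expensive) face check only on novel frames.
        if not has_face(frame):
            continue
        kept_indices.append(index)
        kept_hashes.append(frame_hash)
    return kept_indices
```

In this sketch the hash comparison runs before the face check so that the detector is only invoked on frames that differ from those already kept; this ordering is a design choice of the example, and the reverse order would also be a valid reading of the strategy.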