Modern face detection algorithms perform suboptimally on higher-quality video, where each frame carries more data to process. This paper addresses that problem with a framework that applies commercially used, state-of-the-art face detection algorithms only to the regions of interest in a frame and discards the rest, reducing the data to be processed. The framework maintains the accuracy of the base algorithm while decreasing the processing time per frame, thereby increasing overall efficiency. The region of interest is selected based on the facial window detected in the previous frame. Therefore, the choice of base algorithm plays an important role in determining the speed of the framework. For the analyzed frame rates, the framework achieves processing speeds about 69–76% higher than standalone use of the detection algorithms.
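The region-of-interest idea can be illustrated with a minimal Python sketch. It is not the paper's implementation. The names `expand_roi`, `crop`, and `detect_with_roi`, the `margin` parameter, and the fall-back to a full-frame search are all assumptions made for illustration. The base detector is passed in as a hypothetical callable that returns one face box, or `None` when no face is found.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height) in pixels
Frame = List[List[int]]           # grayscale frame as rows of pixel values
Detector = Callable[[Frame], Optional[Box]]  # hypothetical base detector


def expand_roi(box: Box, frame_w: int, frame_h: int, margin: float = 0.5) -> Box:
    """Grow the previous face box by `margin` on every side, clamped to the frame.

    The margin value is an assumption; it trades speed against the risk of
    the face moving outside the region between frames.
    """
    x, y, w, h = box
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(frame_w, x + w + dx), min(frame_h, y + h + dy)
    return (x0, y0, x1 - x0, y1 - y0)


def crop(frame: Frame, roi: Box) -> Frame:
    """Return only the pixels inside `roi`; the rest of the frame is discarded."""
    x, y, w, h = roi
    return [row[x:x + w] for row in frame[y:y + h]]


def detect_with_roi(
    frame: Frame,
    prev_box: Optional[Box],
    detector: Detector,
    margin: float = 0.5,
) -> Optional[Box]:
    """Run the base detector on the region around the previous detection.

    Falls back to the full frame when there is no previous detection or
    the face is not found inside the region (an assumed recovery policy).
    """
    frame_h, frame_w = len(frame), len(frame[0])
    if prev_box is not None:
        roi = expand_roi(prev_box, frame_w, frame_h, margin)
        hit = detector(crop(frame, roi))
        if hit is not None:
            hx, hy, hw, hh = hit
            # Map the detection from region coordinates back to frame coordinates.
            return (roi[0] + hx, roi[1] + hy, hw, hh)
    return detector(frame)
```

In a video loop, the box returned for one frame would be passed as `prev_box` for the next, so the base detector sees the full frame only at start-up or after the face is lost.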