With the development of society, deep learning has been widely used in object detection, face recognition, speech recognition, and other fields. Among them, object detection is a popular direction in computer vision and digital image processing, and face detection is a focus of this hot direction. Although face detection technology has gone through a long research stage, it is still considered as one of the more difficult subjects in human feature detection technology. In addition, the face detection technology itself has two sides, imperceptibility and complexity of the environment, and other defects cause the existing technology to be unable to accurately recognize faces of different proportions, obscured and different postures. Therefore, this paper adopts an advanced deep learning method based on machine vision to detect human faces automatically. In order to accurately detect a variety of human faces, a multiscale fast RCNN method based on upper and lower layers (UPL-RCNN) is proposed. The network is composed of spatial affine transformation components and feature region components (ROI). This method plays a vital role in face detection. First of all, multiscale information can be grouped in detection, so as to deal with small areas of the face. Then, the method can use the inspiration of the human visual system to perform contextual reasoning and spatial transformation, including zooming, cutting, and rotating. Through comparative experiments, the analysis results show that this method can not only accurately detect human faces but also has better performance than fast RCNN. Compared with some advanced methods, this method has the advantages of high accuracy, less time consumption, and no correlation mark.