Object detection systems based on deep learning have been highly effective at complex object identification tasks in images and have shown promise in a wide variety of real-world applications, including the response to the coronavirus pandemic. Ensuring and enforcing the proper use of face masks is one of the main obstacles to containing and reducing the spread of the infection among the population. This paper examines whether the urban population of a megacity wears face masks correctly. Using YOLOv3 and YOLOv5, we trained and validated models on a new dataset to classify images as "with mask", "without mask", and "mask not in position". For YOLOv3, we used three pre-trained models: YOLOv3, YOLOv3-tiny, and SPP-YOLOv3. For YOLOv5, we used five pre-trained models: YOLOv5n, YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. The dataset comprises 6550 images across the three classes. Detection on this dataset achieved a mean average precision (mAP) of 95%. This research can be used to monitor the proper use of face masks in various public spaces through automated scanning.