Morphological identification of mosquito species relies on microscopic observation, a time-consuming and challenging process whose results vary with the skills and experience of public health personnel. We present deep learning models based on the well-known you-only-look-once (YOLO) algorithm. These models simultaneously localize and classify mosquitoes in images to identify the species and gender of field-caught specimens. The results indicated that the model concatenating two YOLO v3 networks performed best in identifying the mosquitoes, as the mosquitoes were relatively small objects compared with the large surrounding image area. Robustness testing of the proposed model yielded a mean average precision and sensitivity of 99% and 92.4%, respectively. The model also exhibited high specificity and accuracy, with an extremely low rate of misclassification. The area under the receiver operating characteristic curve (AUC) was 0.958 ± 0.011, which further supported the model's accuracy. Based on a confusion matrix, thirteen classes were detected with an accuracy of 100%. Nevertheless, the relatively low detection rates for two species were likely a result of the limited number of wild-caught biological samples available. The proposed model can help establish the population densities of mosquito vectors in remote areas to predict disease outbreaks in advance.
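
Sensitivity, specificity, and accuracy of the kind reported above are commonly derived from a multi-class confusion matrix by treating each class in turn as "positive" and all others as "negative" (one-vs-rest). The following is a minimal sketch of that calculation, not the authors' evaluation code. It assumes that rows index the true class and columns the predicted class; the function name `per_class_metrics` and the three-class toy counts are hypothetical and are not data from this study.

```python
def per_class_metrics(confusion):
    """Return one-vs-rest sensitivity, specificity, and accuracy per class.

    Assumed layout: confusion[i][j] counts samples whose true class is i
    and whose predicted class is j. Transpose first if yours is reversed.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    nan = float("nan")
    metrics = []
    for k in range(n):
        tp = confusion[k][k]                               # class k, predicted k
        fn = sum(confusion[k]) - tp                        # class k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other, predicted k
        tn = total - tp - fn - fp                          # everything else
        metrics.append({
            "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
            "specificity": tn / (tn + fp) if (tn + fp) else nan,
            "accuracy": (tp + tn) / total if total else nan,
        })
    return metrics


# Hypothetical toy counts for three classes (illustration only).
toy_confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]

for class_index, values in enumerate(per_class_metrics(toy_confusion)):
    print(class_index, values)
```

Under this one-vs-rest view, a class with few true samples has a small denominator in its sensitivity, so a handful of missed detections can lower its rate sharply, which is consistent with the sample-size explanation given for the species with lower detection rates.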