“…First, we compare studies concerned with using AI to diagnose COVID-19 from medical images. Based on this comparison, we observed that (i) a large number of studies utilized CT scans and X-rays [243, 270, 271], whereas few utilized lung ultrasound (US) [55, 66, 272]; (ii) although chest X-rays are considered less sensitive than PCR tests for detecting COVID-19 at the early stages, they are recommended for monitoring and evaluating the progression of a patient’s status, especially in critical cases [215]; (iii) segmentation techniques for detecting the infected region are primarily used with CT scans [273]; (iv) augmentation techniques for increasing the size of the dataset are commonly used with X-ray datasets [274]; (v) the majority of COVID-19 studies utilized CNNs in their classification process [52, 275], and some integrated CNNs with transfer learning to overcome the shortage of available data and increase model accuracy [32, 201, 276]; (vi) a small number of studies combined CNNs with random forests and support vector machines (SVMs) for feature extraction and classification [277, 278]; (vii) higher accuracy was reported by studies that combined CNNs, transfer learning, and SVMs, whereas CNNs and DL models were reported to overfit in some studies owing to the shortage of available datasets [37, 162]; (viii) the diagnostic accuracy of X-rays is approximately equal to that of chest CT scans; (ix) the sensitivity of X-ray diagnosis is highly correlated with the time elapsed between the initial symptoms and imaging: it was no more than 55% 2 days after the initial symptoms and increased to 79% 11 days after symptom onset [147]; (x) VGG, MobileNet, and ResNet are the most commonly employed pre-trained models for classification tasks [21, 52]; (xi) explainability methods have rarely been used to clarify the results of CNN models [57]; and (xii) most studies reported accuracies of more than 90% for binary classification tasks (i.e., COVID-19 vs. non-COVID-19) [218, 279] and accuracies higher than 80% for three-class classification tasks (i.e., normal, viral pneumonia, and COVID-19) [216, …”
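
The observations above report both accuracy and sensitivity, and the two can diverge for the same classifier. The minimal sketch below shows how each is computed from a binary confusion matrix (COVID-19 vs. non-COVID-19). The function names and the counts are hypothetical, chosen only for illustration; they are not taken from any of the cited studies.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all cases (positive and negative) classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true COVID-19 cases that the classifier detects (recall)."""
    positives = tp + fn
    if positives == 0:
        raise ValueError("no positive cases")
    return tp / positives


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-COVID-19 cases correctly identified as negative."""
    negatives = tn + fp
    if negatives == 0:
        raise ValueError("no negative cases")
    return tn / negatives


# Hypothetical counts for illustration only (not from any cited study):
# 50 COVID-19 images and 50 non-COVID-19 images.
tp, fn = 40, 10   # COVID-19 cases detected / missed
tn, fp = 45, 5    # non-COVID-19 cases correctly cleared / falsely flagged

acc = accuracy(tp, tn, fp, fn)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this made-up case the overall accuracy is higher than the sensitivity, which is why a high reported accuracy does not by itself indicate how many infected patients a model would miss.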