Several natural and human factors are responsible for the defacement of the external walls and tiles of buildings, and the related deterioration can be a public safety hazard. Therefore, active building maintenance and repair processes are essential for ensuring building sustainability. However, conventional inspection methods are time-, cost-, and labor-intensive processes. Therefore, herein, this study proposes a convolutional neural network (CNN) model for image-based automated detection and localization of key building defects (efflorescence, spalling, cracking, and defacement). Based on a pretrained CNN VGG-16 classifier, this model applies class activation mapping for object localization. After identifying its limitations in real-life applications, this study determined the model’s robustness and ability to accurately detect and localize defects in the external wall tiles of buildings. For real-time detection and localization, this study applied this model by using mobile devices and drones. The results show that the application of deep learning with UAV can effectively detect various kinds of external wall defects and improve the detection efficiency.