Efficient and precise identification of cracks in masonry stone structures, whether caused by natural or human-induced factors in a given region, is important for detecting damage and preventing subsequent secondary harm. Remote sensing technologies have recently been used to identify crack regions promptly during repair and reinforcement work, and higher image resolution has made the detection of these regions more accurate and sensitive. This study presents a novel deep learning approach for detecting crack areas in cellphone images using segmentation and object detection methods. The developed model, named CAM-K-SEG, combines Grad-CAM visualization and K-means clustering with pre-trained convolutional neural network models. A comprehensive dataset of photographs of numerous historical buildings was used to train the model, and the training and testing images were carefully annotated and masked. The widely used U-Net segmentation model was employed as a basis for comparison. Results were evaluated using the Intersection-over-Union (IoU) metric. The results indicate that the CAM-K-SEG model is suitable for object recognition and localization, whereas the U-Net model is well suited to crack area segmentation.
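
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how IoU can be computed for a pair of binary crack masks. It is an illustration only and is not taken from the study's code: the function name `iou`, the NumPy-based implementation, and the convention of returning 1.0 when both masks are empty are assumptions made here.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection-over-Union of two binary masks of equal shape.

    Hypothetical helper for illustration; nonzero pixels are treated
    as "crack" and zero pixels as "background".
    """
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    if pred.shape != true.shape:
        raise ValueError("masks must have the same shape")

    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    intersection = np.logical_and(pred, true).sum()
    return float(intersection) / float(union)


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered in total.
predicted = [[0, 1, 1],
             [0, 1, 0]]
annotated = [[0, 1, 0],
             [0, 1, 1]]
score = iou(predicted, annotated)
```

An IoU of 1 means the predicted and annotated crack regions coincide exactly, while 0 means they do not overlap at all. The same ratio applies to bounding boxes when the model is assessed for localization rather than pixel-level segmentation.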