Every high-rise building must meet construction requirements, including adequate safety measures against unexpected events such as fires. To prevent a small fire from growing into a larger one, surveillance using closed-circuit television (CCTV) video is necessary. However, security personnel cannot monitor the footage around the clock. Deep learning is one approach that can assist them. In this study, we use two deep learning methods to detect fire hotspots: You Only Look Once (YOLO) and the faster region-based convolutional neural network (Faster R-CNN). In the first stage, we collected 100 images (70 for training and 30 for testing). In the next stage, we trained the models to recognize fire. We then calculated precision, recall, accuracy, and F1 score to measure model performance. An F1 score close to 1 indicates a good balance between precision and recall. In our experiments, YOLO achieved a precision of 100%, a recall of 54.54%, an accuracy of 66.67%, and an F1 score of 0.70583667, while Faster R-CNN achieved a precision of 87.5%, a recall of 95.45%, an accuracy of 86.67%, and an F1 score of 0.913022.
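The four metrics named above can be illustrated with a minimal sketch that computes them from confusion-matrix counts using their standard definitions. The function name `classification_metrics` is hypothetical, and the counts in the usage example are an assumption: the text does not report a confusion matrix, so the values shown are only one set of counts on a 30-image test set that happens to agree with the reported YOLO percentages.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, accuracy, and F1 from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives on the test set.
    """
    total = tp + fp + fn + tn
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


# Assumed counts (not given in the text), chosen to sum to 30 test images.
metrics = classification_metrics(tp=12, fp=0, fn=10, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts, precision is 1.0 while recall is roughly 0.55, which shows how a detector that raises no false alarms can still miss many fires and end up with a moderate F1 score.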