With the move towards Industry 4.0, human-robot interaction (HRI) and human-robot collaboration (HRC) have gained popularity. As more robots are introduced into industry, safety risks remain a persistent concern because robots are not as flexible as humans. Although robots can replace human workers in some dangerous tasks, human safety remains the top priority in every industry. The most common safeguard has been to isolate the workspaces of human workers and robots. To realize Industry 4.0, however, robots and cobots are expected to operate outside the cage to maximize productivity and efficiency. Hence, studies have attempted to free robots from isolated workspaces while preserving the safety of human operators. The present study explores the feasibility of a transfer learning strategy, fine-tuning, for human presence detection as a basis for safe HRI. A custom dataset of 1463 images was collected and split into training, validation, and test sets at a ratio of 70:20:10. Three RetinaNet object detection models with different backbone networks were fine-tuned on this dataset to transfer the knowledge learned in the source domain to the target domain, namely human presence detection. The results show that RetinaNet_ResNet152-V1-FPN achieved the highest test AP, 74.4%, at an inference speed of 13.09 FPS, suggesting that it is the best of the three fine-tuned RetinaNet models. This study demonstrates the feasibility of fine-tuning as a strategy for training object detection models, which could serve as a basis for improving HRI applications through a deep learning, vision-based method. In summary, the research highlights the use of deep learning models for human presence detection, which can be extended to HRI safety applications.
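The 70:20:10 partition of the 1463 images can be illustrated with a minimal sketch. The file names, the random seed, the function name `split_dataset`, and the rounding rule (floor for the training and validation sets, remainder to the test set) are all assumptions made for illustration; the abstract does not state how the split was performed or the exact number of images in each subset.

```python
import random


def split_dataset(items, ratios=(70, 20, 10), seed=0):
    """Shuffle items and split them into train/validation/test subsets.

    ratios are integer percentages that must sum to 100. The rounding rule
    (floor for train and validation, remainder to test) is an assumption.
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * ratios[0] // 100
    n_val = n * ratios[1] // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Hypothetical file names standing in for the 1463 collected images.
image_names = [f"img_{i:04d}.jpg" for i in range(1463)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))
```

Because 1463 is not divisible by 10, the subset sizes depend on the rounding rule chosen, so the counts printed here should not be read as the counts used in the study.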