This work presents a novel framework for controlling an Unmanned Aerial System (UAS) while detecting objects in real time, with visible detections comprising class names, bounding boxes, and confidence scores, in a configurable high-fidelity sea simulation environment. Major attributes of the environment, such as the number of floating human victims and debris, ocean waves and shading, weather conditions (rain, snow, and fog), sun brightness and intensity, and camera exposure and brightness, can easily be manipulated. Developed using Unreal Engine, Microsoft AirSim, and the Robot Operating System (ROS), the framework was first used to find the configuration of UAS flight altitude and camera brightness that yields a high average prediction confidence for human victim detection. Only then were autonomous real-time test missions carried out to calculate the accuracies of two pretrained You Only Look Once version 7 (YOLOv7) models: YOLOv7 retrained on the SeaDronesSee dataset (YOLOv7-SDS) and YOLOv7 originally trained on the Microsoft COCO dataset (YOLOv7-COCO), which achieved high accuracies of 97.8% and 93.79%, respectively. Furthermore, it is proposed that the framework developed in this study can be reverse-engineered for autonomous real-time training with automatic ground-truth labeling of the images, since the game engine already holds the details of every object placed in the environment in order to render it on screen. This would avoid the cumbersome and time-consuming manual labeling of the large amounts of synthetic data that can be extracted with this framework, which could be a groundbreaking achievement in the field of maritime computer vision.
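For illustration only, the sketch below shows the kind of capture-and-detect loop such a framework implements: an AirSim multirotor client streams camera frames from the Unreal simulation and a pretrained YOLOv7 model produces class names, bounding boxes, and confidence scores that are drawn onto the frame. The torch.hub entry point for YOLOv7, the camera name "0", the 20 m survey altitude, and the channel-order handling are assumptions for the sketch, not the exact implementation used in this work.

```python
# Illustrative sketch only -- not the exact pipeline of this work.
# Assumes `pip install airsim torch opencv-python` and a running AirSim/Unreal simulation;
# the torch.hub entry point for YOLOv7 ('WongKinYiu/yolov7', 'custom') is an assumption.
import airsim
import cv2
import numpy as np
import torch

# Connect to the simulated multirotor and take API control.
client = airsim.MultirotorClient()
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)
client.takeoffAsync().join()
client.moveToZAsync(-20, 3).join()  # hypothetical 20 m survey altitude (NED frame: z is negative up)

# Load a pretrained YOLOv7 checkpoint (e.g. a COCO- or SeaDronesSee-trained weights file).
model = torch.hub.load('WongKinYiu/yolov7', 'custom', 'yolov7.pt')

while True:
    # Request one uncompressed scene image from camera "0".
    responses = client.simGetImages([
        airsim.ImageRequest('0', airsim.ImageType.Scene, False, False)
    ])
    resp = responses[0]
    frame = np.frombuffer(resp.image_data_uint8, dtype=np.uint8)
    frame = frame.reshape(resp.height, resp.width, 3)

    # Run detection; results hold class names, bounding boxes, and confidences.
    # Channel order (RGB vs. BGR) depends on the AirSim version; adjust if colors look swapped.
    results = model(frame[:, :, ::-1])
    for *box, conf, cls in results.xyxy[0].tolist():
        x1, y1, x2, y2 = map(int, box)
        label = f'{model.names[int(cls)]} {conf:.2f}'
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

    cv2.imshow('UAS camera', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
```

In the full framework, the same detection output would be published over ROS topics rather than only rendered on screen, so that flight control and logging nodes can consume it.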