Today’s deep learning architectures, if trained with proper dataset, can be used for object detection in marine search and rescue operations. In this paper a dataset for maritime search and rescue purposes is proposed. It contains aerial-drone videos with 40,000 hand-annotated persons and objects floating in the water, many of small size, which makes them difficult to detect. The second contribution is our proposed object detection method. It is an ensemble composed of a number of the deep convolutional neural networks, orchestrated by the fusion module with the nonlinearly optimized voting weights. The method achieves over 82% of average precision on the new aerial-drone floating objects dataset and outperforms each of the state-of-the-art deep neural networks, such as YOLOv3, -v4, Faster R-CNN, RetinaNet, and SSD300. The dataset is publicly available from the Internet.