An intelligent model for detecting and recognizing fish species in camera footage is urgently needed:
fisheries account for a large portion of the world economy, and such models can aid fishermen on a
large scale. Combined with a pick-and-place machine, such models could sort different fish species in
bulk without human intervention, significantly reducing costs for large-scale fishing industries.
Existing methods for detecting and recognizing fish species have many limitations,
such as limited scalability, low detection accuracy, failure to detect multiple species, degraded
performance at lower resolution, and inability to pinpoint the exact location of the fish. We modify the
head of a widely used deep learning model, VGG-16 with pre-trained weights, so that it both recognizes
the species of the fish and finds its exact location in an image, implementing a modified YOLO that
incorporates the bounding box regression head. We further propose combining the proposed neural network
with the ESRGAN algorithm, which amplifies the image resolution by a factor of 4. With this method, an overall detection
accuracy of 96.5% is obtained in experiments on a total of 9460 images spread across 9 species. With
further improvement of the model, a pick-and-place machine could be integrated to quickly sort fish by
species in large-scale fish industries.
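
As a rough illustration of how the outputs of a network with a species head and a bounding box regression head might be consumed, the following minimal sketch decodes one prediction and maps the box from the 4x upscaled image back to the original frame. Only the 9 species and the factor of 4 come from the text. The function names, the normalized (x_min, y_min, x_max, y_max) box format, and the 896 x 896 example size are assumptions made for illustration, not the method used in this work.

```python
import math

NUM_SPECIES = 9  # from the text: 9 species
UPSCALE = 4      # from the text: resolution amplified by a factor of 4


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(class_logits, box_norm, up_width, up_height):
    """Turn two head outputs into a species index, a confidence, and pixel boxes.

    class_logits: NUM_SPECIES raw scores (assumed classification head output).
    box_norm: (x_min, y_min, x_max, y_max) in [0, 1] (assumed regression head output).
    up_width, up_height: size in pixels of the upscaled image fed to the network.
    """
    if len(class_logits) != NUM_SPECIES:
        raise ValueError("expected one logit per species")
    probs = softmax(class_logits)
    species = max(range(NUM_SPECIES), key=lambda i: probs[i])
    x0, y0, x1, y1 = box_norm
    # Box in pixel coordinates of the upscaled image.
    box_up = (x0 * up_width, y0 * up_height, x1 * up_width, y1 * up_height)
    # Same box in the original, lower-resolution frame.
    box_orig = tuple(v / UPSCALE for v in box_up)
    return species, probs[species], box_up, box_orig


if __name__ == "__main__":
    # Hypothetical head outputs for a single image.
    logits = [0.1, 0.3, 2.5, 0.0, -1.0, 0.2, 0.4, 0.1, 0.0]
    print(decode_prediction(logits, (0.25, 0.5, 0.75, 1.0), 896, 896))
```

Dividing by the upscaling factor matters for a downstream pick-and-place step, which would need the fish location in the coordinates of the original camera frame rather than the super-resolved one.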