This study proposes a novel framework for fish species classification that combines FRCNN (Faster Region-based Convolutional Neural Network), VGG16 (Visual Geometry Group 16), and SPPNet (Spatial Pyramid Pooling network). The FRCNN-VGG16-SPPNet framework draws on the strengths of each component: FRCNN's fast object detection and localization, VGG16's suitability for transfer learning and its fast classification, and SPPNet's ability to handle input images of any size. First, FRCNN detects and extracts the target objects from images that contain multiple objects. The extracted photos of various fish species, at different scales, are then fed into VGG16-SPPNet, which uses transfer learning for basic feature extraction. The SPPNet layer applies pooling at several scales, so that inputs of any size yield a fixed-length representation. Finally, VGG16 identifies the important features and classifies the object. The framework reaches an accuracy of 0.9318, which is 26% higher than a traditional single VGG16 model, and the gain is most pronounced when classifying objects of different sizes. These results suggest that the framework is efficient, reliable, and robust for object classification and has potential for other applications in image recognition and classification.
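The multi-scale pooling step is what lets the classifier accept crops of any size, so a minimal sketch may help make it concrete. The code below is an illustration in plain Python, not the authors' implementation: the function name `spp_max_pool`, the pyramid levels `(1, 2, 4)`, and the floor/ceil rule for bin edges are all assumptions, since the abstract does not specify them. It pools a single 2D feature map; in a real network the same operation would be applied to each channel of the convolutional output.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2D feature map into a fixed-length vector.

    For each pyramid level n (assumed values, not from the paper), the map
    is split into an n x n grid of bins and the maximum of each bin is kept.
    The output length is sum(n * n for n in levels), whatever the input size.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        if n > height or n > width:
            raise ValueError("feature map is smaller than the pyramid grid")
        for i in range(n):
            # Bin edges use floor for the start and ceil for the end,
            # so neighbouring bins may overlap but no cell is skipped.
            row_start = (i * height) // n
            row_end = ((i + 1) * height + n - 1) // n
            for j in range(n):
                col_start = (j * width) // n
                col_end = ((j + 1) * width + n - 1) // n
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(row_start, row_end)
                    for c in range(col_start, col_end)
                ))
    return pooled


# Two hypothetical feature maps of different sizes (4 x 5 and 9 x 13).
small = [[r * 5 + c for c in range(5)] for r in range(4)]
large = [[r * 13 + c for c in range(13)] for r in range(9)]

print(len(spp_max_pool(small)), len(spp_max_pool(large)))  # 21 21
```

Both inputs produce 21 values (1 + 4 + 16 bins), which is why a fixed-size fully connected classifier can follow the pooling layer without resizing or cropping the detected regions first.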