In fresh-cut flower grading, the accuracy of the classification algorithm is vital to controlling quality stability, uniformity, and price, while its speed determines whether it can be applied industrially. Current research on fresh-cut flower classification focuses on improving accuracy and largely ignores real-time processing speed on the terminal device, which seriously limits the practical use of online classification technology. In this study, RGB images and depth data for 434 rose flowers were collected using a binocular stereo depth camera. Taking the actual production-line classification environment into account, a set of data augmentation schemes was developed for the limited-sample setting. The network architecture was built and optimized from the ShuffleNet V2 backbone unit, transfer learning was applied, and an appropriate attention mechanism was introduced to classify flowers into five specifications. Experimental results showed that the proposed network was competitive in parameter count, classification speed, and accuracy compared with traditional networks without an attention mechanism and with other lightweight networks. Classification accuracies on the 3-channel (RGB) and 4-channel (RGB plus depth) flower datasets were 98.891% and 99.915%, respectively, and the overall prediction speed reached 0.020 seconds per flower. The proposed method is also markedly faster than the fresh-cut flower classification machines currently on the market. These advantages are significant for the design and development of fresh-cut flower classification and grading systems, and the method offers guidance for future identification tasks that use multichannel data.
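
The ShuffleNet V2 backbone unit named above relies on a channel shuffle step, which interleaves the channels of two branches so that information mixes between them at almost no computational cost. The abstract does not describe the optimized unit itself, so the following is only a minimal sketch of the standard shuffle operation, applied to channel labels rather than real feature maps. The function name `channel_shuffle` and the choice of two groups are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def channel_shuffle(channels: Sequence[T], groups: int = 2) -> List[T]:
    """Interleave channels across groups (hypothetical helper for illustration).

    Equivalent to viewing the channel axis as (groups, per_group),
    transposing to (per_group, groups), and flattening again.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    per_group = n // groups
    # Take the i-th channel of every group in turn.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Two branches of four channels each, labelled by branch for readability.
    branch_a = ["a0", "a1", "a2", "a3"]
    branch_b = ["b0", "b1", "b2", "b3"]
    print(channel_shuffle(branch_a + branch_b, groups=2))
    # prints ['a0', 'b0', 'a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```

In a real network the same index permutation would be applied along the channel axis of a feature tensor; the list version is used here only to make the reordering easy to inspect.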