Food quality detection is of great importance for human health and industrial production. Common detection methods currently struggle to meet the need for fast, accurate, and non-destructive detection. In this work, an electronic nose (E-nose) detection method is proposed for identifying the quality level of a variety of food products. It combines a Convolutional Neural Network fused with a Wavelet Scattering Network (CNN-WSN) and a Kernel Extreme Learning Machine optimized by an Improved Seahorse Optimization algorithm (ISHO-KELM). In the feature extraction stage, the abstract features of the Convolutional Neural Network (CNN) are fused with the scattering features of the Wavelet Scattering Network (WSN), and the resulting CNN-WSN fusion features effectively characterize the original food quality information. In the classifier design and decision-making stage, chaotic mapping is used to initialize the population in the Seahorse Optimization Algorithm (SHO), which helps prevent SHO from becoming trapped in local optima. The kernel parameters and regularization coefficient of the Kernel Extreme Learning Machine (KELM) model are then optimized using the improved locomotion, predation, and reproduction behaviors of the seahorse population. This addresses the difficulty of selecting the key parameters of the model and thus improves the accuracy and generalization of the overall model. To validate the effectiveness of the proposed food quality detection model, an E-nose system was first built and used to collect milk quality data independently. The model was then tested on two publicly available food quality datasets as well as the self-collected milk quality dataset. The experimental results show that the proposed food quality detection method achieves good quality assessment performance on the different datasets.
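
The classifier stage described above can be illustrated with a minimal sketch, which should not be read as the implementation used in this work. It shows a chaotically initialized population of candidate (regularization coefficient, kernel parameter) pairs being scored with a KELM. Several points are assumptions made only for illustration: the logistic map as the chaotic map, the RBF kernel, the log-scale search bounds, the synthetic two-class data standing in for CNN-WSN fusion features, and all function names. The improved locomotion, predation, and reproduction updates of ISHO are not reproduced here; the sketch only evaluates the initial population.

```python
import numpy as np


def logistic_map_population(pop_size, lower, upper, x0=0.7, mu=4.0):
    """Chaotic initialization (assumed: logistic map), scaled into [lower, upper]."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    chaos = np.empty((pop_size, lower.size))
    x = x0
    for idx in np.ndindex(*chaos.shape):
        x = mu * x * (1.0 - x)  # logistic map iterate, stays within [0, 1]
        chaos[idx] = x
    return lower + chaos * (upper - lower)


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kelm_fit(X, y, C, gamma):
    """Solve (K + I/C) beta = T for the KELM output weights; T is one-hot."""
    classes = np.unique(y)
    T = (y[:, None] == classes[None, :]).astype(float)
    K = rbf_kernel(X, X, gamma)
    beta = np.linalg.solve(K + np.eye(len(X)) / C, T)
    return {"X": X, "beta": beta, "classes": classes, "gamma": gamma}


def kelm_predict(model, X):
    """Predict the class with the largest kernel-weighted output score."""
    scores = rbf_kernel(X, model["X"], model["gamma"]) @ model["beta"]
    return model["classes"][np.argmax(scores, axis=1)]


def fitness(candidate, X_tr, y_tr, X_va, y_va):
    """Validation accuracy of a KELM built from (log10 C, log10 gamma)."""
    log_c, log_gamma = candidate
    model = kelm_fit(X_tr, y_tr, C=10.0 ** log_c, gamma=10.0 ** log_gamma)
    return float(np.mean(kelm_predict(model, X_va) == y_va))


# Synthetic stand-in for fused features: two well-separated Gaussian classes.
rng = np.random.default_rng(0)


def make_blobs(n_per_class):
    X = np.vstack([rng.normal(-2.0, 0.5, size=(n_per_class, 2)),
                   rng.normal(2.0, 0.5, size=(n_per_class, 2))])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


X_train, y_train = make_blobs(40)
X_val, y_val = make_blobs(20)

# Each row is one candidate: (log10 C, log10 gamma). Bounds are assumed.
population = logistic_map_population(10, lower=[-2.0, -3.0], upper=[3.0, 2.0])
accuracies = [fitness(c, X_train, y_train, X_val, y_val) for c in population]
best_acc = max(accuracies)
best_candidate = population[int(np.argmax(accuracies))]
print("best (log10 C, log10 gamma):", best_candidate, "accuracy:", best_acc)
```

In a full optimizer, the validation accuracy returned by `fitness` would serve as the objective that the population update rules try to maximize. Searching in log scale is a common choice for these two parameters because useful values can span several orders of magnitude.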