The bionic-based electronic nose (e-nose) and electronic tongue (e-tongue) perform satisfactorily in flavor analysis. Traditional flavor analysis with e-nose and e-tongue systems focuses on data fusion, and the effects of bionic characteristics on flavor analysis performance are rarely studied. Motivated by this, a method comprising an olfactory-taste synesthesia model (OTSM) and a convolutional neural network-random forest (CNN-RF) is proposed for the effective identification of flavor substances. The OTSM is developed from human nerve conduction mechanisms to enhance the bionic characteristics of the e-nose and e-tongue systems, and it is combined with a CNN-RF model for flavor identification. The results show three things. First, when stimulated by e-nose and e-tongue data, the OTSM exhibits physiological 1/f characteristics and synchronization, and these two properties are used to validate the enhancement of the bionic characteristics of the fusion system. Second, the fully connected layer of the CNN is replaced by an RF to improve the identification performance for flavor substances. Finally, CNN-RF is compared with other flavor recognition models and examined in ablation studies to confirm its effectiveness. In this comparison, CNN-RF achieves the best recognition performance for five beers, five apples, and four mixed solutions, with accuracies of 96.67%, 96.67%, and 95.00%, F1-scores of 96.65%, 96.66%, and 94.95%, and kappa coefficients of 96.03%, 96.10%, and 93.44%, respectively. In conclusion, the fusion system achieves excellent flavor identification using the OTSM and CNN-RF models.
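The three reported metrics (accuracy, F1-score, and kappa coefficient) can all be derived from a single confusion matrix. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes a macro-averaged F1-score and Cohen's kappa, neither of which is specified in the abstract, and the labels in the usage lines are hypothetical toy data rather than results from the study.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Build a matrix whose rows index the true class and columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def macro_f1(cm: List[List[int]]) -> float:
    """Unweighted mean of per-class F1 (assumption: macro averaging)."""
    k = len(cm)
    scores = []
    for i in range(k):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(k)) - tp  # predicted i, true class differs
        fn = sum(cm[i]) - tp                       # true i, predicted class differs
        denom = 2 * tp + fp + fn
        # Convention in this sketch: a class with no support and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / k


def cohen_kappa(cm: List[List[int]]) -> float:
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = accuracy(cm)
    # Chance agreement from the row (true) and column (predicted) marginals.
    p_e = sum(sum(cm[i]) * sum(cm[r][i] for r in range(k)) for i in range(k)) / total ** 2
    # Convention in this sketch: return 0 when kappa is undefined (p_e == 1).
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0


# Hypothetical toy labels for a three-class problem (not data from the study).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]
toy_cm = confusion_matrix(toy_true, toy_pred, n_classes=3)
print(accuracy(toy_cm), macro_f1(toy_cm), cohen_kappa(toy_cm))
```

Because kappa subtracts the agreement expected by chance, it is generally lower than accuracy on the same predictions, which is consistent with the ordering of the values reported above.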