Background and purpose: Emotion classification models have focused on subject-level but not crowd-level signal inputs. We proposed sound-based community emotion recognition (SCED) as a new machine learning challenge and developed a novel deep learning-inspired forward-forward binary ternary pattern (FF-BTP)-based feature engineering model for crowd sentiment classification. Materials and Methods: From 187 YouTube videos, we curated a new dataset comprising 919, 905, and 909 (total 2,733) 3-second recordings (sampling frequency 44.1 kHz) of crowd sounds (overlapping speech, environmental noise, etc.) belonging to verified dominant negative, neutral, and positive emotion classes. Our model architecture combined a BTP-based textural feature extractor, which generated three feature vectors using signum, lower ternary, and upper ternary functions, with a novel handcrafted feature vector selection function, inspired by Hinton's forward-forward (FF) algorithm, that identified the most distinctive feature vector based on the maximum calculated mean squared error. Our FF-BTP-based model also incorporated an upstream multilevel discrete wavelet transform, which enabled multilevel feature generation in the spatial and frequency domains, as well as downstream iterative neighborhood component analysis-based feature selection and a support vector machine classifier. Results: The FF-BTP-based model attained an excellent 97.22% overall three-class classification accuracy on the SCED dataset. Conclusions: Our handcrafted feature engineering model emulated the deep feature characterization of deep learning models and attained excellent classification results at low computational cost. It can be implemented for emerging SCED-related applications. INDEX TERMS FF-BTP; sound community emotion classification; sound processing; textural feature extraction.
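The core BTP-plus-selection step described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the window width, threshold `t`, bit ordering, and 256-bin histogram are assumptions, and the "maximum MSE" criterion is read here as the candidate vector farthest (in mean squared error) from the mean of all three candidates.

```python
import numpy as np

def btp_features(signal, t=0.05, width=9):
    """Sketch of binary ternary pattern (BTP) textural feature extraction.

    Each sliding window compares the 8 neighbors of the center sample using
    three encoding functions (signum, lower ternary, upper ternary). Each
    function yields an 8-bit code; the histogram of codes over the whole
    signal forms one 256-bin feature vector per function.
    NOTE: window size, threshold t, and bit ordering are illustrative
    assumptions; the paper's exact BTP definition may differ.
    """
    half = width // 2
    weights = 2 ** np.arange(width - 1)  # 8 bits -> codes in 0..255
    hists = {name: np.zeros(256) for name in ("signum", "lower", "upper")}
    for i in range(half, len(signal) - half):
        center = signal[i]
        nbrs = np.concatenate([signal[i - half:i], signal[i + 1:i + half + 1]])
        codes = {
            "signum": (nbrs > center).astype(int),
            "lower":  (nbrs < center - t).astype(int),
            "upper":  (nbrs > center + t).astype(int),
        }
        for name, bits in codes.items():
            hists[name][int(bits @ weights)] += 1
    # Normalize each histogram to a probability distribution.
    return {k: v / v.sum() for k, v in hists.items()}

def ff_select(feature_vectors):
    """FF-inspired selection sketch: return the name of the candidate
    feature vector with the largest mean squared error from the mean of
    all candidates (an assumed reading of 'most distinctive')."""
    names = list(feature_vectors)
    stacked = np.stack([feature_vectors[n] for n in names])
    mse = ((stacked - stacked.mean(axis=0)) ** 2).mean(axis=1)
    return names[int(np.argmax(mse))]
```

In the full pipeline, this extraction-and-selection step would be applied to the raw signal and to each level of its wavelet decomposition, with the selected vectors concatenated before feature selection and classification.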