Thermostat control for heating, ventilation, and air-conditioning (HVAC) systems in commercial and residential buildings remains a pertinent problem in building energy efficiency and thermal comfort research. Knowing how many people occupy an area at a given time is essential for energy efficiency, because it allows the system to condition only occupied and thermally deficient regions. This study compares features for detecting the number of people in an area from thermal camera imagery. The feature extraction techniques explored are wavelet scattering, wavelet decomposition, the grey-level co-occurrence matrix (GLCM), and feature maps from convolutional neural network (CNN) layers. Specifically, the pretrained CNNs explored are the deep residual (ResNet-50) and visual geometry group (VGG-16) networks. The discriminating potential of Haar, Daubechies, and Symlets wavelet statistics on different distributions of data was investigated, as was the end-to-end performance of VGG-16 and ResNet-50 under a transfer learning approach. Experimental results showed that classification and regression trees (CART) models trained on GLCM features alone and on Haar wavelet statistic features alone achieved accuracies of approximately 80% and 84%, respectively, in the detection problem. Moreover, k-nearest neighbors (KNN) trained on the combined GLCM and Haar wavelet statistic features achieved an accuracy of approximately 86%. In addition, the multiclass support vector machine (SVM) trained on deep features obtained from layers of pretrained ResNet-50 and VGG-16 achieved accuracies between 96% and 97%. Furthermore, the ResNet-50 transfer learning model outperformed the VGG-16 transfer learning model for occupancy detection using thermal imagery. Overall, the SVM model trained on features extracted from wavelet scattering emerged as the best-performing classifier, with an accuracy of 100%.
A principal component analysis (PCA) of the wavelet scattering features showed that training on only the first twenty (20) principal components achieved accuracy similar to training on the whole feature set, while reducing execution time. The occupancy detection models can be integrated into HVAC control systems for energy efficiency, incorporated into security systems, and used to aid the distribution of resources to people in an area.
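To make the hand-crafted features concrete, the following is a minimal sketch of how GLCM statistics and Haar wavelet subband statistics could be combined into one feature vector for a single quantized thermal frame. It is an illustration under stated assumptions, not the implementation used in the study: the function names (`glcm_features`, `haar_level1`, `combined_features`), the single horizontal GLCM offset, the three GLCM properties, and the choice of mean and standard deviation per subband are all assumptions made here for the example.

```python
from statistics import mean, pstdev


def glcm_features(img, levels):
    """Contrast, energy, and homogeneity from a horizontal-neighbour GLCM.

    Assumption: a single, non-symmetric offset (0, 1); `img` holds integer
    grey levels in the range [0, levels).
    """
    glcm = [[0.0] * levels for _ in range(levels)]
    pairs = 0
    for row in img:
        for a, b in zip(row, row[1:]):  # each pixel and its right neighbour
            glcm[a][b] += 1
            pairs += 1
    contrast = energy = homogeneity = 0.0
    for i in range(levels):
        for j in range(levels):
            p = glcm[i][j] / pairs  # normalise counts to joint probabilities
            contrast += p * (i - j) ** 2
            energy += p * p
            homogeneity += p / (1 + abs(i - j))
    return [contrast, energy, homogeneity]


def haar_level1(img):
    """One level of the orthonormal 2D Haar transform on an even-sized image.

    Returns four flat lists: approximation, horizontal, vertical, and
    diagonal detail coefficients (one value per 2x2 block).
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img), 2):
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            approx.append((a + b + d + e) / 2)
            horiz.append((a + b - d - e) / 2)  # difference between rows
            vert.append((a - b + d - e) / 2)   # difference between columns
            diag.append((a - b - d + e) / 2)
    return approx, horiz, vert, diag


def combined_features(img, levels):
    """Concatenate GLCM properties with mean/std of each Haar subband."""
    feats = glcm_features(img, levels)
    for band in haar_level1(img):
        feats.extend([mean(band), pstdev(band)])
    return feats


# Toy 4x4 "frame" quantised to 4 grey levels (hypothetical data).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
features = combined_features(toy, levels=4)
print(len(features), features)
```

A vector of this kind would then be passed to a classifier such as CART or KNN. In practice, libraries such as scikit-image and PyWavelets provide optimized versions of both steps; the pure-Python form above is only meant to show what the features measure.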