Occupancy-driven applications have been an active research area for a decade, with a focus on improving or replacing building infrastructure to raise building energy efficiency. Existing approaches to HVAC energy saving place increasing emphasis on occupancy detection, estimation, and localization in order to balance energy consumption against thermal comfort. In a non-intrusive approach, sensors, actuators, and data analytics methods are commonly used to process data from the occupants' surroundings and trigger an appropriate action. However, the performance of non-intrusive approaches reported in the literature is relatively poor, owing to the low quality of the datasets used for model training and an inappropriate choice of machine learning model. This study proposes a non-intrusive approach that improves the collection and quality of the dataset through data pre-processing. A training dataset was collected using various sensors installed in the building, and five machine learning models were trained to detect occupant presence and estimate the number of occupants in the building. The proposed solution was tested in a living room with a prototype system that integrates various sensors to collect environmental data from the occupants' surroundings. The prediction results indicate that the proposed solution can acquire and process the data and predict the number of occupants with high accuracy (73.6–99.7% using random forest).
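
As an illustration of the kind of data pre-processing referred to above, the following is a minimal sketch and not the study's actual pipeline. The feature names (CO2, temperature, humidity), the validity ranges, and the helper functions `clean`, `min_max_scale`, and `preprocess` are all assumptions introduced for the example; the abstract does not specify which sensors or cleaning steps were used.

```python
# Hypothetical record layout (an assumption, not taken from the study):
#   (co2_ppm, temperature_c, humidity_pct, occupant_count)

# Assumed plausible ranges, used only to discard obvious sensor glitches.
VALID_RANGES = [(300.0, 5000.0), (0.0, 50.0), (0.0, 100.0)]


def clean(records):
    """Drop records with missing or out-of-range sensor readings."""
    kept = []
    for *features, label in records:
        if any(v is None for v in features):
            continue  # incomplete reading
        if all(lo <= v <= hi for v, (lo, hi) in zip(features, VALID_RANGES)):
            kept.append((list(features), label))
    return kept


def min_max_scale(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*(features for features, _ in rows)))
    bounds = [(min(col), max(col)) for col in columns]
    scaled = []
    for features, label in rows:
        scaled_features = [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(features, bounds)
        ]
        scaled.append((scaled_features, label))
    return scaled


def preprocess(records):
    """Clean the raw records, then normalise the surviving features."""
    return min_max_scale(clean(records))
```

The cleaned and scaled rows could then be passed to any classifier, such as a random forest, to predict the occupant count; the scaling step matters more for distance-based or gradient-based models than for tree ensembles.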