This paper presents a framework to accurately and non-intrusively detect the number of people in an environment and track their positions. Different from most of the previous studies, our system setup uses only ambient thermal sensors with low-resolution, using no multimedia resources or wearable sensors. This preserves user privacy in the environment, and requires no active participation by the users, causing no discomfort. We first develop multiple methods to estimate the number of people in the environment. Our experiments demonstrate that algorithm selection is very important, but with careful selection, we can obtain up to 100% accuracy when detecting user presence. In addition, we prove that sensor placement plays a crucial role in the system performance, where placing the sensor on the room ceiling yields to the best results. After accurately finding the number of people in the environment, we perform position tracking on the collected ambient data, which are thermal images of the space where there are multiple people. We consider position tracking as static activity detection, where the user's position does not change while performing activities, such as sitting, standing, etc. We perform efficient pre-processing on the data, including normalization and resizing, and then feed the data into well-known machine learning methods. We tested the efficiency of our framework (including the hardware and software setup) by detecting four static activities. Our results show that we can achieved up to 97.5% accuracy when detecting these static activities, with up to 100% class-wise precision and recall rates. Our framework can be very beneficial to several applications such as health-care, surveillance, and home automation, without causing any discomfort or privacy issues for the users.