Indoor human tracking and activity recognition are fundamental and closely related problems in ambient assisted living. In this paper, we propose a method that addresses both simultaneously. We construct a wireless sensor network (WSN) whose sensor nodes consist of pyroelectric infrared (PIR) sensor arrays. To capture spatio-temporal information about the human target, the field of view (FOV) of each PIR sensor is modulated by masks. A modified particle filter algorithm is used to decode the location of the human target. To exploit the synergy between location and activity, we design a two-layer random forest (RF) classifier: the first layer produces an initial activity recognition result, which the second-layer RF refines by incorporating additional features. We conducted experiments in a mock apartment. The mean localization error of our system is about 0.85 m. For five kinds of daily activities, the mean accuracy under 10-fold cross-validation is above 92%. These results indicate the effectiveness of our system.
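
As an illustration of the location-decoding step, the following is a minimal sketch of a generic bootstrap particle filter in Python. It is not the modified algorithm used in this work. It assumes, purely for illustration, that the masked PIR readings have already been reduced to a noisy 2-D position measurement, that the target follows a random-walk motion model, and that the measurement noise is Gaussian. The function names and the parameters `motion_std` and `meas_std` are hypothetical.

```python
import math
import random


def particle_filter_step(particles, measurement,
                         motion_std=0.3, meas_std=0.8, rng=random):
    """One predict/update/resample cycle of a generic bootstrap particle filter.

    particles:   list of (x, y) position hypotheses in metres
    measurement: (x, y) noisy position observation (assumed form, see lead-in)
    """
    # Predict: diffuse each particle with an assumed random-walk motion model.
    moved = [(x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
             for x, y in particles]

    # Update: weight each particle by an assumed Gaussian measurement likelihood.
    mx, my = measurement
    weights = [math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2.0 * meas_std ** 2))
               for x, y in moved]
    if sum(weights) == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0] * len(moved)

    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Point estimate of the target location: the mean of the particles."""
    n = len(particles)
    return (sum(x for x, _ in particles) / n,
            sum(y for _, y in particles) / n)
```

In use, the particle set would be initialized uniformly over the floor plan, `particle_filter_step` would be called once per sensing interval, and `estimate` would supply the tracked location that can then serve as an input feature to the activity classifier.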