Abstract-We propose a system that acquires high-resolution image sequences of a person's face using a pan-tilt-zoom (PTZ) network camera. This capability should prove helpful in forensic analysis of video sequences, since frames containing faces are tagged and, within a frame, the windows containing faces can be retrieved. The system starts in pedestrian detection mode, where the lens is set to its widest angle, and detects people using a pedestrian detector module. The camera then switches to region of interest (ROI) focusing mode, where the camera parameters are automatically tuned to bring the upper body of the detected person, where the face should appear, into the field of view (FOV). Next, in face detection mode, the face is detected using a face detector module, and the system switches to an active tracking mode consisting of a control loop that follows the detected face with two modules: a tracker that tracks the face in the image, and a camera control module that adjusts the camera parameters. During this loop, our tracker learns the face appearance online, in multiple views and under changing conditions. It runs robustly at 15 fps and is able to reacquire the face of interest after total occlusion or after the face leaves the FOV. We compare our tracker with various state-of-the-art tracking methods in terms of precision and running time. Extensive experiments in challenging indoor and outdoor conditions validate the complete system.
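
The sequence of operating modes described in the abstract can be read as a small state machine. The following is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `Mode`, `next_mode`, `run`, and the boolean `step_ok` are hypothetical, and the retry-on-failure behavior is an assumption, since the abstract does not say what happens when a detection step fails.

```python
from enum import Enum, auto


class Mode(Enum):
    PEDESTRIAN_DETECTION = auto()  # widest lens angle, look for people
    ROI_FOCUSING = auto()          # tune camera so the upper body is in the FOV
    FACE_DETECTION = auto()        # run the face detector on the zoomed view
    ACTIVE_TRACKING = auto()       # control loop: face tracker + camera control


# Order of modes as listed in the abstract.
MODE_ORDER = [
    Mode.PEDESTRIAN_DETECTION,
    Mode.ROI_FOCUSING,
    Mode.FACE_DETECTION,
    Mode.ACTIVE_TRACKING,
]


def next_mode(mode, step_ok):
    """Return the mode to use after one step in `mode`.

    `step_ok` is a hypothetical flag meaning "this mode's step succeeded":
    a person was detected, the ROI was framed, or a face was detected.
    """
    if not step_ok:
        # Assumption: on failure, stay in the same mode and retry.
        return mode
    if mode is Mode.ACTIVE_TRACKING:
        # The tracking loop keeps running once it has been entered.
        return mode
    return MODE_ORDER[MODE_ORDER.index(mode) + 1]


def run(step_results):
    """Replay a list of per-step success flags and return the visited modes."""
    mode = Mode.PEDESTRIAN_DETECTION
    trace = [mode]
    for step_ok in step_results:
        mode = next_mode(mode, step_ok)
        trace.append(mode)
    return trace
```

For example, `run([False, True, True, True])` stays in pedestrian detection for one failed step and then advances through ROI focusing and face detection to active tracking. Keeping the transition logic in a pure function such as `next_mode` makes the mode sequence easy to test separately from any camera or detector code.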