The recent trend toward IoT architectures has transformed standard camera networks into smart multi-device systems capable of acquiring, processing, and exchanging data and, often, of dynamically adapting to the environment. Along this line, this work proposes a novel distributed solution that guarantees the real-time monitoring of 3D indoor structured areas together with the tracking of multiple targets, by employing a heterogeneous visual sensor network composed of both fixed and Pan-Tilt-Zoom (PTZ) cameras. This twofold goal is achieved through a distributed game-theory-based algorithm that optimizes the controllable parameters of the PTZ devices. The proposed solution can handle the possibly conflicting requirements of high tracking precision and maximum coverage of the surveilled area. Extensive numerical simulations in realistic scenarios validate the effectiveness of the outlined strategy.