Detecting, localizing and tracking humans within an industrial environment are three tasks of central importance to achieving automation in workplaces and intelligent environments. Unobtrusive, real-time and reliable person tracking provides valuable input for problems such as workplace surveillance and event/activity recognition, and also contributes to safety and the optimized use of resources. This paper presents a passive approach to person tracking that is based on a network of conventional color cameras. The proposed approach is robust to the challenging conditions encountered in industrial environments, namely illumination artifacts, occlusions and the highly dynamic nature of the observed scenes. The multiple views of the environment are used to obtain, in real time, a volumetric representation of the humans within it. Although human tracking can be achieved based solely on such a volumetric representation, in demanding scenes this information is not enough to recover from tracking failures. Thus, in this work, we also build and continuously update a representation of the color appearance of the persons in the environment. The combination of volumetric and color information reinforces tracking robustness, even when a person is not visible to any of the cameras for extended time intervals. The proposed approach has been extensively evaluated in comparison with an existing state-of-the-art method, and pertinent results are reported.