In the field of human-robot interaction (HRI), the detection, tracking, and re-identification of humans in a robot's surroundings are crucial tasks, e.g., for socially compliant robot navigation. Beyond detecting a person's 3D position, estimating his or her upper-body orientation from monocular camera images is a challenging problem on a mobile platform. To obtain real-time position tracking as well as upper-body orientation estimates, the proposed system comprises discriminative detectors whose hypotheses are tracked by a Kalman filter-based multi-hypothesis tracker. For appearance-based person recognition, a generative approach based on a 3D shape model is used to refine these tracked hypotheses; the model evaluates edge cues and color-based discrimination against the background. Furthermore, the upper-body texture of each person is learned and used for person re-identification. When computational resources are limited, the update rate of the model-based optimization is reduced automatically; the estimation accuracy decreases, but the system keeps tracking the persons around the robot in real time. A person's 3D pose is tracked up to a distance of 5.0 m with an average Euclidean error of 18 cm, and the achieved motion-independent average upper-body orientation error is 22°. Furthermore, the upper-body texture is learned online, which enabled stable person re-identification in our experiments.
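To make the tracking stage concrete, the following is a minimal sketch of a Kalman filter update for a single tracked person hypothesis, of the kind the described multi-hypothesis tracker maintains per person. The state vector, the constant-velocity motion model, and the noise parameters are illustrative assumptions; the paper does not specify these details.

```python
import numpy as np

# Minimal constant-velocity Kalman filter for one tracked 3D person hypothesis.
# State x = [px, py, pz, vx, vy, vz]; state layout and noise magnitudes are
# assumed for illustration, not taken from the paper.

class PersonTrack:
    def __init__(self, position, dt=0.1):
        self.x = np.hstack([position, np.zeros(3)])         # initial state
        self.P = np.eye(6)                                  # state covariance
        self.F = np.eye(6)                                  # constant-velocity motion model
        self.F[:3, 3:] = dt * np.eye(3)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])   # we observe position only
        self.Q = 0.01 * np.eye(6)                           # process noise (assumed)
        self.R = 0.05 * np.eye(3)                           # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.H @ self.x                              # predicted 3D position

    def update(self, z):
        """Fuse a detector hypothesis z = [px, py, pz] into the track."""
        y = z - self.H @ self.x                             # innovation
        S = self.H @ self.P @ self.H.T + self.R             # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)            # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

track = PersonTrack(position=np.array([2.0, 0.5, 0.0]))
track.predict()
track.update(np.array([2.05, 0.52, 0.0]))
print(track.x[:3])  # refined 3D position estimate
```

In the full system, one such filter would run per hypothesis, with the generative 3D shape model refining the tracked estimates at a rate that adapts to the available computational resources.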