A prerequisite for using smart camera networks effectively is a precise extrinsic calibration of the camera sensors, either in a fixed coordinate system or relative to each other. For cameras with partly overlapping fields of view, the relative pose estimation may be performed directly on, or assisted by, the video content obtained during scene analysis. Under typical conditions, however (wide baselines, repetitive patterns, homogeneous appearance of pedestrians), the pose estimation is imprecise and often affected by large errors in weakly constrained areas of the field of view. In this work, we propose to rely on progressively stricter constraints on the feature association between the camera views, guided by a pedestrian detector and by a re-identification algorithm, respectively. The results show that the two strategies are effective in alleviating the ambiguity caused by the similar appearance of pedestrians in such scenes, and in improving the relative pose estimation.
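To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how detection- and re-identification-guided association could feed a standard two-view relative pose estimate. It assumes OpenCV and hypothetical inputs: `boxes_a`/`boxes_b` are pedestrian bounding boxes in each view and `matches_ab` are box pairs associated by a re-identification model; feature matching is then restricted to those corresponding regions before the essential matrix is estimated with RANSAC.

```python
import cv2
import numpy as np

def relative_pose_from_pedestrians(img_a, img_b, boxes_a, boxes_b, matches_ab, K_a, K_b):
    """Estimate the relative pose (R, t up to scale) between two camera views.

    boxes_a, boxes_b : pedestrian bounding boxes (x, y, w, h) in each view.
    matches_ab       : (i, j) pairs associating boxes_a[i] with boxes_b[j],
                       e.g. produced by a re-identification model (assumed input).
    K_a, K_b         : 3x3 intrinsic matrices of the two cameras.
    """
    sift = cv2.SIFT_create()
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pts_a, pts_b = [], []

    for i, j in matches_ab:
        xa, ya, wa, ha = boxes_a[i]
        xb, yb, wb, hb = boxes_b[j]
        # Restrict feature extraction to the matched pedestrian regions, so that
        # correspondences are only sought between the same person in both views.
        kp_a, des_a = sift.detectAndCompute(img_a[ya:ya + ha, xa:xa + wa], None)
        kp_b, des_b = sift.detectAndCompute(img_b[yb:yb + hb, xb:xb + wb], None)
        if des_a is None or des_b is None:
            continue
        for pair in matcher.knnMatch(des_a, des_b, k=2):
            if len(pair) < 2:
                continue
            m, n = pair
            if m.distance < 0.75 * n.distance:  # Lowe's ratio test
                pa, pb = kp_a[m.queryIdx].pt, kp_b[m.trainIdx].pt
                pts_a.append([pa[0] + xa, pa[1] + ya])  # back to full-image coords
                pts_b.append([pb[0] + xb, pb[1] + yb])

    pts_a = np.float32(pts_a).reshape(-1, 1, 2)
    pts_b = np.float32(pts_b).reshape(-1, 1, 2)

    # Normalise with the intrinsics, estimate the essential matrix with RANSAC,
    # and recover the relative rotation and unit-norm translation.
    pts_a_n = cv2.undistortPoints(pts_a, K_a, None)
    pts_b_n = cv2.undistortPoints(pts_b, K_b, None)
    E, inliers = cv2.findEssentialMat(pts_a_n, pts_b_n, np.eye(3),
                                      method=cv2.RANSAC, prob=0.999, threshold=1e-3)
    _, R, t, _ = cv2.recoverPose(E, pts_a_n, pts_b_n, np.eye(3), mask=inliers)
    return R, t
```

In this sketch the re-identification step acts as the stricter of the two constraints: the detector limits candidate regions to pedestrians, while the re-ID matches prevent features from one person being associated with a different, similar-looking person in the other view.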