This paper proposes a novel tool detection and tracking approach for computer-aided surgical interventions that operates on uncalibrated monocular surgical videos. We hypothesize that the end-effector is the most distinguishable part of a surgical tool, and we employ state-of-the-art object detection methods to learn its shape and localize the tool in images. For tracking, we propose a generalized object tracking framework based on a Product of Tracking Experts (PoTE), which probabilistically merges the outputs (probabilistic or non-probabilistic) of a time-varying number of trackers. The current implementation of PoTE uses three tracking experts: point-feature-based, region-based, and object-detection-based. We also propose a novel point-feature-based tracker in the form of a voting-based technique that estimates bounding-box geometry from point-feature correspondences. Our tracker is causal, which makes it suitable for real-time applications. The framework has been tested on real surgical videos and is shown to significantly improve upon the baseline results.
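To make the idea of merging tracker outputs concrete, the following is a minimal sketch of one common way to form a product of experts; it is not the paper's implementation. It assumes each expert reports a bounding box (x, y, w, h) modeled as an independent Gaussian with diagonal covariance, so the product is again Gaussian with precision-weighted mean. The function name `fuse_experts` and the choice to represent a non-probabilistic expert by a hand-assigned variance are hypothetical, introduced only for illustration.

```python
import numpy as np


def fuse_experts(means, variances):
    """Fuse bounding-box estimates as a product of diagonal Gaussians.

    Assumption (not from the paper): each expert i outputs a Gaussian
    N(mean_i, diag(var_i)) over the box parameters (x, y, w, h).
    The normalized product of such Gaussians is Gaussian with
      precision = sum of expert precisions
      mean      = precision-weighted average of expert means.

    means, variances: array-likes of shape (n_experts, n_params).
    n_experts may differ from frame to frame.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    if means.ndim != 2 or means.shape != variances.shape or means.shape[0] == 0:
        raise ValueError("expected matching (n_experts, n_params) arrays")
    if np.any(variances <= 0):
        raise ValueError("variances must be strictly positive")

    precisions = 1.0 / variances
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * means).sum(axis=0)
    return fused_mean, fused_var
```

Because the function only sums over whichever experts are present, a frame in which one tracker fails can be handled by passing fewer rows. An expert with a large variance contributes little to the fused box, which is one simple way a non-probabilistic tracker could be included under the stated assumption.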