Supervised object detection models require fully annotated data for training. However, labeling large datasets is very time-consuming, so weakly supervised object detection (WSOD) offers an alternative to fully supervised learning for the object detection task. Although many WSOD methods have been proposed to date, their performance is still lower than that of supervised approaches because WSOD is a very challenging task. The major problems with existing WSOD methods are partial object detection and false detection in clusters of objects of the same category. Most WSOD methods follow multiple instance learning approaches, which do not guarantee the completeness of detected objects. To address these issues, we propose a three-fold proposal refinement strategy for learning complete instances. We generate class-specific localization maps by fusing class activation maps obtained from fused complementary classification networks. These localization maps are used to amend the detected proposals from the instance classification branch (detection network). To further refine the proposals, we propose deep reinforcement learning networks that learn a decisive-agent and a rectifying-agent based on a policy gradient algorithm. The refined bounding boxes are then fed to the instance classification network. These refinement operations lead the model to learn complete objects and greatly improve detection performance. Experimental results on the PASCAL VOC2007 and VOC2012 benchmarks show that the proposed WSOD method achieves better detection performance than state-of-the-art methods.

INDEX TERMS Weakly supervised object detection, complementary learning, discriminative features, proposal refinement, class activation maps, reinforcement learning, and deep learning.
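
The idea of amending a partial proposal with a fused localization map can be illustrated with a minimal sketch. This is not the authors' implementation. The element-wise maximum used for fusion, the activation threshold, the IoU gate, and all function names (`fuse_cams`, `box_from_map`, `iou`, `amend_proposal`) are assumptions chosen for illustration, and the reinforcement learning agents are not covered.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive cell indices
Map = List[List[float]]


def fuse_cams(cam_a: Map, cam_b: Map) -> Map:
    """Fuse two equally sized activation maps (assumed rule: element-wise maximum)."""
    return [[max(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(cam_a, cam_b)]


def box_from_map(loc_map: Map, threshold: float) -> Optional[Box]:
    """Tight box around every cell whose activation is at least `threshold`."""
    xs, ys = [], []
    for y, row in enumerate(loc_map):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection over union for boxes with inclusive coordinates."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def amend_proposal(proposal: Box, loc_map: Map,
                   threshold: float = 0.5, min_iou: float = 0.3) -> Box:
    """Hypothetical rule: swap in the map box when it overlaps the proposal enough."""
    map_box = box_from_map(loc_map, threshold)
    if map_box is None or iou(proposal, map_box) < min_iou:
        return proposal
    return map_box


# Two toy maps that each respond to a different half of the same object.
cam_a = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.9, 0.8, 0.0],
         [0.0, 0.1, 0.2, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
cam_b = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.1, 0.2, 0.0],
         [0.0, 0.7, 0.9, 0.0],
         [0.0, 0.0, 0.0, 0.0]]

fused = fuse_cams(cam_a, cam_b)
partial = (1, 1, 2, 1)            # proposal covering only the upper half
amended = amend_proposal(partial, fused)
print(amended)                    # (1, 1, 2, 2)
```

In this toy case each map alone covers only half of the object, while the fused map covers all of it, so the partial proposal is extended to the full extent. A proposal that does not overlap the map box is left unchanged.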