A conventional approach to learning object detectors uses fully supervised learning techniques which assumes that a training image set with manual annotation of object bounding boxes are provided. The manual annotation of objects in large image sets is tedious and unreliable. Therefore, a weakly supervised learning approach is desirable, where the training set needs only binary labels regarding whether an image contains the target object class. In the weakly supervised approach a detector is used to iteratively annotate the training set and learn the object model. We present a novel weakly supervised learning framework for learning an object detector. Our framework incorporates a new initial annotation model to start the iterative learning of a detector and a model drift detection method that is able to detect and stop the iterative learning when the detector starts to drift away from the objects of interest. We demonstrate the effectiveness of our approach on the challenging PASCAL 2007 dataset.
We propose a principled probabilistic formulation of object saliency as a sampling problem. This novel formulation allows us to learn, from a large corpus of unlabelled images, which patches of an image are of the greatest interest and most likely to correspond to an object. We then sample the object saliency map to propose object locations. We show that using only a single object location proposal per image, we are able to correctly select an object in over 42% of the images in the PASCAL VOC 2007 dataset, substantially outperforming existing approaches. Furthermore, we show that our object proposal can be used as a simple unsupervised approach to the weakly supervised annotation problem. Our simple unsupervised approach to annotating objects of interest in images achieves a higher annotation accuracy than most weakly supervised approaches.
Most existing approaches to training object detectors rely on fully supervised learning, which requires the tedious manual annotation of object location in a training set. Recently there has been an increasing interest in developing weakly supervised approach to detector training where the object location is not manually annotated but automatically determined based on binary (weak) labels indicating if a training image contains the object. This is a challenging problem because each image can contain many candidate object locations which partially overlaps the object of interest. Existing approaches focus on how to best utilise the binary labels for object location annotation. In this paper we propose to solve this problem from a very different perspective by casting it as a transfer learning problem. Specifically, we formulate a novel transfer learning based on learning to rank, which effectively transfers a model for automatic annotation of object location from an auxiliary dataset to a target dataset with completely unrelated object categories. We show that our approach outperforms existing state-of-the-art weakly supervised approach to annotating objects in the challenging VOC dataset.
The detection of human action in videos of busy natural scenes with dynamic background is of interest for applications such as video surveillance. Taking a conventional fully supervised approach, the spatio-temporal locations of the action of interest have to be manually annotated frame by frame in the training videos, which is tedious and unreliable. In this paper, for the first time, a weakly supervised action detection method is proposed which only requires binary labels of the videos indicating the presence of the action of interest. Given a training set of binary labelled videos, the weakly supervised learning (WSL) problem is recast as a multiple instance learning (MIL) problem. A novel MIL algorithm is developed which differs from the existing MIL algorithms in that it locates the action of interest spatially and temporally by globally optimising both interand intra-class distance. We demonstrate through experiments that our WSL approach can achieve comparable detection performance to a fully supervised learning approach, and that the proposed MIL algorithm significantly outperforms the existing ones.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.