We present an approach to effectively use millions of images with noisy annotations in conjunction with a small subset of cleanly annotated images to learn powerful image representations. One common approach to combine clean and noisy data is to first pre-train a network using the large noisy dataset and then fine-tune with the clean dataset. We show this approach does not fully leverage the information contained in the clean set. Thus, we demonstrate how to use the clean annotations to reduce the noise in the large dataset before fine-tuning the network using both the clean set and the full set with reduced noise. The approach comprises a multi-task network that jointly learns to clean noisy annotations and to accurately classify images. We evaluate our approach on the recently released Open Images dataset, containing ∼9 million images, multiple annotations per image and over 6000 unique classes. For the small clean set of annotations we use a quarter of the validation set with ∼40k images. Our results demonstrate that the proposed approach clearly outperforms direct fine-tuning across all major categories of classes in the Open Images dataset. Further, our approach is particularly effective for a large number of classes with a wide range of noise in annotations (20-80% false positive annotations).
It is well known in the photometric stereo literature that uncalibrated photometric stereo, where light source strength and direction are unknown, can recover the surface geometry of a Lambertian object up to a 3-parameter linear transform known as the generalized bas-relief (GBR) ambiguity. Many techniques have been proposed for resolving the GBR ambiguity, typically by exploiting prior knowledge of the light sources, the object geometry, or non-Lambertian effects such as specularities. A less celebrated consequence of the GBR transformation is that the albedo at each surface point is transformed along with the geometry. Thus, it should be possible to resolve the GBR ambiguity by exploiting priors on the albedo distribution. To the best of our knowledge, the only time the albedo distribution has been used to resolve the GBR ambiguity is in the case of uniform albedo. We propose a new prior on the albedo distribution: that the entropy of the distribution should be low. This prior is justified by the fact that many objects in the real world are composed of a small finite set of albedo values.
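The low-entropy idea can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes the standard GBR parameterization with parameters `mu`, `nu`, `lam`, under which albedo-scaled normals b map to G^{-T} b, and it estimates entropy from a fixed-bin histogram. The function names, the bin count, the synthetic hemisphere of normals, and the two albedo values are all illustrative assumptions.

```python
import numpy as np


def gbr_matrix(mu, nu, lam):
    """GBR matrix G for parameters (mu, nu, lam); lam must be nonzero."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [mu, nu, lam]])


def apply_gbr(pseudo_normals, G):
    """Map each row b (albedo * unit normal) to G^{-T} b.

    For row vectors, (G^{-T} b)^T equals b^T G^{-1}.
    """
    return pseudo_normals @ np.linalg.inv(G)


def albedo_entropy(pseudo_normals, bins=64, value_range=(0.0, 2.0)):
    """Shannon entropy (nats) of a fixed-bin histogram of the albedo values."""
    albedo = np.linalg.norm(pseudo_normals, axis=1)  # albedo = |b|
    counts, _ = np.histogram(albedo, bins=bins, range=value_range)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())


def demo(seed=0, n_points=5000):
    """Compare albedo entropy before and after a GBR transformation."""
    rng = np.random.default_rng(seed)
    # Synthetic unit normals on the upper hemisphere (assumption).
    normals = rng.normal(size=(n_points, 3))
    normals[:, 2] = np.abs(normals[:, 2])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    # An object made of only two albedo values (assumption).
    albedo = rng.choice([0.4, 0.8], size=n_points)
    b_true = albedo[:, None] * normals
    b_gbr = apply_gbr(b_true, gbr_matrix(0.3, -0.2, 1.5))
    return albedo_entropy(b_true), albedo_entropy(b_gbr)


if __name__ == "__main__":
    h_true, h_gbr = demo()
    print(f"entropy of true albedo:       {h_true:.3f}")
    print(f"entropy after GBR distortion: {h_gbr:.3f}")
```

In this toy setting the distorted solution spreads the two albedo values across many histogram bins, so its entropy is higher than that of the undistorted solution. Searching over `mu`, `nu`, `lam` for the minimum-entropy albedo is one way such a prior could be used, although the search strategy is left unspecified here.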
We consider the problem of reconstructing the shape of a surface with an arbitrary, spatially varying isotropic bidirectional reflectance distribution function (BRDF), and introduce a novel, stratified photometric stereo method. By using a particular configuration of lights, it is possible to use symmetry in the image measurements resulting from BRDF isotropy to estimate at each point a plane containing the surface normal. For differentiable surfaces, this allows us to recover the isocontours of the depth map, but not the actual depth associated with each contour. The isocontour structure provides topological information about the surface (critical points, Reeb graph, etc.). By using additional cues in the image data or imposing additional constraints on the surface (e.g., shadows, specular highlights, Helmholtz reciprocity, uniform BRDF), the unknown height of each isocontour can be estimated and the metric structure resolved. We validate this technique on real and synthetic data by successfully recovering the isocontours of the depth map from images.