Appearance information alone is often insufficient to accurately differentiate between fine-grained visual categories. Human experts make use of additional cues, such as where and when a given image was taken, to inform their final decision. This contextual information is readily available in many online image collections but has been underutilized by existing image classifiers, which focus solely on making predictions from the image contents. We propose an efficient spatio-temporal prior that, when conditioned on a geographical location and time, estimates the probability that a given object category occurs at that location. Our prior is trained from presence-only observation data and jointly models object categories, their spatio-temporal distributions, and photographer biases. Experiments performed on multiple challenging image classification datasets show that combining our prior with the predictions from image classifiers results in a large improvement in final classification performance.
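The core idea of combining an appearance-based classifier with a spatio-temporal prior can be sketched as a per-category reweighting of the classifier's scores. The following is a minimal illustration only: the paper's prior is a learned model, not a fixed lookup, and the function name, example scores, and the simple multiply-and-renormalize rule are all assumptions made here for clarity.

```python
def combine_with_prior(classifier_probs, prior_probs):
    """Reweight per-category classifier scores by a location/time prior
    and renormalize. A minimal sketch of the combination step; the real
    prior is a trained neural model conditioned on (location, time)."""
    combined = [c * p for c, p in zip(classifier_probs, prior_probs)]
    total = sum(combined)
    return [x / total for x in combined]

# Hypothetical example: two visually similar species are nearly tied on
# appearance, but the prior says species 0 is far more likely at this
# location and date, which resolves the ambiguity.
image_scores = [0.48, 0.47, 0.05]
location_prior = [0.70, 0.05, 0.25]
posterior = combine_with_prior(image_scores, location_prior)
```

Here the prior breaks the near-tie in favor of species 0, which is the behavior the abstract describes: context disambiguates categories that appearance alone cannot separate.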
Predicting all applicable labels for a given image is known as multi-label classification. Compared to the standard multi-class case (where each image has only one label), it is considerably more challenging to annotate training data for multi-label classification. When the number of potential labels is large, human annotators find it difficult to mention all applicable labels for each training image. Furthermore, in some settings detection is intrinsically difficult, e.g., finding small object instances in high-resolution images. As a result, multi-label training data is often plagued by false negatives. We consider the hardest version of this problem, where annotators provide only one relevant label for each image. As a result, training sets will have only one positive label per image and no confirmed negatives. We explore this special case of learning from missing labels across four different multi-label image classification datasets for both linear classifiers and end-to-end finetuned deep networks. We extend existing multi-label losses to this setting and propose novel variants that constrain the number of expected positive labels during training. Surprisingly, we show that in some cases it is possible to approach the performance of fully labeled classifiers despite training with significantly fewer confirmed labels.
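The single-positive setting described above can be made concrete with a toy loss: treat the one observed label as positive, assume the unobserved labels are negative, and add a term that pulls the expected number of predicted positives toward a target count. This is a simplified sketch under those assumptions; the function names, the choice of a squared penalty, and the weighting constants are illustrative, not the paper's exact formulation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def single_positive_loss(logits, positive_idx, k=2.0, lam=0.1):
    """Binary cross-entropy for one image with a single observed positive:
    the observed label is treated as positive, all unobserved labels are
    assumed negative, and a regularizer pulls the expected number of
    predicted positives toward k. Illustrative sketch only."""
    probs = [sigmoid(z) for z in logits]
    loss = -math.log(probs[positive_idx])          # observed positive
    for i, p in enumerate(probs):
        if i != positive_idx:
            loss += -math.log(1.0 - p)             # assume-negative term
    expected_positives = sum(probs)                # sum of per-label probabilities
    loss += lam * (expected_positives - k) ** 2    # expected-positives constraint
    return loss
```

As expected, the loss is much smaller when the model scores the single observed label highly than when it confidently predicts an unobserved label instead.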
Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. However, to date the vast majority of these approaches have restricted themselves to training on standard benchmark datasets such as ImageNet. We argue that fine-grained visual categorization problems, such as plant and animal species classification, provide an informative testbed for self-supervised learning. In order to facilitate progress in this area we present two new natural world visual classification datasets, iNat2021 and NeWT. The former consists of 2.7M images from 10k different species uploaded by users of the citizen science application iNaturalist. We designed the latter, NeWT, in collaboration with domain experts with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification. These two new datasets allow us to explore questions related to large-scale representation and transfer learning in the context of fine-grained categories. We provide a comprehensive analysis of feature extractors trained with and without supervision on ImageNet and iNat2021, shedding light on the strengths and weaknesses of different learned features across a diverse set of tasks. We find that features produced by standard supervised methods still outperform those produced by self-supervised approaches such as SimCLR. However, improved self-supervised learning methods are constantly being released, and the iNat2021 and NeWT datasets are a valuable resource for tracking their progress.
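Evaluating frozen features on downstream tasks, as described above, is commonly done by fitting a lightweight classifier on top of the extracted representations. As a stand-in for that protocol, here is a deliberately trivial nearest-class-mean probe; real benchmarks typically fit a logistic-regression classifier instead, and the function name and toy features below are assumptions made for illustration.

```python
def nearest_centroid_probe(train_feats, train_labels, test_feats):
    """Classify test features by the nearest class centroid computed
    from frozen training features. A toy proxy for linear evaluation."""
    classes = sorted(set(train_labels))
    centroids = {}
    for c in classes:
        members = [f for f, y in zip(train_feats, train_labels) if y == c]
        dim = len(members[0])
        centroids[c] = [sum(f[d] for f in members) / len(members)
                        for d in range(dim)]
    preds = []
    for f in test_feats:
        # Squared Euclidean distance to each class centroid.
        dist = lambda c: sum((a - b) ** 2 for a, b in zip(f, centroids[c]))
        preds.append(min(classes, key=dist))
    return preds

# Toy 2D "features": two well-separated classes.
train = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels = [0, 0, 1, 1]
preds = nearest_centroid_probe(train, labels, [[0.05, 0.1], [4.9, 5.2]])
```

The quality of the frozen features, not the probe, dominates such evaluations, which is why this protocol is used to compare supervised and self-supervised extractors.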
The peripheral retina of the human eye offers a unique opportunity for assessment and monitoring of ocular diseases. We have developed a novel wide-field (>70°) optical coherence tomography system (WF-OCT) equipped with wavefront sensorless adaptive optics (WSAO) for enhancing the visualization of smaller (<25°) targeted regions in the peripheral retina. We iterated the WSAO algorithm at the speed of individual OCT B-scans (~20 ms) by using raw spectral interferograms to calculate the optimization metric. Our WSAO approach with a 3 mm beam diameter permitted primarily low- but also high-order peripheral wavefront correction in less than 10 seconds. In preliminary imaging studies in five normal human subjects, we quantified statistically significant changes with WSAO correction, corresponding to a 10.4% improvement in average pixel brightness (signal) and a 7.0% improvement in high-frequency content (resolution) when visualizing 1 mm (~3.5°) B-scans of the peripheral (>23°) retina. We demonstrated the ability of our WF-OCT system to rapidly acquire non-wavefront-corrected wide-field images, which could then be used to locate regions of interest, zoom into targeted features, and visualize the same region at different time points. A pilot clinical study was conducted on seven healthy volunteers and two subjects with prodromal Alzheimer's disease, which illustrated the capability to image Drusen-like pathologies as far as 32.5° from the fovea in un-averaged volume scans. This work suggests that the proposed combination of WF-OCT and WSAO may find applications in the diagnosis and treatment of ocular, and potentially neurodegenerative, diseases of the peripheral retina, including diabetes and Alzheimer's disease.
Figure 1. Species distribution models describe the relationship between environmental conditions and (actual or potential) species presence. However, the link between the environment and species distribution data can be complex, particularly since distributional data comes in many different forms. Above are four different sources of distribution data for the Von Der Decken's Hornbill [11]: (from left to right) raw point observations, regional checklists, gridded ecological surveys, and data-driven expert range maps. All images are from Map of Life [101].