Abstract. We have created a large, diverse set of cars from overhead images, which is useful for training a deep learner to binary classify, detect, and count them. The dataset and all related material will be made publicly available. The set contains contextual matter to aid in the identification of difficult targets. We demonstrate classification and detection on this dataset using a neural network we call ResCeption. This network combines residual learning with Inception-style layers and is used to count cars in one look. This is a new way to count objects, rather than by localization or density estimation. It is fairly accurate, fast, and easy to implement. Additionally, the counting method is not car- or scene-specific. It would be easy to train this method to count other kinds of objects, and counting over new scenes requires no extra setup or assumptions about object locations.
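The "one look" counting idea can be sketched as treating counting as classification over discrete count bins, with the network emitting one score per possible count for the whole image. The readout below is a minimal illustration of that final step, assuming hypothetical logits over count bins; it is not the paper's ResCeption implementation.

```python
import numpy as np

def predict_count(logits):
    """One-look counting readout: treat counting as classification over
    count bins 0..N-1 and return the most likely bin. The logits are
    assumed to come from a network that sees the whole image at once,
    with no per-object localization step."""
    return int(np.argmax(logits))

def expected_count(logits):
    """Alternative readout: expectation under the softmax distribution,
    which gives a fractional count estimate instead of a hard bin."""
    p = np.exp(logits - logits.max())  # stable softmax
    p /= p.sum()
    return float(np.dot(p, np.arange(len(p))))

# Hypothetical logits favoring the bin "3 cars"
logits = np.array([0.1, 0.5, 1.0, 4.0, 0.2])
print(predict_count(logits))  # → 3
```

The expectation readout is useful when the count distribution is broad, since it blends adjacent bins rather than committing to one.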
We develop a set of methods to improve on the results of self-supervised learning using context. We start with a baseline of patch-based arrangement context learning and build from there. Our methods address overt problems such as chromatic aberration as well as other potential problems such as spatial skew and mid-level feature neglect. We avoid overfitting to the common self-supervised benchmark tests by using different datasets during development. Combined, our methods yield top scores on all standard self-supervised benchmarks, including classification and detection on PASCAL VOC 2007, segmentation on PASCAL VOC 2012, and "linear tests" on the ImageNet and CSAIL Places datasets. We obtain an improvement over our baseline method of between 4.0 and 7.1 percentage points on transfer-learning classification tests. We also show results on different standard network architectures to demonstrate generalization as well as portability. All data, models, and programs are available at: https://gdo-datasci.llnl.gov/selfsupervised/.
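The patch-arrangement baseline can be sketched as sampling a center patch and one of its eight neighbors, with a gap and random jitter between them so the network cannot solve the task from trivial low-level cues (such as chromatic aberration) instead of semantic context. The sampler below is an illustrative sketch; the patch size, gap, and jitter values are assumptions, not the paper's exact settings.

```python
import numpy as np

def sample_context_patch_pair(image, patch=16, gap=4, jitter=2, rng=None):
    """Sketch of a patch-arrangement pretext task: return a center patch,
    a randomly chosen neighbor patch, and the neighbor's position label
    (0..7), which serves as the self-supervised training target. The gap
    and jitter separate the patches so trivial boundary cues are less
    useful (hypothetical parameters, not the paper's exact recipe)."""
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[:2]
    stride = patch + gap
    lo = stride + jitter                       # keep neighbor inside the image
    cy = int(rng.integers(lo, h - stride - jitter - patch + 1))
    cx = int(rng.integers(lo, w - stride - jitter - patch + 1))
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    label = int(rng.integers(8))               # which of the 8 neighbors
    dy, dx = offsets[label]
    jy, jx = rng.integers(-jitter, jitter + 1, size=2)
    ny, nx = cy + dy * stride + int(jy), cx + dx * stride + int(jx)
    center = image[cy:cy + patch, cx:cx + patch]
    neighbor = image[ny:ny + patch, nx:nx + patch]
    return center, neighbor, label
```

A network trained to predict `label` from the two patches must learn the spatial layout of image content, which is the contextual signal the transfer-learning results build on.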
We explore the application of computer vision and machine learning (ML) techniques to predict material properties (e.g., compressive strength) from SEM images of material microstructure. We show that it is possible to train ML models to predict materials performance from SEM images alone, demonstrating this capability on the real-world problem of predicting the uniaxially compressed peak stress of consolidated molecular solid (i.e., TATB) samples. Our image-based ML approach reduces root mean square error (RMSE) by an average of 51% over a non-image-based baseline. We compare two complementary approaches to this problem: (1) a traditional ML approach, random forest (RF), using state-of-the-art computer vision features, and (2) an end-to-end deep learning (DL) approach, where features are learned automatically from raw images. We demonstrate the complementarity of these approaches, showing that RF performs best in the "small data" regime in which many real-world scientific applications reside (up to 24% lower RMSE than DL), whereas DL outpaces RF in the "big data" regime, where abundant training samples are available (up to 24% lower RMSE than RF).
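The reported improvements are stated as percent reductions in RMSE, which can be computed as below. The prediction values are hypothetical placeholders, not the paper's data; only the metric definitions follow the abstract.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, the comparison metric used above."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pct_reduction(baseline_rmse, model_rmse):
    """Percent RMSE reduction of a model relative to a baseline,
    the form in which the 51% and 24% figures are reported."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse

# Hypothetical peak-stress values and predictions (illustrative only)
y_true = [10.0, 12.0, 9.0, 11.0]
baseline_pred = [13.0, 9.0, 12.0, 8.0]    # non-image baseline
image_pred = [10.5, 11.5, 9.5, 10.5]      # image-based model
print(round(pct_reduction(rmse(y_true, baseline_pred),
                          rmse(y_true, image_pred)), 1))  # → 83.3
```

A lower RMSE always yields a positive reduction; a model worse than the baseline would produce a negative value.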
We propose a computational model of contour integration for visual saliency. The model uses biologically plausible devices to simulate how the representations of elements aligned collinearly along a contour in an image are enhanced. It includes devices such as dopamine-like fast plasticity, local GABAergic inhibition, and multi-scale processing of images. The fast plasticity addresses the problem of how neurons in visual cortex seem able to influence neurons they are not directly connected to, as observed, for instance, in the contour-closure effect. Local GABAergic inhibition is used to control gain in the system without global mechanisms, which may be implausible given the limited reach of axonal arbors in visual cortex. The model is then used not only to test its validity on real and artificial images, but also to discover some of the mechanisms involved in processing complex visual features such as junctions and end-stops as well as contours. We present evidence for the validity of our model in several phases, starting with local enhancement of only a few collinear elements. We then test our model on more complex contour integration images with a large number of Gabor elements. Sections of the model are also extracted and used to discover how the model might relate contour integration neurons to neurons that process end-stops and junctions. Finally, we present results on real-world images. The results suggest that the model is a good current approximation of contour integration in human vision. They also suggest that contour integration mechanisms may be strongly related to mechanisms for detecting end-stops and junction points, and that a contour integration mechanism may be involved in finding features for objects such as faces. This implies that visual cortex may be more information-efficient and that neural regions may have multiple roles.
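The core enhancement of collinear elements can be illustrated with an association-field style lateral weight: two oriented elements facilitate each other most when both orientations align with the line joining them, with facilitation falling off with distance. The Gaussian and cosine forms below are illustrative assumptions for exposition, not the paper's exact kernel.

```python
import math

def collinear_facilitation(theta_i, theta_j, phi, d, sigma_d=2.0):
    """Sketch of an association-field lateral weight between two
    oriented elements. theta_i, theta_j are element orientations
    (radians), phi is the direction of the line joining them, d is
    their separation. The factor 2 makes alignment 180-degree
    periodic, as orientation is. sigma_d is a hypothetical
    distance-falloff constant."""
    align_i = max(0.0, math.cos(2 * (theta_i - phi)))
    align_j = max(0.0, math.cos(2 * (theta_j - phi)))
    falloff = math.exp(-(d ** 2) / (2 * sigma_d ** 2))
    return align_i * align_j * falloff

# A perfectly collinear pair is strongly facilitated...
w_collinear = collinear_facilitation(0.0, 0.0, 0.0, 1.0)
# ...while an orthogonal flanker contributes nothing.
w_orthogonal = collinear_facilitation(math.pi / 2, 0.0, 0.0, 1.0)
```

In the model, weights of this kind are what the fast plasticity and local inhibition then modulate to produce contour-level enhancement.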