In the past decade, a large number of computational models of visual saliency have been proposed. Recently a number of comprehensive benchmark studies have been presented, with the goal of assessing the performance landscape of saliency models under varying conditions. This has been accomplished by considering fixation data, annotated image regions, and stimulus patterns inspired by psychophysics. In this paper, we present a high-level examination of challenges in computational modeling of visual saliency, with a heavy emphasis on human vision and neural computation. This includes careful assessment of different metrics for performance of visual saliency models, and identification of remaining difficulties in assessing model performance. We also consider the importance of a number of issues relevant to all saliency models including scale-space, the impact of border effects, and spatial or central bias. Additionally, we consider the biological plausibility of models in stepping away from exemplar input patterns towards a set of more general theoretical principles consistent with behavioral experiments. As a whole, this presentation establishes important obstacles that remain in visual saliency modeling, in addition to identifying a number of important avenues for further investigation.
Recent advances in the field of saliency have concentrated on fixation prediction, with benchmarks reaching saturation. However, there is an extensive body of works in psychology and neuroscience that describe aspects of human visual attention that might not be adequately captured by current approaches. Here, we investigate singleton detection, which can be thought of as a canonical example of salience. We introduce two novel datasets, one with psychophysical patterns and one with natural odd-one-out stimuli. Using these datasets we demonstrate through extensive experimentation that nearly all saliency algorithms do not adequately respond to singleton targets in synthetic and natural images. Furthermore, we investigate the effect of training state-of-the-art CNN-based saliency models on these types of stimuli and conclude that the additional training data does not lead to a significant improvement of their ability to find odd-one-out targets.
A computational explanation of how visual attention, interpretation of visual stimuli, and eye movements combine to produce visual behavior, seems elusive. Here, we focus on one component: how selection is accomplished for the next fixation. The popularity of saliency map models drives the inference that this is solved, but we argue otherwise. We provide arguments that a cluster of complementary, conspicuity representations drive selection, modulated by task goals and history, leading to a hybrid process that encompasses early and late attentional selection. This design is also constrained by the architectural characteristics of the visual processing pathways. These elements combine into a new strategy for computing fixation targets and a first simulation of its performance is presented. A sample video of this performance can be found by clicking on the "Supplementary Files" link under the "Article Tools" heading.
We present a probabilistic framework to generate character animations based on weak control signals, such that the synthesized motions are realistic while retaining the stochastic nature of human movement. The proposed architecture, which is designed as a hierarchical recurrent model, maps each sub‐sequence of motions into a stochastic latent code using a variational autoencoder extended over the temporal domain. We also propose an objective function which respects the impact of each joint on the pose and compares the joint angles based on angular distance. We use two novel quantitative protocols and human qualitative assessment to demonstrate the ability of our model to generate convincing and diverse periodic and non‐periodic motion sequences without the need for strong control signals.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.