This paper presents a learning-based visual saliency model for detecting diagnostic regions of interest (RoIs) for diabetic macular edema (DME) in retinal images. The method models the cognitive process of visual selection of relevant regions that arises during an ophthalmologist's examination of an image. To record this process, we collected eye-tracking data from 10 ophthalmologists on 100 images and used this database for training and testing. From the analysis, two properties (a Feature Property and a Position Property) are derived and combined by a simple intersection operation to obtain a saliency map. The Feature Property is implemented with a support vector machine (SVM) using the diagnosis as supervision; the Position Property is implemented by statistical analysis of the training samples. The technique learns the preferences of ophthalmologists' visual behavior while simultaneously accounting for feature uniqueness. The method was evaluated with three popular saliency-model evaluation scores (AUC, EMD, and SS) and three quality measures (classical sensitivity, specificity, and Youden's J statistic). The proposed method outperforms 8 state-of-the-art saliency models and 3 salient-region detection approaches devised for natural images. Furthermore, it successfully detects DME RoIs in retinal images without sophisticated image processing such as region segmentation.
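The abstract does not specify which intersection operation combines the two property maps; a minimal sketch, assuming both maps are normalised to [0, 1] and interpreting "intersection" as an element-wise minimum (a common fuzzy-set choice; the paper may use a different operator), could look like:

```python
import numpy as np

def combine_saliency(feature_map, position_map):
    """Combine the two property maps by a simple intersection.

    feature_map  - per-pixel scores from a feature classifier (e.g. an SVM)
    position_map - per-pixel prior from fixation statistics
    Both are assumed normalised to [0, 1]; the element-wise minimum
    acts as a fuzzy-set intersection (an assumption, not the paper's
    confirmed operator).
    """
    return np.minimum(feature_map, position_map)

# Toy 2x2 example: a strong feature response gated by a position prior.
feature = np.array([[0.9, 0.2], [0.8, 0.1]])
position = np.array([[0.5, 0.5], [1.0, 0.0]])
saliency = combine_saliency(feature, position)
```

A pixel is salient only when both properties agree, which matches the stated intent of gating feature uniqueness by the learned position preference.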
Humans can easily classify different kinds of objects, whereas this remains difficult for computers. As an active and challenging problem, object classification has received extensive interest and has broad prospects. Inspired by neuroscience, the concept of deep learning was proposed, and convolutional neural networks (CNNs), one family of deep learning methods, can be used to solve classification problems. However, most deep learning methods, including CNNs, ignore the human visual information-processing mechanism at work when a person classifies objects. Therefore, in this paper, inspired by the complete process through which humans classify different kinds of objects, we present a new classification method that combines a visual attention model with a CNN. First, we use the visual attention model to simulate the human visual selection mechanism. Second, we use a CNN to simulate how humans select features and to extract local features from the selected areas. Finally, our method classifies objects not only from these local features but also from added human semantic features. The method has a clear biological motivation. Experimental results demonstrate that it significantly improves classification performance.
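The abstract describes fusing CNN local features with human semantic features before classification but does not give the fusion mechanism. A minimal sketch, assuming simple concatenation followed by a linear classifier (all shapes, class counts, and weights below are hypothetical placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: the attention model selects regions, the CNN yields a
# pooled local-feature vector for them; semantic features (e.g. attribute
# annotations) come from a separate, human-defined source.
local_features = rng.standard_normal(512)    # hypothetical CNN activations
semantic_features = rng.standard_normal(16)  # hypothetical semantic vector

# Fuse by concatenation, then score with a linear classifier (W, b are
# stand-ins for trained parameters; 10 classes is an arbitrary choice).
fused = np.concatenate([local_features, semantic_features])
W = rng.standard_normal((10, fused.size))
b = np.zeros(10)
scores = W @ fused + b
predicted_class = int(np.argmax(scores))
```

Concatenation keeps the two feature sources independent up to the final classifier, which mirrors the abstract's claim that the decision depends on both local and semantic information.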
For many applications in graphics, design, and human-computer interaction, it is essential to understand where humans look in a scene while performing a particular task. Models of saliency can be used to predict fixation locations, but a large body of previous saliency models focuses on the free-viewing task. These models are based on bottom-up computation that does not consider task-oriented image semantics and often does not match actual eye movements. To address this problem, we collected eye-tracking data from 11 subjects performing particular search tasks on 1,307 images, together with annotations of 2,511 segmented objects with fine contours and 8 semantic attributes. Using this database for training and testing, we learn a model of saliency based on bottom-up image features and a target-position feature. Experimental results demonstrate the importance of target information in predicting task-oriented visual attention.
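The abstract does not state how the bottom-up features and the target-position feature are combined in the learned model. One common formulation, sketched here as an assumption, is a per-pixel linear combination of feature maps passed through a sigmoid (the channel names, weights, and maps below are toy placeholders):

```python
import numpy as np

def predict_fixation_prob(bottom_up, target_pos, w_bu, w_pos, bias):
    """Per-pixel fixation probability from a learned linear combination of
    bottom-up feature maps and a target-position map (sigmoid output).
    The weights would normally be fit on the eye-tracking data."""
    logit = sum(w * f for w, f in zip(w_bu, bottom_up))
    logit = logit + w_pos * target_pos + bias
    return 1.0 / (1.0 + np.exp(-logit))

# Toy 2x2 example with two hypothetical bottom-up channels.
contrast = np.array([[0.1, 0.9], [0.4, 0.2]])
edges = np.array([[0.0, 0.8], [0.5, 0.1]])
target = np.array([[0.0, 1.0], [0.0, 0.0]])  # assumed target-position prior
prob = predict_fixation_prob([contrast, edges], target,
                             w_bu=[1.5, 1.0], w_pos=3.0, bias=-2.0)
```

Giving the target-position map its own learned weight lets the model express exactly the finding the abstract reports: task-oriented fixations concentrate near the search target even when bottom-up saliency elsewhere is comparable.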
With the emergence of powerful, low-energy Internet of Things devices, deep learning computation is increasingly applied to resource-constrained edge devices. However, the mismatch between hardware with low computing capacity and the increasing complexity of deep neural network models, together with growing real-time requirements, creates challenges for the design and deployment of deep learning models. For example, autonomous driving relies on real-time object detection of the environment and cannot tolerate the extra latency of sending data to the cloud, processing it there, and returning the results to edge devices. Many studies aim to find innovative ways to reduce the size of deep learning models, the number of floating-point operations (FLOPs), and the time overhead of inference. Neural Architecture Search (NAS) makes it possible to generate efficient neural network models automatically. We summarise existing NAS methods for resource-constrained devices and categorise them as single-objective or multi-objective optimisation. We review the search space, the search algorithm, and the hardware constraints of NAS, and we explore the challenges and open problems of hardware-aware NAS.
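To make the survey's framing concrete, the interaction of a search space, a search algorithm, and a hardware constraint can be sketched as constrained random search. Everything here is illustrative: the search space, the FLOPs budget, the cost model, and the accuracy proxy are all invented stand-ins, not drawn from any surveyed method.

```python
import random

random.seed(0)

# Hypothetical search space: each candidate picks a depth, width and kernel.
SEARCH_SPACE = {"depth": [8, 12, 16], "width": [32, 64, 128], "kernel": [3, 5]}
FLOPS_BUDGET = 300  # hypothetical hardware budget, in MFLOPs

def estimate_mflops(arch):
    # Crude proxy cost model: cost grows with depth, width and kernel area.
    return arch["depth"] * arch["width"] * arch["kernel"] ** 2 / 100

def proxy_accuracy(arch):
    # Stand-in for train-and-evaluate accuracy (a toy heuristic only).
    return 0.5 + 0.001 * arch["depth"] * arch["kernel"] + 0.0005 * arch["width"]

def constrained_random_search(n_trials=50):
    """Sample architectures, reject those over the hardware budget,
    and keep the best-scoring feasible candidate."""
    best, best_acc = None, -1.0
    for _ in range(n_trials):
        arch = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
        if estimate_mflops(arch) > FLOPS_BUDGET:  # hard hardware constraint
            continue
        acc = proxy_accuracy(arch)
        if acc > best_acc:
            best, best_acc = arch, acc
    return best

best_arch = constrained_random_search()
```

Treating the budget as a hard rejection rule corresponds to the single-objective setting the survey describes; a multi-objective method would instead trade accuracy against cost, e.g. by keeping a Pareto front of candidates.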