Visual multimedia have become an inseparable part of our digital social lives, and they often capture moments tied to deep affective states. Automated visual sentiment analysis tools can provide a means of extracting the rich feelings and latent dispositions embedded in these media. In this work, we explore how Convolutional Neural Networks (CNNs), now a de facto computational tool in computer vision, can be applied to the task of visual sentiment prediction. Through fine-tuning experiments on a state-of-the-art CNN and rigorous architecture analysis, we present several modifications that improve accuracy over prior art on a dataset of images from a popular social media platform. We additionally present visualizations of the local patterns that the network learned to associate with image sentiment, offering insight into how the model perceives visual positivity (or negativity).
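To make the fine-tuning procedure concrete, the following is a minimal sketch of adapting a pretrained CNN to binary sentiment prediction. The framework, backbone, and learning rates are illustrative assumptions, not the paper's actual configuration:

```python
# Hypothetical sketch: fine-tuning a pretrained CNN for binary
# sentiment prediction (positive vs. negative). The ResNet backbone
# and PyTorch workflow are illustrative choices only.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the classifier head

# Common practice: smaller learning rate for pretrained layers,
# larger rate for the freshly initialized head.
optimizer = torch.optim.SGD([
    {"params": [p for n, p in model.named_parameters()
                if not n.startswith("fc")], "lr": 1e-4},
    {"params": model.fc.parameters(), "lr": 1e-3},
], momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```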
Deep learning algorithms base their success on building high-capacity models with millions of parameters that are tuned in a data-driven fashion. Because these models are trained by processing millions of examples, the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. In this work, we explore how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster. The effect of distributing the training process is addressed from two points of view. First, the scalability of the task and its performance in the distributed setting are analyzed. Second, the impact of distributed training methods on the final accuracy of the models is studied.
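As a rough illustration of the data-parallel setup such work relies on, here is a minimal sketch using PyTorch's DistributedDataParallel; the paper's actual framework and parallelization scheme may well differ:

```python
# Illustrative data-parallel training loop: one process per GPU,
# gradients all-reduced across workers so each optimizer step is
# equivalent to a single larger-batch step. Model and data are dummies.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")  # rank/world size come from the launcher
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(nn.Linear(512, 10).cuda(), device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for _ in range(100):
        x = torch.randn(32, 512, device="cuda")
        y = torch.randint(0, 10, (32,), device="cuda")
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()  # gradient all-reduce happens during backward
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, e.g., `torchrun --nproc_per_node=4 train.py`, this scales the effective batch size with the number of GPUs, which is precisely the interaction between throughput and final accuracy that the paper studies.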
Characterizing the genetic structure of large cohorts has become increasingly important as genetic studies extend to massive, increasingly diverse biobanks. Popular methods decompose individual genomes into fractional cluster assignments, with each cluster represented by a vector of DNA variant frequencies. However, with rapidly increasing biobank sizes, these methods have become computationally intractable. Here we present Neural ADMIXTURE, a neural network autoencoder that follows the same modeling assumptions as the current standard algorithm, ADMIXTURE, while reducing the compute time by orders of magnitude, surpassing even the fastest alternatives. One month of continuous compute with ADMIXTURE can be reduced to just hours with Neural ADMIXTURE. A multi-head approach offers further acceleration by computing multiple cluster numbers in a single run. Furthermore, the trained models can be stored, allowing cluster assignment to be performed on new data in linear time without needing to share the training samples.
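A conceptual sketch of the underlying idea follows: an encoder outputs fractional cluster assignments Q (a softmax over K clusters), and a linear decoder holds per-cluster variant frequencies P, so the genotype is reconstructed as Q·P. Layer sizes, activations, and constraints here are illustrative assumptions, not the published implementation:

```python
# Conceptual ADMIXTURE-style autoencoder. The softmax encoder output
# plays the role of ancestry fractions Q; the decoder weights play the
# role of cluster variant frequencies P. Details are illustrative only.
import torch
import torch.nn as nn

class AdmixtureAE(nn.Module):
    def __init__(self, n_variants: int, k: int, hidden: int = 512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_variants, hidden), nn.GELU(),
            nn.Linear(hidden, k),
        )
        self.P = nn.Parameter(torch.rand(k, n_variants))  # frequencies P

    def forward(self, x):
        q = torch.softmax(self.encoder(x), dim=-1)  # fractional assignments
        x_hat = q @ torch.sigmoid(self.P)           # keep frequencies in [0, 1]
        return x_hat, q
```

Because inference is a single encoder pass, assigning new samples to clusters is linear in the number of samples, which is what allows stored models to be applied to new data without the training set.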
This paper explores the potential for using Brain-Computer Interfaces (BCI) as a relevance feedback mechanism in content-based image retrieval. Several experiments are performed using a rapid serial visual presentation (RSVP) of images at different rates (5 Hz and 10 Hz) on 8 users with different degrees of familiarization with BCI and the dataset. We compare the feedback from the BCI and mouse-based interfaces on a subset of TRECVid images, finding that, when users have limited time to annotate the images, both interfaces are comparable in performance. Comparing our best users in a retrieval task, we found that EEG-based relevance feedback can outperform mouse-based feedback.

MOTIVATION
The exponential growth of visual content and its huge diversity have motivated considerable research on how documents can be retrieved according to user intentions when formulating a query. Advances in image processing and computer vision have provided tools for a perceptual and semantic interpretation of both the query and the indexed content. This has allowed the development of retrieval systems capable of processing queries by example and by concept. The role of the human user during visual retrieval is critical, and their judgment about the correctness of the retrieved results can greatly speed up the search process. This kind of relevance feedback has been shown to significantly improve retrieval performance in image [10] and video [1] retrieval. Manually annotating images with a mouse, especially in a visual retrieval context, can be tedious and mentally exhausting. In such a scenario, EEG-based brain-computer interfaces offer a potential solution as a mechanism to quickly annotate images.

RELATED WORK
EEG signals have been used for object detection in [2], where the authors aim to detect airplanes in a dataset of satellite images of the city of London. The work in [3] expands the catalog of objects in very simple images, where the object on a black background occupies the whole image. EEG signals have also been used for image retrieval in [9], where the authors used EEG relevance annotations to retrieve specific concepts in a complex dataset of keyframes from TRECVid 2005. However, while that work aimed at detecting concepts depicted by the whole image, we focus on the more challenging task of detecting a local object in a complex scene. Another similar work [8] addresses the use of EEG for image retrieval by formu...
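As a rough sketch of how such EEG relevance feedback can be scored, a common baseline is to classify epochs time-locked to each RSVP image as target/non-target (a P300-like response) and rerank the retrieval results by classifier score. The data here are synthetic and the LDA classifier is an assumed baseline, not necessarily the paper's pipeline:

```python
# Hypothetical EEG relevance-feedback scoring: flatten per-image epochs
# and rank images by an LDA decision score. Synthetic data for illustration.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_epochs, n_channels, n_samples = 200, 8, 128  # ~0.5 s epochs at 256 Hz
X = rng.normal(size=(n_epochs, n_channels, n_samples))
y = rng.integers(0, 2, n_epochs)               # 1 = relevant image

clf = LinearDiscriminantAnalysis()
clf.fit(X.reshape(n_epochs, -1), y)
relevance = clf.decision_function(X.reshape(n_epochs, -1))
ranking = np.argsort(-relevance)               # images reranked by EEG score
```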
This paper extends our previous work on the potential of EEG-based brain-computer interfaces to segment salient objects in images. The proposed system analyzes the Event-Related Potentials (ERP) generated by the rapid serial visual presentation of windows over the image. Detection of the P300 signal allows estimating a saliency map of the image, which is used to seed a semi-supervised object segmentation algorithm. Thanks to the new contributions presented in this work, the average Jaccard index improved from 0.47 to 0.66 on our publicly available dataset of images, object masks, and captured EEG signals. This work also studies alternative architectures to the original one, the impact of object occupation in each image window, and a more robust evaluation based on statistical analysis and a weighted F-score.
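The step from per-window P300 scores to a pixel-level saliency map can be sketched as follows; the window grid and scores are synthetic stand-ins for the output of the EEG classifier described in the paper, and the thresholding rule is an illustrative assumption:

```python
# Conceptual sketch: upsample one P300 detection score per image window
# into a pixel-level saliency map, then threshold it to obtain seeds
# for a semi-supervised segmentation algorithm.
import numpy as np

h, w, win = 256, 256, 64
scores = np.random.rand(h // win, w // win)      # one score per window
saliency = np.kron(scores, np.ones((win, win)))  # broadcast to pixel grid
fg_seeds = saliency > np.quantile(saliency, 0.9) # most salient pixels
bg_seeds = saliency < np.quantile(saliency, 0.1) # least salient pixels
```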