Here we present an update of the studyforrest (http://studyforrest.org) dataset that complements the previously released functional magnetic resonance imaging (fMRI) data for natural language processing with a new two-hour 3 Tesla fMRI acquisition while 15 of the original participants were shown an audio-visual version of the stimulus motion picture. We demonstrate with two validation analyses that these new data support modeling specific properties of the complex natural stimulus, as well as a substantial within-subject BOLD response congruency in brain areas related to the processing of auditory inputs, speech, and narrative when compared to the existing fMRI data for audio-only stimulation. In addition, we provide participants' eye gaze location as recorded simultaneously with fMRI, and an additional sample of 15 control participants whose eye gaze trajectories for the entire movie were recorded in a lab setting—to enable studies on attentional processes and comparative investigations on the potential impact of the stimulation setting on these processes.
Here we present an update of the studyforrest (http://studyforrest.org) dataset that complements the previously released functional magnetic resonance imaging (fMRI) data for natural language processing with a new two-hour 3 Tesla fMRI acquisition while 15 of the original participants were shown an audio-visual version of the stimulus motion picture. We demonstrate with two validation analyses that these new data support modeling specific properties of the complex natural stimulus, as well as a substantial withinsubject BOLD response congruency in brain areas related to the processing of auditory inputs, speech, and narrative when compared to the existing fMRI data for audio-only stimulation. In addition, we provide participants' eye gaze location as recorded simultaneously with fMRI, and an additional sample of 15 control participants whose eye gaze trajectories for the entire movie were recorded in a lab setting -to enable studies on attentional processes and comparative investigations on the potential impact of the stimulation setting on these processes.
In contrast to ever increasing volumes of automatically generated data, human annotation capacities remain limited. Thus, fast active learning approaches that allow the efficient allocation of annotation efforts gain in importance. Furthermore, cost-sensitive applications such as fraud detection pose the additional challenge of differing misclassification costs between classes. Unfortunately, the few existing cost-sensitive active learning approaches rely on time-consuming steps, such as performing self-labelling or tedious evaluations over samples. We propose a fast, non-myopic, and cost-sensitive probabilistic active learning approach for binary classification. Our approach computes the expected reduction in misclassification loss in a labelling candidate's neighbourhood. We derive and use a closedform solution for this expectation, which considers the possible values of the true posterior of the positive class at the candidate's position, its possible label realisations, and the given labelling budget. The resulting myopic algorithm runs in the same linear asymptotic time as uncertainty sampling, while its non-myopic counterpart requires an additional factor of O(m · log m) in the budget size. The experimental evaluation on several synthetic and realworld data sets shows competitive or better classification performance and runtime, compared to several uncertainty sampling-and error-reduction-based active learning strategies, both in cost-sensitive and cost-insensitive settings.
Stream-based active learning (AL) strategies minimize the labeling effort by querying labels that improve the classifier’s performance the most. So far, these strategies neglect the fact that an oracle or expert requires time to provide a queried label. We show that existing AL methods deteriorate or even fail under the influence of such verification latency. The problem with these methods is that they estimate a label’s utility on the currently available labeled data. However, when this label would arrive, some of the current data may have gotten outdated and new labels have arrived. In this article, we propose to simulate the available data at the time when the label would arrive. Therefore, our method Forgetting and Simulating (FS) forgets outdated information and simulates the delayed labels to get more realistic utility estimates. We assume to know the label’s arrival date a priori and the classifier’s training data to be bounded by a sliding window. Our extensive experiments show that FS improves stream-based AL strategies in settings with both, constant and variable verification latency.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.