Abstract: The analysis of scenarios in which a number of microphones record the activity of speakers, such as in a round-table meeting, presents a number of computational challenges. For example, if each participant wears a microphone, speech from both the microphone's wearer (local speech) and from other participants (crosstalk) is received. The recorded audio can be broadly classified into four classes: local speech, crosstalk plus local speech, crosstalk alone and silence. We describe two experiments related to the automatic classification of audio into these four classes. The first experiment attempted to optimize a set of acoustic features for use with a Gaussian mixture model (GMM) classifier. A large set of potential acoustic features was considered, some of which have been employed in previous studies. The best-performing features were found to be kurtosis, "fundamentalness," and cross-correlation metrics. The second experiment used these features to train an ergodic hidden Markov model classifier. Tests performed on a large corpus of recorded meetings show classification accuracies of up to 96%, and automatic speech recognition performance close to that obtained using ground truth segmentation.
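Two of the feature families named in this abstract, kurtosis and cross-correlation, can be illustrated with a minimal sketch. The abstract does not give the exact definitions used, so everything below is an assumption for illustration: the function names `frame_kurtosis` and `max_normalized_xcorr` are hypothetical, kurtosis is taken as the plain fourth standardized moment of a frame, and the cross-correlation metric is taken as the peak of the normalized cross-correlation between two channels.

```python
import numpy as np


def frame_kurtosis(frame):
    """Fourth standardized moment (non-excess kurtosis) of one audio frame.

    Assumption: the paper's exact kurtosis definition is not stated in the
    abstract; this is the textbook form.
    """
    x = np.asarray(frame, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    if variance == 0.0:
        return 0.0  # convention chosen here for constant or silent frames
    return float(np.mean(centered ** 4) / variance ** 2)


def max_normalized_xcorr(a, b):
    """Peak of the normalized cross-correlation of two equal-length frames.

    Assumption: one possible cross-correlation metric between two
    microphone channels, not necessarily the one used in the paper.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if denom == 0.0:
        return 0.0
    return float(np.max(np.correlate(a, b, mode="full")) / denom)
```

Per-frame values like these could then be stacked into feature vectors for a classifier such as the GMM mentioned above; how the original study framed and combined them is not specified here.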
The human auditory system is able to separate acoustic mixtures in order to create a perceptual description of each sound source. It has been proposed that this is achieved by an auditory scene analysis (ASA) in which a mixture of sounds is parsed to give a number of perceptual streams, each of which describes a single sound source. It is widely assumed that ASA is a precursor of attentional mechanisms, which select a stream for attentional focus. However, recent studies suggest that attention plays a key role in the formation of auditory streams. Motivated by these findings, this paper presents a conceptual framework for auditory selective attention in which the formation of groups and streams is heavily influenced by conscious and subconscious attention. This framework is implemented as a computational model comprising a network of neural oscillators, which perform stream segregation on the basis of oscillatory correlation. Within the network, attentional interest is modeled as a Gaussian distribution in frequency. This determines the connection weights between oscillators and the attentional process, which is modeled as an attentional leaky integrator (ALI). Acoustic features are held to be the subject of attention if their oscillatory activity coincides temporally with a peak in the ALI activity. The output of the model is an "attentional stream," which encodes the frequency bands in the attentional focus at each epoch. The model successfully simulates a range of psychophysical phenomena.
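The two attentional components described in this abstract, a Gaussian distribution of attentional interest over frequency and a leaky integrator, can be sketched in a few lines. This is not the paper's oscillator network: the function names, the parameters `focus_hz`, `width_hz` and `leak`, and the first-order discrete update rule are all assumptions made for illustration.

```python
import numpy as np


def gaussian_attention_weights(channel_freqs_hz, focus_hz, width_hz):
    """Weight per frequency channel from a Gaussian centred on focus_hz.

    Assumption: an unnormalized Gaussian with peak value 1 at the focus;
    the paper's parameterization is not given in the abstract.
    """
    f = np.asarray(channel_freqs_hz, dtype=float)
    return np.exp(-0.5 * ((f - focus_hz) / width_hz) ** 2)


def leaky_integrator(inputs, leak=0.1):
    """Discrete leaky integrator: y[t] = (1 - leak) * y[t-1] + leak * x[t].

    Assumption: a simple first-order form standing in for the attentional
    leaky integrator (ALI); the model's actual dynamics may differ.
    """
    y = 0.0
    out = []
    for x in inputs:
        y = (1.0 - leak) * y + leak * x
        out.append(y)
    return np.array(out)


# Example: weight three channels around a 1 kHz focus, then integrate
# a short sequence of weighted activity from the channel at the focus.
weights = gaussian_attention_weights([500.0, 1000.0, 2000.0], 1000.0, 300.0)
activity = leaky_integrator(weights[1] * np.array([1.0, 1.0, 0.0, 0.0]), leak=0.5)
```

In this sketch, channels near the focus contribute most to the integrator, and the integrator's output rises while input persists and decays once it stops.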
Usability and user satisfaction are of paramount importance when designing interactive software solutions. Furthermore, the optimal design can depend not only on the task but also on the type of user. Evaluations can shed light on these issues; however, very few studies have focused on assessing the usability of semantic search systems. As semantic search becomes mainstream, there is a growing need for standardised, comprehensive evaluation frameworks. In this study, we assess the usability and user satisfaction of different semantic search query input approaches (natural language and view-based) from the perspective of different user types (experts and casuals). Contrary to previous studies, we found that casual users preferred the form-based query approach, whereas expert users found the graph-based approach to be the most intuitive. Additionally, the controlled-language model offered the most support for casual users but was perceived as restrictive by experts, limiting their ability to express their information needs.
Abstract: The impact of crowdsourcing and citizen science activities on academia, businesses, governance and society has been enormous. This is more prevalent today, with citizens and communities collaborating with organizations, businesses and authorities to contribute in a variety of ways, ranging from mere data providers to key stakeholders in various decision-making processes. The "Crowdsourcing for observations from Satellites" project is a recently concluded study supported by demonstration projects funded by the European Space Agency (ESA). The objective of the project was to investigate the different facets of how crowdsourcing and citizen science impact upon the validation, use and enhancement of Observations from Satellites (OS) products and services. This paper presents our findings from a stakeholder analysis activity involving participants who are experts in crowdsourcing and citizen science for Earth Observations. The activity identified three critical areas that need attention from the community and provides suggestions that could help address some of the challenges identified.