We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation. Building on recent work in noise-robust automatic speech recognition, we model events as a linear combination of dictionary atoms, and mixtures as a linear combination of overlapping events. The weights of activated atoms in an observation serve directly as evidence for the underlying event classes. The atoms in the dictionary span multiple frames and are created by extracting all possible fixed-length exemplars from the training data. To combat the scarcity of small training sets, we propose artificially augmenting the training data by linear time warping in the feature domain at multiple rates. The method is evaluated on the Office Live and Office Synthetic datasets released by the AASP Challenge on Detection and Classification of Acoustic Scenes and Events.
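The core idea above — estimating non-negative activation weights of dictionary atoms and summing them per class as event evidence — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy dictionary, the class labels, and the KL multiplicative-update solver are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf_activations(D, y, n_iter=300, eps=1e-12):
    """Solve y ~= D @ x with x >= 0 via multiplicative updates
    minimising the (generalised) KL divergence, D held fixed."""
    x = np.full(D.shape[1], 1.0 / D.shape[1])
    for _ in range(n_iter):
        approx = D @ x + eps
        x *= (D.T @ (y / approx)) / (D.T.sum(axis=1) + eps)
    return x

# Toy dictionary: 6 non-negative exemplar atoms (40-dim features),
# two atoms per hypothetical event class 0, 1, 2.
D = np.abs(rng.normal(size=(40, 6)))
atom_labels = np.array([0, 0, 1, 1, 2, 2])

# A mixture dominated by the two class-1 atoms, plus a little noise.
y = 0.8 * D[:, 2] + 0.6 * D[:, 3] + 0.05 * np.abs(rng.normal(size=40))

x = nmf_activations(D, y)
# Sum activation weights per class as evidence for each event class.
evidence = np.array([x[atom_labels == c].sum() for c in range(3)])
print(evidence.argmax())  # index of the class with the strongest summed evidence
```

In the paper's setting the atoms would be fixed-length, multi-frame exemplar windows cut from training spectrograms rather than random vectors, but the activation-as-evidence mechanism is the same.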
This work examines the use of a Wireless Acoustic Sensor Network (WASN) for the classification of clinically relevant activities of daily living (ADL) of elderly people. The aim of this research is to automatically compile a summary report about the performed ADLs which can be easily interpreted by caregivers. In this work, the classification performance of the WASN is evaluated in both clean and noisy conditions. Results indicate that the classification performance of the WASN is 75.3±4.3% on clean acoustic data selected from the node with the highest SNR. By incorporating spatial information extracted by the WASN, the classification accuracy further increases to 78.6±1.4%. In addition, the classification performance of the WASN in noisy conditions is on average 8.1% to 9.0% (absolute) higher than the best single-microphone results.
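The node-selection step mentioned above — classifying on data from the sensor node with the highest SNR — might be sketched as below. All names and the power-based SNR estimate are illustrative assumptions; the abstract does not specify how SNR is computed in the actual system.

```python
import numpy as np

def estimate_snr_db(signal, noise_power):
    """Crude SNR estimate in dB: signal power vs. an assumed noise-floor power."""
    p_sig = np.mean(signal ** 2)
    return 10.0 * np.log10(p_sig / noise_power)

def select_best_node(node_signals, noise_powers):
    """Return the index of the node whose recording has the highest SNR."""
    snrs = [estimate_snr_db(s, n) for s, n in zip(node_signals, noise_powers)]
    return int(np.argmax(snrs))

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 8000)
clean = np.sin(2 * np.pi * 440 * t)
# Three nodes at different distances: attenuated signal, same noise level.
nodes = [0.2 * clean + 0.05 * rng.normal(size=t.size),
         1.0 * clean + 0.05 * rng.normal(size=t.size),
         0.5 * clean + 0.05 * rng.normal(size=t.size)]
floors = [0.05 ** 2] * 3
print(select_best_node(nodes, floors))  # node 1 carries the strongest signal
```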
This paper gives an overview of research within the ALADIN project, which aims to develop an assistive vocal interface for people with a physical impairment. In contrast to existing approaches, the vocal interface is trained by the end-user himself, which means it can be used with any vocabulary and grammar, and that it is maximally adapted to the (possibly dysarthric) speech of the user. This paper describes the overall learning framework, the user-centred design and evaluation aspects, database collection and approaches taken to combat problems such as noise and erroneous input.