Abstract-In this paper, we propose an acoustic scene classification method for a distributed microphone array based on combining the spatial information of multiple sound events. In the proposed method, each acoustic scene is characterized by a spatial information representation based on a bag-of-words model, which we call the bag of acoustic spatial words. To calculate the bag of acoustic spatial words, spatial features extracted from the multichannel observations are quantized and then aggregated over a sound clip; that is, each sound clip is regarded as a unit of a "document." Moreover, a supervised generative model relating acoustic scenes to bags of acoustic spatial words is also adopted, which enables robust acoustic scene classification. Experimental results using actual environmental sounds show that the proposed approach outperforms a conventional acoustic scene classification approach that does not combine the spatial information of multiple sound events.
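As a rough illustration of the quantize-and-aggregate step summarized above, the following is a minimal sketch of forming a bag-of-acoustic-spatial-words histogram per clip. It assumes frame-level spatial features have already been extracted from the multichannel observations and uses a k-means codebook as the quantizer; the function names, feature dimensions, and codebook size are hypothetical stand-ins, not the authors' implementation.

```python
# Sketch: quantize frame-level spatial features into "acoustic spatial words"
# and aggregate them over a clip (the "document") into a word-count histogram.
# All names and parameters here are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def learn_codebook(spatial_features, n_words=64, seed=0):
    """Learn a codebook of acoustic spatial words by k-means clustering.

    spatial_features: (n_frames, n_dims) array of frame-level spatial
    features pooled over the training clips.
    """
    return KMeans(n_clusters=n_words, random_state=seed, n_init=10).fit(spatial_features)

def bag_of_spatial_words(codebook, clip_features):
    """Aggregate one clip into a bag-of-acoustic-spatial-words vector."""
    words = codebook.predict(clip_features)  # quantize each frame to a word index
    return np.bincount(words, minlength=codebook.n_clusters)  # per-clip histogram

# Usage with random stand-in features (8-dimensional, purely illustrative):
rng = np.random.default_rng(0)
train_features = rng.normal(size=(1500, 8))   # pooled training frames
codebook = learn_codebook(train_features)
clip_features = rng.normal(size=(500, 8))     # one sound clip
print(bag_of_spatial_words(codebook, clip_features))
```

The resulting histogram would then serve as the per-clip observation for the supervised generative model mentioned in the abstract, in the same way word counts per document feed a topic model.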