Auditory scene analysis based on time‐frequency integration of shared FM and AM (II): Optimum time‐domain integration and stream sound reconstruction

Abe, Mototsugu; Ando, Shigeru

doi:10.1002/scj.1160

Cited by 5 publications

(5 citation statements)

References 12 publications

“…For this reason, in the proposed method, the filter is a narrow band, and concretely, 1/8 oct is adopted. Short time exception of this condition such as crossing or neighboring of the harmonic components of different sounds is made harmless in the stage of time-axis integration to be reported in the succeeding paper [21].…”

Section: Time-frequency Decomposition Of Mixed Harmonic Sound (mentioning)

confidence: 88%

“…In the image processing, the Hough transform [25] used for the detection of lines also has the same structure essentially. In this paper, we will deal only with the integration of the frequency axis which will not be affected by the time variation of the stream, and the time-axis integration will be dealt with in the succeeding paper [21].…”

Section: Integration Of Shared Attributes By Voting Methods (mentioning)

confidence: 99%

Auditory scene analysis based on time‐frequency integration of shared FM and AM (I): Lagrange differential features and frequency‐axis integration

Abe, Ando

2002

Systems & Computers in Japan

Self Cite

SUMMARY: We will propose in this paper a new algorithm for a computational implementation of auditory scene analysis. This algorithm forms a three-layer structure of (1) subband decomposition by wavelet transform, (2) characterization of subband signal fragments by instantaneous frequency, frequency change rate, and amplitude change rate, and (3) frequency integration of subband signal features by voting method. We will perform the grouping and integration by voting the subband signal fragments into a nonparametric multipeak probability density distribution expressing "possibility of streams"; and then the recognition of the streams and the extraction of the stream parameters are realized by tracing its greatest point. It is confirmed from basic experiments for synthesized sounds and voices that the fundamental frequency/frequency change rate/amplitude change rate can be separated and estimated from multiple streams.
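The "voting" step in layer (3) can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified, hypothetical example in which each subband component votes only with its instantaneous frequency (the frequency and amplitude change rates are ignored), and every name and parameter (`vote_f0`, `max_harmonic`, `sigma_hz`, the 50 to 500 Hz grid) is an assumption made for illustration.

```python
import math


def vote_f0(component_freqs, max_harmonic=4, f0_grid=None, sigma_hz=2.0):
    """Accumulate Gaussian-smoothed votes for candidate fundamentals.

    Each observed component frequency f is assumed to be the k-th harmonic
    of some stream, so it votes for the candidate fundamental f / k for
    k = 1..max_harmonic. The votes form a multipeak distribution over the
    candidate grid; its highest peak is taken as the stream's F0.
    (Illustrative simplification, not the method of the cited paper.)
    """
    if f0_grid is None:
        # Hypothetical search range: 50..500 Hz in 1 Hz steps.
        f0_grid = [float(f) for f in range(50, 501)]
    votes = [0.0] * len(f0_grid)
    for f in component_freqs:
        for k in range(1, max_harmonic + 1):
            candidate = f / k
            for i, g in enumerate(f0_grid):
                votes[i] += math.exp(-0.5 * ((candidate - g) / sigma_hz) ** 2)
    best = max(range(len(f0_grid)), key=votes.__getitem__)
    return f0_grid[best], votes


# Four harmonics of a 200 Hz tone, as might come out of a narrow-band filterbank.
components = [200.0, 400.0, 600.0, 800.0]
f0, _ = vote_f0(components)
print(f0)
```

Limiting `max_harmonic` to the number of observed harmonics keeps subharmonic candidates such as 100 Hz from collecting as many votes as the true fundamental; a fuller implementation would need a more principled way to resolve that ambiguity.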

“…Goto presented a method to track F0 of objective single sound from polyphonic musical signals without restriction of the number of simultaneous sounds [2]. Some other multipitch analyzers such as graphical model-based [3], filterbank-based [4], nonparametric Kalman filtering-based [5], [6]. Then a method for multipitch analysis called Harmonic-Temporal Clustering (HTC) was proposed [7] to deal with the harmonic and temporal structures in both time and frequency directions and shows high performance.…”

Section: Introduction (mentioning)

confidence: 99%

Flexible Harmonic Temporal Structure for Modeling Musical Instrument

Kitano

Nishimoto

et al. 2010

Lecture Notes in Computer Science

Abstract. Multipitch estimation is an important and difficult problem in entertainment computing. In this paper a flexible harmonic temporal structure for modeling musical instrument was proposed for estimating pitch in real music. Unlike the previous research, the proposed model does multipitch estimation according to the specific characteristics of specific musical instrument and uses EM algorithm to estimate the parameters in the model. Through choosing parameters suitable for its own characters for specific instrument, the proposed model preponderated over the common model.

“…Until now, numerous multi-pitch detection methods have been reported not only in speech signal processing [1,2] but also in musical signal processing [3,4,5] and auditory scene analysis [6,7]. Chazan et al addressed a speech separation method by introducing a time warped signal model which allows a continuous pitch variations within a long analysis frame [1].…”

Section: Introduction (mentioning)

confidence: 99%

Multi-pitch detection algorithm using constrained Gaussian mixture model and information criterion for simultaneous speech

Kameoka,

Nishimoto,

Sagayama

2004

Speech Prosody 2004

In this paper, a co-channel multi-pitch detection algorithm is described. We suggest the importance of this when prosodic information is need to be extracted separately from respective F 0 patterns of concurrent utterances. Though temporal continuity of speech prosody should be considered, we discuss a process done independently on each single frame as the first step. A model of multiple harmonic structures is constructed with a mixture of tied Gaussian mixtures with which a single harmonic structure is modeled. Our algorithm enables to detect both a number of concurrent speakers, and each spectral envelope of underlying harmonic structure based on a maximum likelihood estimation of the model parameters using EM algorithm and an information criterion. It operates without a priori information of F 0 contours and a restriction of a number of speakers, and it also extracts accurate F 0 s as continuous values with simple procedures in spectral domain. Experiments showed our algorithm outperformed well-known cepstrum for both speech signals of a single speaker and simultaneous two speakers.
