ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9414611
Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Networks

Abstract: Anomalous audio in speech recordings is often caused by speaker voice distortion, external noise, or even electrical interference. These obstacles have become a serious problem in some fields, such as high-quality dubbing and speech processing. In this paper, a novel approach using a temporal convolutional attention network (TCAN) is proposed to tackle this problem. The use of a temporal convolutional network (TCN) can capture long-range patterns using a hierarchy of temporal convolutional filters. To enhance the …
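The core building block the abstract describes is a stack of dilated temporal convolutions. Below is a minimal sketch of such a block, assuming PyTorch; the channel count, kernel size, causal padding, and residual connection are illustrative assumptions, not the authors' published TCAN architecture.

```python
# Minimal sketch of a temporal convolutional block with causal dilation.
# All sizes are illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        # Left-pad so the convolution is causal: the output at time t
        # only sees inputs at times <= t.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        out = nn.functional.pad(x, (self.pad, 0))
        out = self.relu(self.conv(out))
        return out + x  # residual connection keeps deep stacks trainable

# Stacking blocks with exponentially growing dilation (1, 2, 4, ...)
# is what lets a TCN capture long-range patterns with few layers.
tcn = nn.Sequential(*[TemporalBlock(64, dilation=2 ** i) for i in range(4)])
frames = torch.randn(8, 64, 200)   # (batch, feature dim, frames)
print(tcn(frames).shape)           # torch.Size([8, 64, 200])
```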

Cited by 8 publications (3 citation statements). References: 13 publications.
“…We used an attention mechanism for the classification tasks. This attention mechanism has been successfully demonstrated for a wide range of tasks (Huang and Hain 2021, Jung et al 2021). By allocating weights, the attention mechanism extracts the more significant features and increases their influence in the model's classification process.…”
Section: Attention Mechanism
confidence: 94%
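The statement above describes attention as a learned weighting over frame-level features. A minimal sketch of that idea follows, assuming PyTorch and a single linear scoring layer (an illustrative choice, not the cited papers' exact design).

```python
# Sketch of attention pooling: score each frame, softmax the scores into
# weights, and take the weighted sum as the clip-level representation.
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar relevance score per frame

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, time, dim) frame-level features
        w = torch.softmax(self.score(h), dim=1)  # (batch, time, 1) weights
        return (w * h).sum(dim=1)                # (batch, dim) pooled vector

pool = AttentionPool(64)
clip_vector = pool(torch.randn(8, 200, 64))  # feed to a classifier head
```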
“…This reduces the amount of computation and simplifies the network structure. To address the limitation of traditional neural networks, in which time-series modeling can only be extended by linearly stacking convolutional layers, TCN uses dilated convolutions to enlarge the receptive field of each layer and thereby reduce the number of layers required [27]. The network structure for four convolution kernels and a dilation factor of one is shown in Figure 2.…”
Section: Deep Learning Models
confidence: 99%
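The layer-saving effect described above can be made concrete by comparing receptive fields. A small sketch, assuming kernel size 3 and exponentially growing dilations (both illustrative):

```python
# Receptive field of a stack of 1-D convolutions: each layer adds
# (kernel_size - 1) * dilation frames of temporal context.
def receptive_field(kernel_size: int, dilations: list[int]) -> int:
    return 1 + sum((kernel_size - 1) * d for d in dilations)

layers = 4
print(receptive_field(3, [1] * layers))                     # ordinary: 9
print(receptive_field(3, [2 ** i for i in range(layers)]))  # dilated: 31
```

With the same four layers, dilated convolutions cover 31 frames where ordinary convolutions cover only 9, which is why far fewer layers are needed for the same temporal range.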
“…To address the problem that traditional neural networks can only extend time-series modeling by linearly stacking multiple convolutions, TCN reduces the number of convolutional layers by using dilated convolutions to widen the receptive field of each layer [37]. Dilated convolution differs from ordinary convolution in that it samples the input at intervals, with the sampling interval determined by the dilation factor.…”
Section: Dilated Convolutions
confidence: 99%
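The interval sampling described above determines which input frames a dilated kernel actually reads. A tiny illustration, with all positions and parameters hypothetical:

```python
# Input positions a causal dilated convolution reads for output time t:
# t, t - d, t - 2d, ... for a kernel of size k and dilation d.
def sampled_indices(t: int, kernel_size: int, dilation: int) -> list[int]:
    return [t - i * dilation for i in range(kernel_size)]

print(sampled_indices(16, kernel_size=3, dilation=1))  # [16, 15, 14]
print(sampled_indices(16, kernel_size=3, dilation=4))  # [16, 12, 8]
```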