2020 28th European Signal Processing Conference (EUSIPCO) 2021
DOI: 10.23919/eusipco47968.2020.9287716

SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks

Cited by 53 publications (38 citation statements)
References 21 publications
“…Integrating the preprocessing into the network also increases inference speed due to the efficient implementation and hardware acceleration of deep-learning frameworks. Third and last, DAS learns a task-specific representation of sound features using temporal convolutional networks (TCNs) (Bai et al, 2018; van den Oord et al, 2016; Guirguis et al, 2021; Figure 1—figure supplement 1A–E). At the core of TCNs are so-called dilated convolutions (Yu and Koltun, 2016).…”
Section: Results (mentioning)
confidence: 99%
“…Integrating the preprocessing into the network also increases inference speed due to the efficient implementation and hardware acceleration of deep-learning frameworks. Third and last, DAS learns a task-specific representation of sound features using temporal convolutional networks (TCNs) (Bai et al, 2018; van den Oord et al, 2016; Guirguis et al, 2021) (Figure 1–Figure Supplement 1 A-E). At the core of TCNs are so-called dilated convolutions (Yu and Koltun, 2016).…”
Section: Results (mentioning)
confidence: 99%
“…In this survey paper, we do not address the problem of tracking on its own, which is usually done in a separate algorithm using the sequence of DoA estimates obtained by applying SSL on successive time windows [67]. Still, several deep-learning-based SSL systems are shown to produce more accurate localization of moving sources when they are trained on a dataset that includes this type of sources [68], [69], [70], [71]. In other cases, as the number of real-world datasets with moving sources is limited and the simulation of signals with moving sources is cumbersome, a number of systems are trained on static sources, but are also shown to retain fair to good performance on moving sources [63], [72].…”
Section: Moving Sources (mentioning)
confidence: 99%