What makes audio event detection harder than classification?

Phan, Huy; Koch, Philipp; Katzberg, Fabrice; Maass, Marco; Mazur, Radoslaw; McLoughlin, Ian; Mertins, Alfred

doi:10.23919/eusipco.2017.8081709

Cited by 21 publications

(16 citation statements)

References 23 publications


“…In surveillance SED architectures consist of additional background subtraction, object tracking & situational analysis processes in the pipeline [8]. Latest research of an improved pipeline suggest a verification step to reduce false positives after the SED process [3]. Figure 1 shows the extended SED pipeline.…”

Section: Sound Event Detection (mentioning)

confidence: 99%

“…The Sound event detection (SED) research field has been an active recently [1][2][3][4]. Autonomous audio surveillance has become more efficient as that artificial intelligence has stepped up the game [5].…”

Section: Introduction (mentioning)

confidence: 99%

“…Acoustic surveillance specifically in security application is still new to the world and requires research to allow better performances [4,7,8]. The detection on anomalous audio events effectively while avoiding false positives is crucial in security [3]. Two categories of sound recognition, non-speech the determination of the sound event source and speech recognition of verbal language [9][10].…”

Section: Introduction (mentioning)

confidence: 99%


Review of anomalous sound event detection approaches

Affendi, Yusoff. 2019. IJ-AI


This paper presents a review of anomalous sound event detection (SED) approaches. SED is becoming more applicable to real-world applications such as security, fire determination, or other emergency alarms. Despite many previous research outcomes, further research is required to reduce false positives and improve accuracy. SED approaches are organized by method, covering the system pipeline components of acoustic descriptors, classification engine, and decision finalization method. The review compares multiple approaches that are applied to a specific dataset. Security depends on finding anomalous events in order to prevent them. Audio surveillance has become more efficient as artificial intelligence has stepped up the game. Autonomous SED could be used for early detection and prevention. The state-of-the-art method found viable for SED uses log-mel energy features in a convolutional recurrent neural network (CRNN) with long short-term memory (LSTM) and a thresholding verification step, obtaining a 93.1% F1 score and 0.1307 ER. Feature extraction of log-mel energies is found to be a highly reliable method, showing promising results in multiple experiments.
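The abstract above reports an F1 score and an error rate (ER) without saying how they were computed. The sketch below assumes the segment-based definitions that are common in SED evaluation, where ER counts substitutions, deletions, and insertions per segment and divides by the number of active reference events. The function name `segment_metrics` and the example labels are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple


def segment_metrics(reference: List[Set[str]],
                    estimated: List[Set[str]]) -> Tuple[float, float]:
    """Return (F1, ER) over aligned segments.

    Each list entry is the set of event labels active in one segment.
    Assumption: segment-based scoring; the reviewed papers may use a
    different protocol (for example, event-based scoring).
    """
    if len(reference) != len(estimated):
        raise ValueError("reference and estimated must have the same length")

    tp = fp = fn = 0
    subs = dels = ins = 0
    n_ref = 0
    for ref, est in zip(reference, estimated):
        seg_tp = len(ref & est)   # labels detected correctly
        seg_fp = len(est - ref)   # labels detected but not present
        seg_fn = len(ref - est)   # labels present but missed
        tp += seg_tp
        fp += seg_fp
        fn += seg_fn
        # A miss paired with a false alarm in the same segment is a substitution.
        subs += min(seg_fp, seg_fn)
        dels += max(0, seg_fn - seg_fp)
        ins += max(0, seg_fp - seg_fn)
        n_ref += len(ref)

    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    er = (subs + dels + ins) / n_ref if n_ref else 0.0
    return f1, er


# Hypothetical four-segment example.
reference = [{"scream"}, {"scream", "glass"}, set(), {"gunshot"}]
estimated = [{"scream"}, {"scream"}, {"glass"}, {"scream"}]
f1, er = segment_metrics(reference, estimated)
print(f"F1 = {f1:.2f}, ER = {er:.2f}")
```

Under these definitions ER can exceed 1 when insertions are frequent, which is one reason a verification step aimed at false positives can lower ER noticeably.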


“…Both classification and detection is used in this work, the first study to distinguish take-off from landing and the second to precisely detect, in time, the occurrence of both events. Even though it has been accepted that audio event classification is easier to deal with than detection, recent works emphasize the importance of detection [18], which is of particular importance in this work. In fact, as discussed later, distinguishing take-off from landing will be straightforward, so more effort will be put on precise time detection.…”

Section: Methods (mentioning)

confidence: 99%

Audio-Based System for Automatic Measurement of Jump Height in Sports Science

Pueo, Lopez, Jiménez-Olmedo. 2019. Sensors


Jump height tests are employed to measure the lower-limb muscle power of athletic and non-athletic populations. The most popular instruments for this purpose are jump mats and, more recently, smartphone apps, which compute jump height through manual annotation of video recordings to extract flight time. This study developed a non-invasive instrument that automatically extracts take-off and landing events from audio recordings of jump executions. An audio signal processing algorithm, specifically developed for this purpose, accurately detects and discriminates the landing and take-off events in real time and computes jump height accordingly. Its temporal resolution theoretically outperforms that of flight-time-based mats (typically 1000 Hz) and high-speed video rates from smartphones (typically 240 fps). A validation study was carried out by comparing 215 jump heights from 43 active athletes, measured simultaneously with the audio-based system and with a validated, commercial jump mat. The audio-based system produced jump heights nearly identical to the criterion, with low and proportional systematic bias and random errors. The developed audio-based system is a trustworthy instrument for accurately measuring jump height that can be readily automated as an app to facilitate its use both in laboratories and in the field.
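The abstract above does not give the formula used to turn detected take-off and landing times into a jump height. A minimal sketch, assuming the standard flight-time relation h = g·t²/8 (take-off and landing at the same height, no air resistance), is shown here. The function name and the example timestamps are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def jump_height_from_events(takeoff_s: float, landing_s: float) -> float:
    """Estimate jump height in metres from take-off and landing times in seconds.

    Assumes the flight-time method: the body rises for half of the flight
    time, so h = g * (t / 2)**2 / 2 = g * t**2 / 8.
    """
    flight_time = landing_s - takeoff_s
    if flight_time < 0:
        raise ValueError("landing must not precede take-off")
    return G * flight_time ** 2 / 8.0


# Hypothetical detection result: take-off at 1.200 s, landing at 1.700 s.
height_m = jump_height_from_events(1.200, 1.700)
print(f"flight time 0.500 s -> height {height_m:.3f} m")
```

Because height grows with the square of flight time, small timing errors in event detection translate directly into height errors, which is why the temporal resolution of the detector matters for this application.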


“…The research was performed when H. Phan was at the University of Oxford and supported by the NIHR Oxford Biomedical Research Centre. Corresponding author: h.phan@kent.ac.uk reduction [22,7]. Particularly, the multitasking approach that jointly performs event detection and event boundary estimation [23,6,24] has demonstrated state-of-the-art performance on different benchmark datasets.…”

Section: Introduction (mentioning)

confidence: 99%

Unifying Isolated and Overlapping Audio Event Detection with Multi-label Multi-task Convolutional Recurrent Neural Networks

Phan, Chén, Koch, et al. 2019. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite


We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events. The framework leverages the power of convolutional recurrent neural network architectures; convolutional layers learn effective features over which higher recurrent layers perform sequential modelling. Furthermore, the output layer is designed to handle arbitrary degrees of event overlap. At each time step in the recurrent output sequence, an output triple is dedicated to each event category of interest to jointly model event occurrence and temporal boundaries. That is, the network jointly determines whether an event of this category occurs, and when it occurs, by estimating onset and offset positions at each recurrent time step. We then introduce three sequential losses for network training: multi-label classification loss, distance estimation loss, and confidence loss. We demonstrate good generalization on two datasets: ITC-Irst for isolated audio event detection, and TUT-SED-Synthetic-2016 for overlapping audio event detection.

