2022 | DOI: 10.1121/10.0010257
Lightweight deep convolutional neural network for background sound classification in speech signals

Abstract: Recognizing background information in human speech signals is extremely useful in a wide range of practical applications, and many articles on background sound classification have been published. The task has not, however, been addressed with background sounds embedded in real-world human speech signals. This work therefore proposes a lightweight deep convolutional neural network (CNN) in conjunction with spectrograms for efficient background sound classification on practical human speech signals. The propo…
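As a rough illustration of the kind of pipeline the abstract outlines (spectrogram features fed to a small CNN classifier), the following is a minimal PyTorch sketch. The input shape (1 x 64 x 64 log-mel patches), the layer widths, and the 10-way class count are assumptions made for this example; they are not the architecture or dataset reported in the paper.

import torch
import torch.nn as nn

class LightweightCNN(nn.Module):
    """Small CNN over log-mel spectrogram patches (illustrative sizes, not the paper's)."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # low-level time-frequency filters
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.AdaptiveAvgPool2d(1),                     # global pooling keeps the model light
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Usage: a batch of 4 single-channel 64x64 spectrogram patches.
model = LightweightCNN(n_classes=10)
logits = model(torch.randn(4, 1, 64, 64))
print(logits.shape)  # torch.Size([4, 10])

Global average pooling before the linear head keeps the parameter count low, which is the main design choice that makes such a model "lightweight" enough for on-device use.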

Cited by 7 publications (3 citation statements) | References 29 publications
“…All of these will require detailed investigation and are beyond the scope of this work. Additionally, in a real-world deployment, as the system makes the transition from unguided to semi-guided or guided models over time, it can identify different background sounds using sound classification approaches [60], [61], [62] and retrain the initial unguided cough model to obtain more robust models that utilize relevant background sounds, following approaches similar to federated learning, environment knowledge broadcast among users, and place discovery [63], [64], [65], [66], [67], [68]. These are beyond the scope of this manuscript.…”
Section: Discussion
confidence: 99%
“…(1) the input is provided by raw audio data recorded by two hydrophones, which allows it to perform joint feature learning with passive whales, avoiding manual feature selection. Meanwhile, (2) an end-to-end data-driven approach brings us the possibility to capture more complex spatiotemporally correlated latent features of the two hydrophones through the main convolution operation (Chen and Schmidt, 2021;Dayal et al, 2022).…”
Section: Training Process
confidence: 99%
“…CNNs also show great potential when used on spectrogram data based on time-frequency representations of spatiotemporal signals. This has been applied to audio signals (Dayal et al, 2022) and speech recognition (Badshah et al, 2017), and even in the context of gait classification (Jung et al, 2019). In clinical contexts with PD, recent studies have used this approach on EEG data (Khare et al, 2021; …).…”
Section: Introduction
confidence: 99%