2021 IEEE 7th World Forum on Internet of Things (WF-IoT) 2021
DOI: 10.1109/wf-iot51360.2021.9596007

Environmental Sound Classification with Tiny Transformers in Noisy Edge Environments

Cited by 16 publications (13 citation statements)
References 18 publications
“…However, both studies required feature extraction for inference and achieved classification accuracies of less than 80%. Wyatt et al [34] and Elliott et al [9] developed transformer-based models applicable to edge devices and tested them on the Raspberry Pi Zero and Samsung Galaxy S9 devices, respectively. These on-device ESC studies with lightweight architectures demonstrated the potential for ESC on edge devices.…”
Section: Related Work (mentioning)
confidence: 99%
“…Regarding Pure Transformers, Elliott et al [19], Wyatt et al [20], and Devlin et al [25] explored the advantages of Bidirectional-Encoder-Representations-from-Transformers (BERT)-based models, based on the work of Vaswani et al [26], having as the input a given token summed with the position embeddings, in order to address the sound classification problem at the edge.…”
Section: Transformers For Audio Classification (mentioning)
confidence: 99%
“…More recently, attention mechanisms have been incorporated to focus on semantically important parts of the sound under study [13][14][15][16][17]. Lately, solutions based on attention models [11,18], particularly on Transformers [18][19][20][21][22], are being explored.…”
Section: Introduction (mentioning)
confidence: 99%
“…Cantarini et al [71] applied the harmonic percussive source separation technique to classify emergency siren sounds from road noise sounds. Wyatt et al [72] deployed a BERT-based environmental sound classification model on an RPi Zero to identify six different everyday sounds (Knock, Laugh, Keyboard Typing, Cough, Keys Jangling and Snap). All three studies mentioned above described standalone audio capture and recognition systems for varied environmental settings and did not describe a multimodal data acquisition system.…”
Section: Related Work (mentioning)
confidence: 99%