2022
DOI: 10.3390/ani12182434

An Efficient Model for a Vast Number of Bird Species Identification Based on Acoustic Features

Abstract: Birds have been widely considered crucial indicators of biodiversity. It is essential to identify bird species precisely for biodiversity surveys. With the rapid development of artificial intelligence, bird species identification has been facilitated by deep learning using audio samples. Prior studies mainly focused on identifying several bird species using deep learning or machine learning based on acoustic features. In this paper, we proposed a novel deep learning method to better identify a large number of …

Cited by 22 publications (8 citation statements)
References 60 publications
“…For instance, Xie improved ELM by using differential evolution to classify MFCC features of nine birdsongs, with a maximum accuracy of 89.05% [46]. Wang et al. fused Mel-spectrogram and MFCC as input features and used LSTM to recognize 264 birdsongs, with an average accuracy of 77.43% [47]. Murugaiya et al. combined the improved GTCC feature with probability enhanced entropy to classify twenty bird sounds in Borneo using SVM, with an accuracy of 89.5% [48].…”
Section: Discussion (mentioning)
confidence: 99%
“…In Tanttu et al. (2003), tracking of the first harmonic components of the spectrogram is used to extract the characteristics and SOM (Unsupervised Learning) for the classification. Wang et al. (2022) performs the Melspectrogram and MFCC of the songs, these are the inputs for the deep learning model, Long short-term memory (LSTM) is a variant of recurrent neural networks (RNN), which integrates specific gates to recover the short-term or long-term context of the input, LSTM is used to extract features and classify the song. In Zhang et al. (2019), the 3-D convolution kernels of the CNN were used to extract both positional and temporal characteristics from the Melspectrogram and with this improve the classification.…”
Section: Introduction (mentioning)
confidence: 99%
“…In recent years, advances in machine learning have led to the proposal of deep learning-based bird classification methods, including audio features for classification and recognition [5][6][7] and image features [8][9][10]. Among them, bird classification methods relying on audio features are easily disrupted by environmental noise, necessitating audio denoising as a prerequisite in practical applications.…”
Section: Introduction (mentioning)
confidence: 99%