2022
DOI: 10.11591/ijece.v12i2.pp1477-1487

Robust cepstral feature for bird sound classification

Abstract: Birds are excellent environmental indicators and may indicate the sustainability of an ecosystem; they may also provide provisioning, regulating, and supporting services. Research on bird conservation therefore receives considerable attention. Because birds are airborne and tropical forest is dense, identifying birds by audio may be a better solution than visual identification. The goal of this study is to find the most appropriate cepstral features that can be used …

Citations: cited by 15 publications (4 citation statements)
References: 27 publications
“…Commonly, a single metric, the F1 score, which combines image precision and recall [23,24], is used as a metric. F1 score is defined as the harmonic mean between precision and recall and may be determined using…”
Section: CIII Automated Cell Counting Of Histology Images
Citation type: mentioning
confidence: 99%
“…In recent years various researchers have proposed deep learning-based methods for activity sequence learning, which is inspired by the effectiveness of deep learning in applications, including video captioning [36], audio recognition [37], neural machine translation [38], image recognition [39], and speech recognition [40][41] [42]. Other deep learning approaches based on skeletal data include Recurrent Neural Network (RNN) [5][43], Convolutional Neural Network (CNN) [44], and Graph Convolutional Network (GCN) [45].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Time windows of multiple durations and varying resolutions have been considered for the segmentation and feature representations. These features have been individually fed into a Support Vector Machine (SVM) classifier [29] for native language identification. Performances of the different features, effects of time window duration, and resolutions have been compared.…”
Section: Speech Features Analysis
Citation type: mentioning
confidence: 99%
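
The first citation statement above describes the F1 score as the harmonic mean of precision and recall, but the quoted text is cut off before the formula. The snippet below is a minimal sketch of the standard textbook definition, not the citing paper's own code; the function name `f1_from_counts` and the true-positive, false-positive, and false-negative count arguments are assumptions made for illustration.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score computed from raw detection counts.

    tp, fp, fn are hypothetical inputs: true positives, false positives,
    and false negatives. Zero denominators are treated as a score of 0.0,
    which is a convention chosen here, not something stated in the source.
    """
    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Example with made-up counts: 8 correct detections, 2 false alarms, 2 misses.
    print(f1_from_counts(8, 2, 2))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a classifier cannot reach a high F1 by maximizing only precision or only recall.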