2019
DOI: 10.1186/s13636-019-0156-x

Search on speech from spoken queries: the Multi-domain International ALBAYZIN 2018 Query-by-Example Spoken Term Detection Evaluation

Abstract: The huge amount of information stored in audio and video repositories makes search on speech (SoS) a priority area nowadays. Within SoS, Query-by-Example Spoken Term Detection (QbE STD) aims to retrieve data from a speech repository given a spoken query. Research on this area is continuously fostered with the organization of QbE STD evaluations. This paper presents a multi-domain internationally open evaluation for QbE STD in Spanish. The evaluation aims at retrieving the speech files that contain the queries,…

Cited by 4 publications
(2 citation statements)
References 72 publications
“…These include THUMOS 14 (Jiang et al, 2014) as well as ActivityNet 1.2 and ActivityNet 1.3 challenges (Fabian Caba Heilbron and Niebles, 2015). Another example is query-by-example spoken term detection, as considered e.g., in ALBAYZIN 2018 challenge (Tejedor et al, 2019).…”
Section: Review Of Existing Datasets (mentioning)
confidence: 99%
“…However, this approach cannot meet the requirements of speed and quality at the same time in practical applications. Thus, to avoid the decoding process of ASR, some methods [4][5][6] directly use the acoustic modeling part of ASR model to extract the features of audio signals, and then compare these features of different lengths by dynamic time warping (DTW) [7].…”
Section: Introduction (mentioning)
confidence: 99%
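
The comparison step mentioned in the last excerpt, aligning two feature sequences of different lengths with dynamic time warping (DTW), can be illustrated with a minimal sketch. This is the textbook full-sequence DTW recurrence, not the method of the cited paper or of any ALBAYZIN 2018 system. The function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional "features" are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    a, b: non-empty lists of equal-dimension vectors (lists of floats).
    The frame-level distance is Euclidean (an illustrative assumption).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = math.inf
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing

    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: vertical, horizontal, diagonal step.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr

    return prev[m]


# Toy example: the second sequence repeats one frame, so the warping
# path can absorb the repetition at no extra cost.
query = [[0.0], [1.0], [2.0]]
utterance = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(query, utterance))
```

In query-by-example search the query is usually much shorter than the utterance, so practical systems tend to use a subsequence variant of DTW that lets the match start and end anywhere in the utterance. The sketch above aligns the two sequences end to end and is meant only to show the recurrence.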