2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013
DOI: 10.1109/icassp.2013.6639278

System combination and score normalization for spoken term detection

Abstract: Spoken content in languages of emerging importance needs to be searchable to provide access to the underlying information. In this paper, we investigate the problem of extending data fusion methodologies from Information Retrieval for Spoken Term Detection on low-resource languages in the framework of the IARPA Babel program. We describe a number of alternative methods improving keyword search performance. We apply these methods to Cantonese, a language that presents some new issues in terms of reduced resourc…

Cited by 49 publications (50 citation statements)
References 27 publications

“…The specific method used in this work, which is described in more detail below, is to (1) apply sum-to-one normalization to each postings list, (2) combine the results using MTWV-weighted CombMNZ fusion, and finally (3) apply sum-to-one normalization to the fused postings list to produce the final output. Experiments that led to this particular strategy are described in detail in [24].…”
Section: Score Normalization and System Combination for Keyword Search (mentioning)
confidence: 99%
“…Sum-to-one normalization does this, using the sum of the query detection scores as a proxy for query frequency. System combination is performed using an MTWV-weighted version [24] of the CombMNZ method [27]. First, detection scores in each postings list are weighted in proportion to the list's MTWV score on the tuning portion of the development set.…”
Section: Score Normalization and System Combination for Keyword Search (mentioning)
confidence: 99%
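The three steps quoted above (sum-to-one normalization, MTWV-weighted CombMNZ, then sum-to-one normalization again) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation. It assumes each postings list is a dict keyed by a hypothetical `(keyword, hit_id)` pair, that hits from different systems have already been aligned to shared hit ids, and that all scores are positive. The function names are invented for this example.

```python
from collections import defaultdict


def sum_to_one(postings):
    """Divide each score by the total score of its keyword within this list."""
    totals = defaultdict(float)
    for (keyword, _hit), score in postings.items():
        totals[keyword] += score
    return {
        key: score / totals[key[0]]
        for key, score in postings.items()
        if totals[key[0]] > 0
    }


def weighted_comb_mnz(lists, mtwvs):
    """Weight each list in proportion to its MTWV, sum the weighted scores
    per hit, and multiply by the number of systems that returned the hit."""
    total = sum(mtwvs)
    weights = [m / total for m in mtwvs]
    summed = defaultdict(float)
    count = defaultdict(int)
    for postings, weight in zip(lists, weights):
        for key, score in postings.items():
            summed[key] += weight * score
            count[key] += 1
    return {key: count[key] * summed[key] for key in summed}


def fuse(lists, mtwvs):
    """Normalize each list, fuse, then normalize the fused list."""
    normalized = [sum_to_one(p) for p in lists]
    return sum_to_one(weighted_comb_mnz(normalized, mtwvs))


# Hypothetical example: two systems, one keyword, made-up scores and MTWVs.
system_a = {("kw1", "h1"): 0.6, ("kw1", "h2"): 0.2}
system_b = {("kw1", "h1"): 0.5}
fused = fuse([system_a, system_b], mtwvs=[0.3, 0.1])
```

Because the last step renormalizes per keyword, only the ratio between the MTWV weights affects the fused scores, not their absolute scale.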
“…By fusing KWS results from diverse systems, we can usually get a much better KWS result. For fusing results of different systems, arithmetic-based fusion methods such as CombSum [1,2], CombMNZ [1,2], CombGMNZ [1], WCombMNZ [2] have been proved to be quite effective. Pham et al. [3] proposed the system and keyword dependent fusion method SKDWCombMNZ in 2014, which outperformed other arithmetic-based methods.…”
Section: Introduction (mentioning)
confidence: 99%
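The quoted statement names CombSum and CombMNZ without defining them. As a hedged illustration using their usual information-retrieval definitions: CombSum adds the scores that the systems assign to a hit, and CombMNZ multiplies that sum by the number of systems that returned the hit. The helper names and the use of `None` for "system did not return this hit" are assumptions made for this sketch.

```python
def comb_sum(scores):
    """Sum the scores from the systems that returned the hit.

    `scores` holds one entry per system; None means the system missed the hit.
    """
    return sum(s for s in scores if s is not None)


def comb_mnz(scores):
    """CombSum multiplied by the number of systems that returned the hit."""
    present = [s for s in scores if s is not None]
    return len(present) * sum(present)


# Hypothetical hit found by systems 1 and 3 but not by system 2.
hit_scores = [0.5, None, 0.25]
summed = comb_sum(hit_scores)
boosted = comb_mnz(hit_scores)
```

CombMNZ therefore rewards agreement: a hit reported by several systems is boosted relative to one with the same summed score from a single system.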
“…For the two measures, score normalization [2,7,8] has been proved to be essential. Keyword specific threshold (KST) normalization [9] and sum-to-one (STO) normalization [2] are the two mainstream score normalization methods. In our work, we compare the performance of the two methods when they are applied both before and after system fusion.…”
Section: Introduction (mentioning)
confidence: 99%
“…For target languages that are rare or spoken only in regions where collection is difficult or even impossible, LRs can be scarce or non-existent, imposing major, often irresolvable, constraints on training acoustic models in these languages. In dealing with under-resourced or zero-resource languages, research is focused on finding robust KWS solutions using techniques such as subspace-GMM acoustic modeling, [25] multiple system combination and score normalization, [26] and bootstrapping techniques utilizing multilanguage acoustic models and neural networks. [9,27,28] When adapting a PS KWS system to process data in an under-resourced language, it is possible to bypass the long and costly training process by utilizing phoneme acoustic models from accessible and well-trained languages.…”
Section: Keyword Spotting in Under-resourced Languages (mentioning)
confidence: 99%