2008
DOI: 10.1109/icassp.2008.4518607

On-demand new word learning using world wide web

Abstract: Most of the Web-based methods for lexicon augmenting consist in capturing global semantic features of the targeted domain in order to collect relevant documents from the Web. We suggest that the local context of the out-of-vocabulary (OOV) words contains relevant information. We propose to use the Web to build locally-augmented lexicons which are used in a final local decoding pass. Our experiments confirm the relevance of the Web for the OOV word retrieval. Different methods are proposed to retrieve the hypot…

Cited by 10 publications (10 citation statements)
References 5 publications
“…OOV word recovery techniques have used LVCSR hypothesis to query search engines on the World Wide Web (WWW) [6][7][8]. From the retrieved documents, target OOV candidates are chosen using phone sequences observed in the pre-identified OOV region [6,8] or using words adjacent to the OOV region [7]. Vocabulary selection techniques use TF-IDF measures [9], frequency & recency of new words [10] or selection of all new PNs [11].…”
Section: Related Work (mentioning)
confidence: 99%
“…In-Vocabulary (IV) words hypothesised by Large Vocabulary Continuous Speech Recognition (LVCSR) are analysed for latent topic and lexical context, which then helps to retrieve relevant OOV PNs. The list of retrieved OOV PNs can now be used to recover target OOV PNs using phone matching [6], or additional speech recognition pass [7]; or spotting PNs in speech [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…Reported experiments showed a relative reduction of 58% in OOV word rate. The work presented in [Oger et al, 2008] suggests that the local context of the OOV words contains relevant information about them. Using that information and the Web, different methods were proposed to build locally-augmented lexicons which are used in a final local decoding pass.…”
Section: Vocabulary Selection/adaptation (mentioning)
confidence: 99%
“…The vocabulary is augmented with new words and their pronunciations. For example the Web has been used as a source from where to retrieve relevant OOV words [8] [9]. A selection algorithm can be devised using temporal or topical information, retrieved from either metadata or first-pass output.…”
Section: Introductionmentioning
confidence: 99%