8th European Conference on Speech Communication and Technology (Eurospeech 2003) 2003
DOI: 10.21437/eurospeech.2003-779

Using the web for fast language model construction in minority languages

Abstract: The design and construction of a language model for minority languages is a hard task. By minority language, we mean a language with small available resources, especially for the statistical learning problem. In this paper, a new methodology for fast language model construction in minority languages is proposed. It is based on the use of Web resources to collect and make efficient textual corpora. By using efficient filtering techniques, this methodology allows a quick and efficient construction of a language …

Cited by 10 publications (1 citation statement)
References 6 publications
“…Technology has been, and will continue to be, used to help spread linguistic knowledge and preserve indigenous languages. We have seen this occur in many fashions, such as the creation and use of dictionaries and corpora (Banea, Mihalcea, & Wiebe, 2008; Frawley, Hill, & Munro, 2002; Le, Bigi, Besacier, & Castelli, 2003), automatic speech recognition systems (Barnard, Davel, & van Heerden 2009; Barnard, Davel, van Huyssteen, 2010; Barnard, Davel, van Heerden, de Wet, and Badenhorst, 2014; Besacier et al, 2014; Le & Besacier, 2009), and natural language processing and language engineering (Aduriz et al, 2006; Elliott, Glauert, Kennaway, Marshall, & Safar, 2008; Sarasola, 2000; Streiter, Scannell, & Stuflesser, 2006). All of these technologies either can and have been used towards the development of CALL tools or are already used as CALL tools in their current form.…”
Section: Methods (mentioning)
confidence: 99%