2013
DOI: 10.1007/978-3-642-41190-8_25

Using Various Types of Multimedia Resources to Train System for Automatic Transcription of Czech Historical Oral Archives

Abstract: Historical spoken documents represent a unique segment of national cultural heritage. To open the large Czech Radio audio archive to the research community and the public, we have been developing a system that automatically transcribes the archive files, indexes them, and makes them searchable. The transcription of contemporary (1 or 2 decades old) documents is based on a lexicon and a statistical language model (LM) built from a large amount of recent texts available in electronic form. From the…

citations
Cited by 2 publications
(1 citation statement)
references
References 10 publications
“…Before doing it, we have to adapt the lexicons so that they better fit speech of previous historical epochs. For Czech, it has been already done [14].…”
Section: Discussion (mentioning)
confidence: 99%