2015
DOI: 10.1007/978-3-319-23132-7_13

Automatic Close Captioning for Live Hungarian Television Broadcast Speech: A Fast and Resource-Efficient Approach

Cited by 12 publications (8 citation statements)
References 5 publications
“…This dataset contains various genres (weather forecasts, news, conversations, magazines, sport). The dataset used for pre-training the character- and word-level models is a subset with manual transcription including punctuation containing 12M, 3M and 136k words for the train, validation and test sets, respectively [20]. The punctuation marks addressed in the experiments include commas, periods, question marks and exclamation marks.…”
Section: Data
Citation type: mentioning
confidence: 99%
“…We have overall 500 sentences and 8k word tokens in total. We use the Kaldi version of the ASR in [17] (with Kaldi decoder) by 6.8%, 10.1%, and 21.4% Word Error Rates (WER) on weather forecasts, broadcast news and sport news, respectively. For AP (automatic punctuation) we use the model from [10] and obtain F1-measures in the range of 60-70% on MT (manual transcript) and 45-50% on AT (ASR transcript).…”
Section: Datasets
Citation type: mentioning
confidence: 99%
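
The word error rates quoted in this statement follow the usual definition: the word-level edit distance (substitutions, deletions and insertions) between the reference and the recognizer output, divided by the number of reference words. A minimal Python sketch under that assumption is given below. The function name `word_error_rate` and the whitespace tokenization are illustrative choices, and this is not the scoring tool used in the cited work.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both strings are tokenized by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, a value computed this way can exceed 1.0 when the hypothesis is much longer than the reference.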
“…In left-marked style (+m), a subword is prefixed with a character to indicate that there was no word boundary directly preceding the subword. This style has been used for Turkish [5] and Hungarian [6,7]. In [7], it was shown to outperform word boundary tags.…”
Section: Boundary Markers
Citation type: mentioning
confidence: 99%
“…This style has been used for Turkish [5] and Hungarian [6,7]. In [7], it was shown to outperform word boundary tags. In right-marked style (m+), a suffix marker is added to a subword if there is no word boundary after it.…”
Section: Boundary Markers
Citation type: mentioning
confidence: 99%
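
The two boundary-marking styles described in these statements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker character `+`, the function names, and the example segmentation of the Hungarian word "házakban" are hypothetical choices and are not taken from the cited papers.

```python
def mark_left(words, marker="+"):
    """Left-marked style (+m): prefix every subword that is NOT
    directly preceded by a word boundary."""
    tokens = []
    for subwords in words:
        for i, subword in enumerate(subwords):
            tokens.append(subword if i == 0 else marker + subword)
    return tokens


def mark_right(words, marker="+"):
    """Right-marked style (m+): suffix every subword that is NOT
    directly followed by a word boundary."""
    tokens = []
    for subwords in words:
        last = len(subwords) - 1
        for i, subword in enumerate(subwords):
            tokens.append(subword + marker if i < last else subword)
    return tokens


def join_left(tokens, marker="+"):
    """Rebuild whole words from a left-marked token sequence."""
    words = []
    for token in tokens:
        if token.startswith(marker) and words:
            words[-1] += token[len(marker):]
        else:
            words.append(token)
    return words


# Hypothetical segmentation: each inner list is one word split into subwords.
segmented = [["ház", "ak", "ban"], ["van"]]
left_marked = mark_left(segmented)    # ['ház', '+ak', '+ban', 'van']
right_marked = mark_right(segmented)  # ['ház+', 'ak+', 'ban', 'van']
restored = join_left(left_marked)     # ['házakban', 'van']
```

In both styles the word boundaries stay recoverable from the token sequence alone, so recognizer output in subword units can be joined back into words without separate boundary tags.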