2024
DOI: 10.4218/etrij.2023-0354

Spoken‐to‐written text conversion for enhancement of Korean–English readability and machine translation

HyunJung Choi,
Muyeol Choi,
Seonhui Kim
et al.

Abstract: The Korean language has written (formal) and spoken (phonetic) forms that differ in their application, which can lead to confusion, especially when dealing with numbers and embedded Western words and phrases. This fact makes it difficult to automate Korean speech recognition models due to the need for a complete transcription training dataset. Because such datasets are frequently constructed using broadcast audio and their accompanying transcriptions, they do not follow a discrete rule‐based matching pattern. …

citations
Cited by 2 publications
(1 citation statement)
references
References 19 publications
“…The use of high-quality and adequate data for addressed application tasks is key to achieve stable high performance. The tenth paper in this special issue [10], "Spoken-to-written text conversion for enhancement of Korean-English readability and machine translation" by Choi and others, addresses the problem that Korean text produced by automatic speech recognition is often not presented in the written but in the spoken form, particularly when including numeric expressions and English words. Consequently, frequent ambiguities occur in similar types of errors for automatic speech translation.…”
mentioning
confidence: 99%