Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015
DOI: 10.18653/v1/d15-1148

Detecting Content-Heavy Sentences: A Cross-Language Case Study

Abstract: The information conveyed by some sentences would be more easily understood by a reader if it were expressed in multiple sentences. We call such sentences content heavy: these are possibly grammatical but difficult to comprehend, cumbersome sentences. In this paper we introduce the task of detecting content-heavy sentences in cross-lingual context. Specifically we develop methods to identify sentences in Chinese for which English speakers would prefer translations consisting of more than one sentence. We base o…

Cited by 7 publications (6 citation statements)
References 25 publications
“…We further rearranged component texts so that the flow of events they depict comes in line with that of the complex sentence. We emphasize that in contrast to Li and Nenkova (2015), this study is not about identifying conditions under which people favor a split sentence.…”
mentioning
confidence: 79%
“…Failure in doing so may result in a translated sentence being hard to process for speakers in the target language. To understand whether this phenomenon is important enough for human and system translators to consider, I showed that sentences need to be translated into multiple English sentences cause significant quality drop for MT, while the number of words in them has little correlation with MT quality (Li, Carpuat, and Nenkova 2014; Li and Nenkova 2015a). From a translator's point of view, more than 15% of the sentences were translated into multiple sentences in English by at least three out of four human translators.…”
Section: Discourse Structure Variance
mentioning
confidence: 99%
“…To identify sentences that need a multi-sentence translation, I designed a system that achieves more than 80% accuracy (Li and Nenkova 2015a). Next I plan to explore methods to improve the flow of MT outputs for these sentences.…”
Section: Discourse Structure Variance
mentioning
confidence: 99%
“…While claim difficulty prediction appears novel in the context of fact-checking, we note that the value of modeling and predicting difficulty of different task instances is already recognized in other areas, such as machine translation [14,13], syntactic parsing [8], or search engine switching behaviors [29], among others. From this perspective, we argue instance difficulty prediction can bring similar value to fact-checking.…”
Section: Introduction
mentioning
confidence: 95%