DOI: 10.14232/phd.2291

Uncertainty Detection in Natural Language Texts

Abstract: Uncertainty detection is essential for many NLP applications. For instance, in information retrieval, it is of primary importance to distinguish among factual, negated and uncertain information. Current research on uncertainty detection has mostly focused on the English language; in contrast, here we present the first machine learning algorithm that aims at identifying linguistic markers of uncertainty in Hungarian texts from two domains: Wikipedia and news media. The system is based on sequence labeling and m…

Cited by 35 publications
(30 citation statements)
References 63 publications
“…The uncertainty annotation of this text differed greatly from our corpus of Hungarian Wikipedia articles and news (Vincze, 2014), which domains are much closer to standard language use. Table 1 shows the distribution of the different types of uncertainty cues in these domains.…”
Section: Uncertainty In Hungarian Webtext (mentioning)
confidence: 93%
“…As I mentioned earlier in Chapter 1, there is a plethora of work on uncertainty annotation for English (Rubin, 2007; Szarvas et al, 2008; Saurí and Pustejovsky, 2009; Matsuyoshi et al, 2010; Farkas et al, 2010; Rubinstein et al, 2013; Wei et al, 2013), Japanese (Hendrickx et al, 2012), Chinese (Cui and Chi, 2013), Portuguese (Hendrickx et al, 2012; Avila and Mello, 2013), and Hungarian (Vincze, 2014). However, prior to my own preliminary work (Al-Sabbagh et al, 2014a) and the work I presented in this chapter, there are no Arabic uncertainty-annotated corpora, to the best of my knowledge.…”
Section: Related Work (mentioning)
confidence: 99%
“…There is a plethora of work on English (Matsuyoshi et al, 2010; Prabhakaran, 2010; Rubin, 2007; Rubinstein et al, 2013; Ruppenhofer and Rehbein, 2012; Saurí and Pustejovsky, 2009; Szarvas et al, 2008; Tang et al, 2010; Vincze et al, 2011; Wei et al, 2013), French (Goujon, 2009), Portuguese (Hendrickx et al, 2012; Avila and Mello, 2013), and Swedish (Mowery et al, 2012), yet nothing for agglutinative morphologically-rich languages except for Hungarian (Vincze, 2014). One reason for the little research on agglutinative morphologically-rich languages with a flexible word order is the lack of uncertainty-annotated corpora for such languages.…”
Section: Introduction (mentioning)
confidence: 99%
“…Such tasks include information extraction, question answering, medical information retrieval, opinion detection, sentiment analysis (Karttunen and Zaenen, 2005; Vincze, 2014a; Díaz et al, 2016) and knowledge base population (KBP). In KBP, we need to distinguish, e.g., "X may be Basque" and "X was rumored to be Basque" (uncertain) from "X is Basque" (certain) to decide whether to add the fact "Basque(X)" to a knowledge base.…”
Section: Introduction (mentioning)
confidence: 99%