2018
DOI: 10.1007/s10936-018-9559-6

Coreferential Relations in Basque: The Annotation Process

Abstract: In this paper we present the coreferential tagging of part of the EPEC Corpus of Basque. Although coreference is a pragmatic linguistic phenomenon highly dependent on the situational context, it shows some language-specific patterns that vary according to the features of each language. Due to the fact that Basque is not an Indo-European language, it differs considerably in grammar from the languages spoken in surrounding areas. We will explain these features and the decisions made in each case. After describin…

Cited by 6 publications (7 citation statements)
References 11 publications
“…For the next experiments two corpora for coreference resolution are used, the EPEC-KORREF corpus (Ceberio et al, 2018) for Basque, the target language, and the OntoNotes English corpus (Hovy et al, 2006).…”
Section: Corpora (mentioning)
confidence: 99%
“…For the next experiments two corpora for coreference resolution are used, the EPEC-KORREF corpus (Ceberio et al, 2018) for Basque, the target language, and the OntoNotes English corpus (Hovy et al, 2006). … OntoNotes corpus is an English corpus with text from a variety of domains of more than one million words, with annotated mentions and coreferential relations. We used only newswire (nw), and broadcast news (bn) sets, avoiding conversation sets, in order to have texts of the same domain (around 825K words and 100K mentions).…”
Section: Corpora (mentioning)
confidence: 99%
“…With reference to the subtask of mention detection, in this section we establish what mentions we regard as potential ones to be included in a coreference chain based on a linguistically motivated mention classification presented in [37].…”
Section: Mention Structures In Basque (mentioning)
confidence: 99%
“…The starting point of the preprocessing is the EPEC-KORREF corpus (Ceberio et al, 2018), which has 45,000 words and 12,000 mentions. In this corpus, mentions and coreference relations are manually annotated, along with many other features that may be useful for coreference resolution.…”
Section: Preprocessing (unclassified)