Proceedings of the Fourth Italian Conference on Computational Linguistics CLiC-it 2017
DOI: 10.4000/books.aaccademia.2360

Toward a bilingual lexical database on connectives: Exploiting a German/Italian parallel corpus

Abstract: We report on experiments to validate and extend two language-specific connective databases (German and Italian) using a word-aligned corpus. This is a first step toward constructing a bilingual lexicon on connectives that are connected via their discourse senses.

Citations: cited by 13 publications (21 citation statements)
References: 4 publications
“…From a methodological viewpoint, lexical description and corpus-based research can mutually benefit from each other: as pointed out in Section 2.2, several approaches have been proposed to derive connective lists (which can be the starting point for a lexicon) from mono- or bilingual corpora (e.g., Versley, 2010); likewise, an existing list or lexicon can be verified and extended by mining distributionally similar words from corpora (e.g., Bourgonje et al., 2017). Moving beyond this initial phase of creating inventories, features of lexical description can be tested with corpora and lexical entries in turn be improved.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…In similar ways, also starting from existing labeled data for English, Zhou and Xue (2012) built a list of Chinese connectives, Hajlaoui and Popescu-Belis (2012) an Arabic one, and Laali and Kosseim (2014) a French one. Recently, Bourgonje et al. (2017) used Europarl to extend the information in the existing German and Italian lexicons (DiMLex, LICO): they found additional connectives that ought to be added to the respective lexicons, and they also studied the mapping between the annotated senses, in order to find areas of overlap in readings, and to compare the degrees of connective ambiguity in the two languages. Feltracco et al., 2016), and Portuguese (LDM-PT; .…”
Section: Generating Lists Of Connectives
Citation type: mentioning
confidence: 99%
“…Another approach for automatic generation of discourse connective lexicons is by translational mapping between parallel corpora, which we are pursuing in ongoing work (Bourgonje et al., 2017), following up on earlier studies such as that of Cartoni et al. (2013). We hope to use this approach to identify additional connectives for DiMLex-Eng as well as establish and enhance correspondences between DiMLex-Eng and other similar connective lexicons.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Nonetheless, this procedure can significantly speed up the process of "bootstrapping" a lexicon (and, by running the process in the opposite direction, also help to validate the source lexicon). For the case of mapping German to Italian connectives (and backwards), the process is explained by Bourgonje et al. (2017). Since the meaning of the corresponding connectives in the two texts can be expected to be similar, the senses can also be mapped from the S lexicon to the T one, for a start.…”
Section: Acquiring the Set Of Lexical Items
Citation type: mentioning
confidence: 99%
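
The bootstrapping idea in the last statement (project known source-language connectives through word alignments, then carry their senses over to the target lexicon) can be illustrated with a minimal Python sketch. It is not the procedure of Bourgonje et al. (2017): the function names, the frequency threshold `min_count`, the toy sentences, the sense labels, and the restriction to single-token connectives are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def project_connectives(aligned_pairs, source_lexicon, min_count=2):
    """Count the target words aligned to each known source connective.

    aligned_pairs: iterable of (source_tokens, target_tokens, alignment),
        where alignment is a list of (source_index, target_index) pairs.
    source_lexicon: dict mapping a source connective to a set of sense labels.
    min_count: hypothetical threshold to drop rare (likely noisy) alignments.
    """
    counts = defaultdict(Counter)
    for src_tokens, tgt_tokens, alignment in aligned_pairs:
        for i, j in alignment:
            src = src_tokens[i].lower()
            if src in source_lexicon:
                counts[src][tgt_tokens[j].lower()] += 1
    return {
        src: Counter({t: c for t, c in tgt.items() if c >= min_count})
        for src, tgt in counts.items()
    }


def bootstrap_target_lexicon(candidates, source_lexicon, target_lexicon):
    """Split projected target words into validated and new entries.

    Each target word inherits the senses of the source connectives it was
    aligned to ("for a start"; the senses would still need manual checking).
    """
    validated, new = {}, {}
    for src, tgt_counts in candidates.items():
        for tgt in tgt_counts:
            bucket = validated if tgt in target_lexicon else new
            bucket.setdefault(tgt, set()).update(source_lexicon[src])
    return validated, new


# Toy German/Italian data (invented, not taken from Europarl).
pairs = [
    (["er", "kam", "aber", "sie", "ging"],
     ["lui", "venne", "ma", "lei", "andò"],
     [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]),
    (["aber", "nein"], ["ma", "no"], [(0, 0), (1, 1)]),
    (["weil", "es", "regnet"], ["perché", "piove"], [(0, 0), (2, 1)]),
    (["weil", "er", "will"], ["perché", "vuole"], [(0, 0), (2, 1)]),
]
source_lexicon = {"aber": {"contrast"}, "weil": {"cause"}}  # simplified senses
target_lexicon = {"ma"}  # pretend "perché" is missing from the target lexicon

candidates = project_connectives(pairs, source_lexicon)
validated, new = bootstrap_target_lexicon(candidates, source_lexicon, target_lexicon)
print("validated:", validated)
print("new candidates:", new)
```

Swapping the roles of the two languages (Italian as source, German as target) gives the reverse run that the statement mentions as a way to validate the source lexicon. Multi-word connectives and non-connective uses of a word would need additional handling that this sketch leaves out.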