2012
DOI: 10.1016/j.eswa.2011.06.058

DiSeg 1.0: The first system for Spanish discourse segmentation

Citations: cited by 22 publications (19 citation statements)
References: 15 publications
“…This article adopts the definition of discourse segment put forward by Tofiloski et al (2009: 77): ‘Discourse segmentation is the process of decomposing discourse into elementary discourse units (EDUs), which may be simple sentences or clauses in a complex sentence, and from which discourse trees are constructed’. Specifically, we use the criteria for discourse segmentation most used in Spanish described in da Cunha et al (2012b), da Cunha and Iruskieta (2010) and Iruskieta et al (2015). See, for instance, examples 1 and 2.…”
Section: Methods (mentioning)
confidence: 99%
“…First, data were manually extracted for sections, titles and moves (from the textual level). Second, the remaining data were extracted automatically (from the lexical and discourse levels), by using the following automatic Natural Language Processing (NLP) tools: a morphosyntactic analyzer (Freeling; Atserias et al, 2006), and a discourse segmentation system (DiSeg; da Cunha et al, 2012b).…”
Section: Methods (mentioning)
confidence: 99%
“…This corpus, compared to the one used by [28, 55, 56] for Basque, contains 40 additional texts, as we included 2 new domains (economy and computer science). The size—140 texts—is similar to or larger than others created for similar aims, such as [40] (9 texts) and [44] (20 texts) for segmentation, and [60] (32 texts) and [11] (100 texts) for CU detection. The corpus in Table 2 was randomly divided into 3 non-overlapping datasets: 84 texts as the training set, 28 texts as the development set and 28 texts as the test set (Table 3).…”
Section: Methods (mentioning)
confidence: 99%
“…There are several ways of pursuing the automatic segmentation task; using rule based techniques as in: i) [28] for Basque, ii) [44] for Spanish, and iii) [40] for English. Using machine-learning techniques, for example, perceptron, as in [45] for French.…”
Section: Related Work (mentioning)
confidence: 99%
“…One possible significant contribution could be in the field of machine translation. In fact, there have been attempts to develop computer programs capable of analyzing rhetorical structure automatically based on banks of manmade analyses (e.g., da Cunha, San Juan, Torres-Moreno, Cabré, & Sierra, 2012; da Cunha, San Juan, Torres-Moreno, Lloberese, & Castellóne, 2012). If a machine translation system had as its basis the replacement of equivalent rhetorical structures between a source and a translated text, the end products would likely be more satisfactory than they are at present.…”
Section: Discourse Studies and Translation (mentioning)
confidence: 99%