Proceedings of the ACL Student Research Workshop - ACL '05, 2005
DOI: 10.3115/1628960.1628982

Learning information structure in the Prague treebank

Abstract: This paper investigates the automatic identification of aspects of Information Structure (IS) in texts. The experiments use the Prague Dependency Treebank which is annotated with IS following the Praguian approach of Topic Focus Articulation. We automatically detect t(opic) and f(ocus), using node attributes from the treebank as basic features and derived features inspired by the annotation guidelines. We show the performance of C4.5, Bagging, and Ripper classifiers on several classes of instances such as noun…

Citations: cited by 1 publication (1 citation statement)
References: 11 publications
“…The interest for corpora annotated with information structure has been raised recently by several authors. Kruijff-Korbayová and Kruijff (2004) describe a method where a rich discourse-level annotation is used to investigate information structure, while both Postolache (2005) and Diderichsen and Elming (2005) study the application of machine learning to the problem of automatic identification of topic and focus. In this study, on the contrary, information structure is annotated manually, and the annotation is used to investigate the correlation between information structure tags and intra-clausal pauses.…”
Section: Introduction (citation type: mentioning)
confidence: 99%