2006
DOI: 10.1007/s10791-006-9017-1

Learning-based summarisation of XML documents

Abstract: Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper, we present an approach for the summarisation of XML documents. The novelty of this approach lies in that it is based on features not only from the content of documents, but also from their logical structure. We follow a machine learning, sentence extraction-based summarisation technique. To find which features are more effective for producing summaries, this approach views sentence ext…

Citations: cited by 11 publications (4 citation statements)
References: 29 publications
“…Amini et al [2] aim at extracting the most important sentences of an XML document, using the structure of the text as additional features. Ramanath et al [16,17] propose methods for summarizing tree-structured XML documents within a constrained budget, focusing on data-centric XML such as IMDB or DBLP.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Amini et al [2] aim at extracting the most important sentences of an XML document, using the structure of the text as additional features. Ramanath et al [16,17] propose methods for summarizing tree-structured XML documents within a constrained budget, focusing on data-centric XML such as IMDB or DBLP.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…The pervasive use of the XML format on the World Wide Web has motivated much research in the area of XML document retrieval, considering both content and structure of documents leading to structure-aware retrieval [13,14]. The nesting level in an XML tree is an example of a structural feature used to express the degree of relevance of a keyword [15,16]. While the efforts in the area of XML document retrieval do not deal with the unique characteristics of presentation slides, they motivate the incorporation of structural features, such as indentation depth, in slide retrieval.…”
Section: Overview Of Contributions and Related Work (citation type: mentioning)
confidence: 99%
“…However, the use of XML markup in text documents to improve summarization quality has been previously studied [1]. These recent techniques still deal with document-oriented XML - text documents augmented with XML markup - and not data-oriented XML where markup is used as additional input to the document summarization process.…”
Section: Related Work (citation type: mentioning)
confidence: 99%