Abstract: Local coherence is an important aspect of multi-document summarization, since good summaries not only condense the most relevant information but also present it in a well-organized structure. One of the most investigated models of local coherence is the Entity-based model, which has been used successfully because it makes coherence measurement computationally tractable. In particular, this model has been used to evaluate local coherence in multi-document summaries, achieving promising results. To improve the Entity-based model, we propose a language model for multi-document summaries that integrates the Entity-based model with discourse knowledge, mainly from Cross-document Structure Theory. Our results show that this type of information enriches the Entity-based model by capturing other phenomena inherent to multi-document summaries, such as redundancy and complementarity, which improves the performance of the original model.
Multi-document summarization is an important area of Natural Language Processing (NLP) because of the huge amount of data on the web. People want more and more information, and this information must be coherently organized and summarized. This paper focuses on the coherence of multi-document summaries. To this end, a model that uses discourse information to automatically evaluate local coherence in multi-document summaries has been developed. The model achieves 92.69% accuracy in distinguishing coherent from incoherent summaries, outperforming the state of the art in the area.
Aspect-based opinion summarization is the task of automatically generating a summary of some aspects of a specific topic from a set of opinions. In most cases, evaluating the quality of automatic summaries requires a reference corpus of human summaries against which they can be compared. The scarcity of corpora for this task has been a limiting factor for many research works. In this paper, we introduce OpiSums-PT, a corpus of extractive and abstractive summaries of opinions written in Brazilian Portuguese. We use this corpus to analyze how similar human summaries are to one another and how people handle aspect coverage and sentiment orientation when writing manual summaries. The results of these analyses show that human summaries are diverse and that people generate summaries for only some aspects, keeping the overall sentiment orientation with little variation.