In this paper, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. Our method interactively obtains user feedback to gradually improve the results of a state-of-the-art integer linear programming (ILP) framework for MDS. Our methods complement fully automatic methods in producing high-quality summaries with a minimal number of iterations and feedback items. We conduct multiple simulation-based experiments and analyze the effect of feedback-based concept selection in the ILP setup in order to maximize the user-desired content in the summary.
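The interaction between concept weights and user feedback can be illustrated with a small sketch. It is not the authors' implementation. It assumes bigrams as concepts, sentence-frequency counts as initial weights, and a feedback rule that raises accepted concepts to the maximum weight and sets rejected ones to zero. An exhaustive search over sentence subsets stands in for the ILP solver, which is only practical for toy inputs. All function names are hypothetical.

```python
from itertools import combinations


def concepts(sentence):
    """Return the set of lowercased word bigrams (assumed concept unit)."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def initial_weights(sentences):
    """Weight each concept by the number of sentences that contain it."""
    weights = {}
    for s in sentences:
        for c in concepts(s):
            weights[c] = weights.get(c, 0) + 1
    return weights


def best_summary(sentences, weights, budget):
    """Pick the sentence subset that maximizes the total weight of covered
    concepts within a word budget. Brute force replaces the ILP solver."""
    best, best_score = (), 0.0
    for r in range(1, len(sentences) + 1):
        for subset in combinations(range(len(sentences)), r):
            length = sum(len(sentences[i].split()) for i in subset)
            if length > budget:
                continue
            covered = set().union(*(concepts(sentences[i]) for i in subset))
            # Each covered concept counts once, however often it occurs.
            score = sum(weights.get(c, 0.0) for c in covered)
            if score > best_score:
                best, best_score = subset, score
    return list(best), best_score


def apply_feedback(weights, accepted, rejected):
    """Assumed update rule: accepted concepts get the maximum weight,
    rejected concepts get zero. Returns a new weight dictionary."""
    top = max(weights.values())
    updated = dict(weights)
    for c in accepted:
        updated[c] = top
    for c in rejected:
        updated[c] = 0.0
    return updated


sentences = [
    "the storm hit the coast",
    "the storm hit the city",
    "schools were closed today",
]
weights = initial_weights(sentences)
first, _ = best_summary(sentences, weights, budget=10)

# Simulated user: wants the city detail, does not care about schools.
weights = apply_feedback(
    weights,
    accepted={("the", "city")},
    rejected=concepts(sentences[2]),
)
second, _ = best_summary(sentences, weights, budget=10)
print(first, second)
```

In this toy run the feedback shifts the selection away from the rejected sentence in the second iteration, which is the effect the simulation-based experiments are described as measuring at scale.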
Live blogs are an increasingly popular news format for covering breaking news and live events in online journalism. Online news websites around the world use this medium to give their readers minute-by-minute updates on an event. Good summaries enhance the value of live blogs for a reader, but are often not available. In this article, we (a) define the task of summarizing a live blog, (b) study ways of automatically collecting corpora for live blog summarization, and (c) assess the complexity of the task by empirically evaluating well-known state-of-the-art unsupervised and supervised summarization systems on our new corpus. We show that live blog summarization poses new challenges in the field of news summarization, since frequency and positional signals cannot be used. We make our tools for reconstructing the corpus and conducting our empirical experiments publicly available, so that the research community can replicate and build upon our results.
There exists an ever-growing set of data-centric systems that allow data scientists of varying skill levels to interactively manipulate, analyze and explore large structured data sets. However, there are currently not many systems that allow data scientists and novice users to interactively explore large unstructured text document collections from heterogeneous sources. In this demo paper, we present a new system for interactive text summarization called Sherlock. Automatically producing textual summaries is an important step in understanding a collection of multiple topic-related documents, with real-world applications in journalism, medicine, and other fields. However, none of the existing summarization systems allow users to provide feedback at interactive speed. We therefore integrate a new approximate summarization model into Sherlock that can guarantee interactive speeds even for large text collections to keep the user engaged in the process.