We consider the problem of questionfocused sentence retrieval from complex news articles describing multi-event stories published over time. Annotators generated a list of questions central to understanding each story in our corpus. Because of the dynamic nature of the stories, many questions are time-sensitive (e.g. "How many victims have been found?") Judges found sentences providing an answer to each question. To address the sentence retrieval problem, we apply a stochastic, graph-based method for comparing the relative importance of the textual units, which was previously used successfully for generic summarization. Currently, we present a topic-sensitive version of our method and hypothesize that it can outperform a competitive baseline, which compares the similarity of each sentence to the input question via IDF-weighted word overlap. In our experiments, the method achieves a TRDR score that is significantly higher than that of the baseline.
As Twitter becomes a more common means for officials to communicate with their constituents, it becomes more important that we understand how officials use these communication tools. Using data from 380 members of Congress' Twitter activity during the winter of 2012, we find that officials frequently use Twitter to advertise their political positions and to provide information but rarely to request political action from their constituents or to recognize the good work of others. We highlight a number of differences in communication frequency between men and women, Senators and Representatives, Republicans and Democrats. We provide groundwork for future research examining the behavior of public officials online and testing the predictive power of officials' social media behavior.
Extractive summaries produced from multiple source documents suffer from an array of problems with respect to text cohesion. In this preliminary study, we seek to understand what problems occur in such summaries and how often. We present an analysis of a small corpus of manually revised summaries and discuss the feasibility of making such repairs automatically. Additionally, we present a taxonomy of the problems that occur in the corpus, as well as the operators which, when applied to the summaries, can address these concerns. This study represents a first step toward identifying and automating revision operators that could work with current summarization systems in order to repair cohesion problems in multidocument summaries.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.