The amount of scholarly data has been increasing dramatically over the last years. For newcomers to a particular science domain (e.g., IR, physics, NLP) it is often difficult to spot larger trends and to position the latest research in the context of prior scientific achievements and breakthroughs. Similarly, researchers in the history of science are interested in tools that allow them to analyze and visualize changes in particular scientific domains. Temporal summarization and related methods should be then useful for making sense of large volumes of scientific discourse data aggregated over time. We demonstrate a novel approach to analyze the collections of research papers published over longer time periods to provide a high level overview of important semantic changes that occurred over the progress of time. Our approach is based on comparing word semantic representations over time and aims to support users in better understanding of large domain-focused archives of scholarly publications. As an example dataset we use the ACL Anthology Reference Corpus that spans from 1979 to 2015 and contains 22,878 scholarly articles.
The number of scholarly publications has dramatically increased over the last decades. For anyone new to a particular science domain it is not easy to understand the major trends and significant changes that the domain has undergone over time. Temporal summarization and related approaches should be then useful to make sense of scholarly temporal collections. In this paper we demonstrate an approach to analyze the dataset of research papers by providing a high level overview of important changes that occurred over time in this dataset. The novelty of our approach lies in the adaptation of methods used for semantic term evolution analysis. However, we analyze not just semantic evolution of single words independently, but we estimate common semantic drifts shared by groups of semantically converging words. As an example dataset we use the ACL Anthology Reference Corpus that spans from 1974 to 2015 and contains 22,878 scholarly articles.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.