Text streams demand an effective, interactive, on-the-fly method for exploring massive, dynamic data sets while extracting valuable information for visual analysis. In this paper, we propose an interactive visualization system that enables users to explore streaming text documents without prior knowledge of the data. The system continuously incorporates incoming documents from a continuous source into the existing visualization context, which is achieved "physically" by minimizing a potential energy defined from the similarities between documents. Unlike most existing methods, our system uses dynamic keyword vectors to incorporate keywords newly introduced by the data stream. Furthermore, we propose a special keyword importance that allows users to adjust the similarity on the fly and thus achieve their preferred visual effects in accordance with their varying interests, which also helps to identify hot spots and outliers. We optimize system performance through a similarity grid and a parallel implementation on graphics hardware (GPU), which achieves instantaneous animated visualization even for very large data collections. Moreover, our system implements a powerful user interface that supports various user interactions for in-depth data analysis. Experiments and case studies are presented to illustrate our dynamic system for text stream exploration.
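To make the core idea concrete, the following is a minimal sketch, not the system's actual implementation. It assumes a sparse term-count representation for the dynamic keyword vectors, a weighted cosine measure for the importance-adjusted similarity, and a simple stress-style potential in which the target distance between two documents is one minus their similarity. The abstract does not specify any of these forms, so all function names (`keyword_vector`, `similarity`, `similarity_matrix`, `energy`, `relax`) and the energy definition are illustrative assumptions. The similarity grid and GPU parallelization are omitted.

```python
import math
from collections import Counter


def keyword_vector(text):
    """Sparse keyword vector (term -> count); unseen keywords just add keys."""
    return Counter(text.lower().split())


def similarity(a, b, importance=None):
    """Cosine similarity under per-keyword importance weights (default 1.0)."""
    importance = importance or {}

    def w(k):
        return importance.get(k, 1.0)

    dot = sum(w(k) * a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(w(k) * a[k] ** 2 for k in a))
    norm_b = math.sqrt(sum(w(k) * b[k] ** 2 for k in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(vectors, importance=None):
    """Pairwise similarities for a list of keyword vectors."""
    return [[similarity(a, b, importance) for b in vectors] for a in vectors]


def energy(pos, sim):
    """Assumed potential: sum over pairs of (distance - (1 - similarity))^2."""
    total = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])
            total += (d - (1.0 - sim[i][j])) ** 2
    return total


def relax(pos, sim, step=0.05, iters=200):
    """Plain gradient descent on `energy`; returns new 2D positions."""
    pos = [list(p) for p in pos]
    n = len(pos)
    for _ in range(iters):
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                g = 2.0 * (d - (1.0 - sim[i][j])) / d
                grad[i][0] += g * dx
                grad[i][1] += g * dy
                grad[j][0] -= g * dx
                grad[j][1] -= g * dy
        for i in range(n):
            pos[i][0] -= step * grad[i][0]
            pos[i][1] -= step * grad[i][1]
    return pos
```

Under these assumptions, the documents "gpu stream text visualization" and "gpu stream text layout" have similarity 0.75 with uniform weights. Raising the importance of "visualization" and "layout" to 4.0 lowers it to 3/7, which pushes the two documents apart once `relax` is rerun. A newly arrived document needs no fixed vocabulary: its `Counter` simply carries whatever keywords it contains.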