Topic models have been applied to everything from books to newspapers to social media posts in an effort to identify the most prevalent themes of a text corpus. We provide an in-depth analysis of unsupervised topic models from their inception to today. We trace the origins of different types of contemporary topic models, beginning in the 1990s, and compare the algorithms they propose as well as their evaluation approaches. Throughout, we also describe settings in which topic models have worked well and areas where new research is needed, setting the stage for the next generation of topic models.
When U.S. presidential candidates misrepresent the facts, their claims get discussed across media streams, creating a lasting public impression. We show this using a prominent public performance: the 2020 presidential debates. For every five newspaper articles related to the presidential candidates, President Donald J. Trump and Joseph R. Biden Jr., there was one mention of a misinformation-related topic advanced during the debates. Personal attacks on Biden and election integrity were the most prevalent topics across social media, newspapers, and TV. These two topics also surfaced regularly in voters’ recollections of the candidates, suggesting that the impression they created lasted through the presidential election.