Theory-driven text analysis has made extensive use of psychological concept dictionaries, leading to a wide range of important results. These dictionaries have generally been applied through word count methods which have proven to be both simple and effective. In this paper, we introduce Distributed Dictionary Representations (DDR), a method that applies psychological dictionaries using semantic similarity rather than word counts. This allows for the measurement of the similarity between dictionaries and spans of text ranging from complete documents to individual words. We show how DDR enables dictionary authors to place greater emphasis on construct validity without sacrificing linguistic coverage. We further demonstrate the benefits of DDR on two real-world tasks and finally conduct an extensive study of the interaction between dictionary size and task performance. These studies allow us to examine how DDR and word count methods complement one another as tools for applying concept dictionaries and where each is best applied. Finally, we provide references to tools and resources to make this method both available and accessible to a broad psychological audience.
Keywords: Methodological innovation · Text analysis · Semantic representation · Dictionary-based text analysis
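The idea of scoring text against a dictionary by semantic similarity rather than word counts can be illustrated with a minimal sketch. It assumes that a dictionary and a span of text are each represented by the average of their word vectors and compared by cosine similarity; the abstract does not spell out these details. The toy vectors, the `care_dictionary` word list, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
TOY_VECTORS = {
    "kind": [0.9, 0.1, 0.0],
    "caring": [0.8, 0.2, 0.1],
    "gentle": [0.85, 0.15, 0.05],
    "cruel": [-0.7, 0.3, 0.1],
    "table": [0.0, 0.1, 0.9],
}


def mean_vector(words, vectors):
    """Average the vectors of the words that have an embedding."""
    found = [vectors[w] for w in words if w in vectors]
    if not found:
        return None  # no word in the span is covered by the embeddings
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ddr_similarity(dictionary_words, text, vectors):
    """Similarity between a concept dictionary and a span of text."""
    dict_vec = mean_vector(dictionary_words, vectors)
    text_vec = mean_vector(text.lower().split(), vectors)
    if dict_vec is None or text_vec is None:
        return None
    return cosine(dict_vec, text_vec)


# A two-word dictionary still scores a related word it does not contain.
care_dictionary = ["kind", "caring"]
score_gentle = ddr_similarity(care_dictionary, "gentle", TOY_VECTORS)
score_table = ddr_similarity(care_dictionary, "table", TOY_VECTORS)
```

In this toy setup, "gentle" is not in the dictionary and would receive a word count of zero, yet it scores higher than "table" because its vector lies close to the dictionary's average vector. With real data, the toy vectors would be replaced by pretrained word embeddings.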
Does sharing moral values encourage people to connect and form communities? The importance of moral homophily (love of same) has been recognized by social scientists, but the types of moral similarities that drive this phenomenon are still unknown. Using both large-scale, observational social-media analyses and behavioral lab experiments, the authors investigated which types of moral similarities influence tie formations. Analysis of a corpus of over 700,000 tweets revealed that the distance between 2 people in a social network can be predicted based on differences in the moral purity content (but not other moral content) of their messages. The authors replicated this finding by experimentally manipulating perceived moral difference (Study 2) and similarity (Study 3) in the lab and demonstrating that purity differences play a significant role in social distancing. These results indicate that social network processes reflect moral selection, and both online and offline differences in moral purity concerns are particularly predictive of social distance. This research is an attempt to study morality indirectly using an observational big-data study complemented with 2 confirmatory behavioral experiments carried out using traditional social-psychology methodology.
The syntax and semantics of human language can illuminate many individual psychological differences and important dimensions of social interaction. Accordingly, psychological and psycholinguistic research has begun incorporating sophisticated representations of semantic content to better understand the connection between word choice and psychological processes. In this work we introduce ConversAtion level Syntax SImilarity Metric (CASSIM), a novel method for calculating conversation-level syntax similarity. CASSIM estimates the syntax similarity between conversations by automatically generating syntactical representations of the sentences in conversation, estimating the structural differences between them, and calculating an optimized estimate of the conversation-level syntax similarity. After introducing and explaining this method, we report results from two method validation experiments (Study 1) and conduct a series of analyses with CASSIM to investigate syntax accommodation in social media discourse (Study 2). We run the same experiments using two well-known existing syntactic metrics, LSM and Coh-Metrix, and compare their results to CASSIM. Overall, our results indicate that CASSIM is able to reliably measure syntax similarity and to provide robust evidence of syntax accommodation within social media discourse.
Recent interest in distributed vector representations for words has resulted in an increased diversity of approaches, each with strengths and weaknesses. We demonstrate how diverse vector representations may be inexpensively composed into hybrid representations, effectively leveraging strengths of individual components, as evidenced by substantial improvements on a standard word analogy task. We further compare these results over different sizes of training sets and find these advantages are more pronounced when training data is limited. Finally, we explore the relative impacts of the differences in the learning methods themselves and the size of the contexts they access.
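One inexpensive way to compose vector representations from different models is to concatenate them per word. The sketch below assumes concatenation of L2-normalized vectors, restricted to the vocabulary the two models share; the abstract does not state which composition operator was used, so this is an illustrative assumption rather than the authors' procedure. The models, values, and function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def compose_hybrid(model_a, model_b):
    """Concatenate normalized vectors for words present in both models.

    Normalizing first keeps one model from dominating similarity scores
    simply because its vectors have larger magnitudes.
    """
    shared = model_a.keys() & model_b.keys()
    return {
        word: l2_normalize(model_a[word]) + l2_normalize(model_b[word])
        for word in shared
    }


# Two toy "models" with different dimensionality (hypothetical values).
model_a = {"king": [3.0, 4.0], "queen": [1.0, 0.0]}
model_b = {"king": [0.0, 2.0, 0.0], "apple": [1.0, 1.0, 1.0]}

hybrid = compose_hybrid(model_a, model_b)
```

Here `hybrid` contains only "king", with a 5-dimensional vector built from the 2-dimensional and 3-dimensional inputs. No retraining is involved, which is what makes this kind of composition cheap.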
As human activity and interaction increasingly take place online, the digital residues of these activities provide a valuable window into a range of psychological and social processes. A great deal of progress has been made toward utilizing these opportunities; however, the complexity of managing and analyzing the quantities of data currently available has limited both the types of analysis used and the number of researchers able to make use of these data. Although fields such as computer science have developed a range of techniques and methods for handling these difficulties, making use of those tools has often required specialized knowledge and programming experience. The Text Analysis, Crawling, and Interpretation Tool (TACIT) is designed to bridge this gap by providing an intuitive tool and interface for making use of state-of-the-art methods in text analysis and large-scale data management. Furthermore, TACIT is implemented as an open, extensible, plugin-driven architecture, which will allow other researchers to extend and expand these capabilities as new methods become available.