In this summary, we discuss our approach to the CLPsych Shared Task and its initial results. For our predictions in each task, we used a recursive partitioning algorithm (decision trees) to select from our set of features, which consisted primarily of dictionary scores and counts of individual words. We focused on Task A, which aimed to predict suicide risk, as rated by a team of expert clinicians (Shing et al., 2018), based on the language used in SuicideWatch posts on Reddit. Category-level findings highlight the potential importance of social and moral language categories. Word-level correlates of risk levels underline the value of fine-grained, data-driven approaches, revealing both theory-consistent and potentially novel correlates of suicide risk that may motivate future research.
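The modeling approach described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' shared-task code: the feature matrix, labels, and hyperparameters are hypothetical stand-ins for dictionary scores, word counts, and clinician-rated risk levels.

```python
# Minimal sketch of the described approach: a decision tree (recursive
# partitioning) trained on dictionary scores and word counts.
# All data below are simulated placeholders, not the shared-task data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical design matrix: rows are users, columns are dictionary scores
# (e.g., LIWC-style category proportions) and counts of individual words.
X = rng.random((200, 50))
# Hypothetical risk labels on a four-level ordinal scale, coded 0-3.
y = rng.integers(0, 4, size=200)

# Recursive partitioning implicitly selects features: only variables used
# in splits influence the prediction.
tree = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10, random_state=0)
print(cross_val_score(tree, X, y, cv=5).mean())
```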
People tend to like stimuli that are prototypical, and thus easily processed, ranging from human faces to text. However, recent research has suggested that less typical stimuli may be preferred in creative contexts, such as fine art or music lyrics. In an archival sample of movie scripts, we tested whether genre-typicality predicted film ratings as a function of rater role (novice audience member or expert film critic). Genre-typicality was operationalized as the profile correlation between each script's linguistic arcs (across five segments, or acts) and within-genre averages. We predicted (1) that critics would prefer more disfluent (genre-atypical) films and general audiences would prefer fluent (genre-typical) films, and (2) that these differences would be most pronounced for genres expected to be more entertaining (e.g., action/adventure) than challenging (e.g., tragedy). Partly consistent with our hypotheses, the results showed that critics gave higher ratings to action/adventure films with less typical positive emotion arcs. However, regardless of audience-member or professional-critic status, higher ratings were given to films that were more genre-atypical (or disfluent) in terms of analytic thinking, narrative action, and emotional tone, across all genres except family/kids films. Such findings support the growing literature on the appeal of disfluency in the arts and have relevance for researchers in psychology and computer science who are interested in computational linguistic approaches to attitudes, film, and literature.
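The typicality measure lends itself to a short sketch. The code below is an illustration under assumed data, not the study's pipeline: the act-level scores, genre labels, and column names are simulated placeholders, and the profile correlation is simply a Pearson correlation between a script's arc and its genre's average arc.

```python
# Illustrative sketch of a genre-typicality score: correlate each script's
# linguistic arc across five acts with the average arc of its genre.
# All data here are simulated, not the archival movie-script sample.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical data: one positive-emotion score per act (5 acts) per script.
acts = [f"act_{i}" for i in range(1, 6)]
scripts = pd.DataFrame(rng.random((100, 5)), columns=acts)
scripts["genre"] = rng.choice(["action", "drama", "family"], size=100)

# Average arc within each genre serves as the reference profile.
genre_mean_arcs = scripts.groupby("genre")[acts].mean()

# Profile correlation: Pearson r between a script's arc and its genre's
# average arc; higher values indicate more genre-typical (fluent) scripts.
scripts["typicality"] = scripts.apply(
    lambda row: np.corrcoef(
        row[acts].astype(float), genre_mean_arcs.loc[row["genre"]]
    )[0, 1],
    axis=1,
)
print(scripts[["genre", "typicality"]].head())
```

In practice one might compute each genre's reference arc leaving out the focal script, so that a script does not contribute to the profile it is compared against.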
Depression is characterized by a self-focused negative attentional bias, which is often reflected in everyday language use. In a prospective writing study, we explored whether the association between depressive symptoms and negative, self-focused language varies across social contexts. College students (N = 243) wrote about a recent interaction with a person they care deeply about. Depression symptoms positively correlated with negative emotion words and first-person singular pronouns (i.e., negative self-focus) when participants wrote about a recent interaction with romantic partners or, to a lesser extent, friends, but not family members. This pattern was more pronounced when participants perceived greater self-other overlap (i.e., interpersonal closeness) with their romantic partner. Findings regarding how the linguistic profile of depression differs by type of relationship may inform more effective methods of clinical diagnosis and treatment.
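As a rough illustration of the kind of analysis involved, the sketch below correlates simulated symptom scores with simulated word-category rates within each relationship context. All variable names and values are hypothetical; this is not the study's analysis code.

```python
# Minimal sketch: within-context correlations between depressive-symptom
# scores and LIWC-style language rates. Data are simulated placeholders.
import pandas as pd
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 243  # sample size reported in the abstract

df = pd.DataFrame({
    "symptoms": rng.normal(15, 8, n),           # hypothetical symptom scores
    "i_rate": rng.normal(0.08, 0.02, n),        # first-person singular rate
    "negemo_rate": rng.normal(0.02, 0.01, n),   # negative emotion word rate
    "context": rng.choice(["partner", "friend", "family"], n),
})

# The study reports these associations are strongest for romantic partners,
# weaker for friends, and absent for family members.
for context, grp in df.groupby("context"):
    r_i, _ = pearsonr(grp["symptoms"], grp["i_rate"])
    r_neg, _ = pearsonr(grp["symptoms"], grp["negemo_rate"])
    print(f"{context}: r(symptoms, i)={r_i:.2f}, r(symptoms, negemo)={r_neg:.2f}")
```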
Practitioners in many domains (e.g., clinical psychologists, college instructors, researchers) collect written responses from clients. A well-developed method that has been applied to texts from sources like these is the computer application Linguistic Inquiry and Word Count (LIWC). LIWC uses the words in texts as cues to a person’s thought processes, emotional states, intentions, and motivations. In the present study, we adopt analytic principles from LIWC to develop and test an alternative text-analysis method based on naïve Bayes classification. We further show how output from the naïve Bayes analysis can be used to mark up student work in order to provide immediate, constructive feedback to students and instructors.
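A minimal sketch of a naïve Bayes text analysis of this kind is shown below, assuming a small set of hypothetical student responses and instructor labels; it is not the authors' implementation. Per-class word likelihoods from the fitted model could, in principle, drive the kind of mark-up described.

```python
# Sketch of a naive Bayes text analysis: word counts as features, with
# per-class word likelihoods inspected to suggest feedback.
# Texts and labels below are invented examples.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = [
    "the experiment shows the hypothesis was supported",
    "i think it was fine i guess",
    "results indicate a significant interaction effect",
    "it was ok nothing to add",
]
labels = ["strong", "weak", "strong", "weak"]  # hypothetical instructor labels

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

model = MultinomialNB()
model.fit(X, labels)

# Words whose log-likelihood most favors the "strong" class could be
# highlighted in a student's draft as immediate, constructive feedback.
vocab = np.array(vectorizer.get_feature_names_out())
strong_idx = list(model.classes_).index("strong")
weights = model.feature_log_prob_[strong_idx] - model.feature_log_prob_[1 - strong_idx]
print(vocab[np.argsort(weights)[-5:]])
```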