Psychotherapy represents a broad class of medical interventions received
by millions of patients each year. Unlike most medical treatments, its primary
mechanisms are linguistic; i.e., the treatment relies directly on a conversation
between a patient and provider. However, the evaluation of patient-provider
conversation suffers from critical shortcomings, including intensive labor
requirements, coder error, non-standardized coding systems, and an inability
to scale to larger data sets. To overcome these shortcomings, psychotherapy
analysis needs a reliable and scalable method for summarizing the content of
treatment encounters. We used a publicly available psychotherapy corpus from
Alexander Street Press, comprising a large collection of transcripts of
patient-provider conversations to compare coding performance for two machine
learning methods. We used the Labeled Latent Dirichlet Allocation (L-LDA) model
to learn associations between text and codes, to predict codes in psychotherapy
sessions, and to localize specific passages of within-session text
representative of a session code. We compared the L-LDA model to a baseline
lasso-regularized logistic regression model on predictive accuracy and model
generalizability, measured by the area under the receiver operating
characteristic (ROC) curve (AUC). The L-LDA model outperformed the logistic
regression model at predicting session-level codes, with average AUC
scores of .79 and .70, respectively. For fine-grained coding, L-LDA and
logistic regression are able to identify specific talk-turns representative of
symptom codes. However, talk-turn identification is not yet as reliable as
human coding. We conclude that the L-LDA model has the potential to be an
objective, scalable method for accurate automated coding of
psychotherapy sessions that performs better than comparable discriminative
methods at session-level coding and can also predict fine-grained codes.
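As a minimal illustration of the evaluation metric used above, the sketch below computes AUC directly from its rank-based (Mann-Whitney) definition: the probability that a randomly chosen positive session receives a higher predicted score than a randomly chosen negative one. The function name, labels, and scores are hypothetical examples, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: iterable of 0/1 session-level codes (hypothetical example data);
    scores: a model's predicted probabilities for the same sessions.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    # Fraction of positive/negative pairs ranked correctly; ties count 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Example: four sessions, two of which carry the code of interest.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

An AUC of .5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported .79 and .70 figures should be read.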