Depression, a major contributor to the global burden of disease, is an outcome of interest in clinical trials. Researchers and clinicians note that depression often presents differently across cultures, posing challenges in the accurate measurement of depressive symptoms across populations. A commonly used self-administered screening tool to measure depressive symptoms, the Center for Epidemiologic Studies Scale-Depression (CES-D), has been translated into dozens of languages and used in thousands of studies, yet gaps remain in our understanding of its factor structure and invariance across studies and over time in the context of interventions. In this secondary analysis, we sampled six recent trials from lower- and middle-income countries to (a) establish the factor structure of the CES-D, (b) assess measurement invariance of the CES-D across treatment versus control arms and over time, (c) examine cross-study invariance, and (d) identify items that may be driving potential noninvariance. We performed exploratory/confirmatory factor analysis to establish the factor structure of the CES-D within each trial and used multiple-group confirmatory factor analysis to assess within-study cross-arm/cross-time and cross-study invariance. After removal of positive affect items, a unidimensional model performed equivalently over time and across arms within trials, but exhibited noninvariance across trials, supporting prior literature describing differences in factor structure of the scale across populations. While our findings suggest that the CES-D without positive affect items is a valid measure of depressive symptoms within trials in our sample, caution is warranted in interpreting the findings of meta-analyses and multisite/multicountry studies using the CES-D as an outcome measure.
Public Significance Statement
Depression is a global public health problem, yet questions remain about how to measure depressive symptoms accurately across cultures and over time. The CES-D, a screening tool for depressive symptoms, appears to be a valid way to measure changes in depressive symptoms over the course of trials. However, its use may pose challenges when comparing depressive symptoms across diverse populations.