Objective
We set out to build a model to predict which individuals would develop bipolar disorder (BD), using machine learning techniques in a large birth cohort.
Methods
A total of 3748 subjects were studied at birth, 11, 15, 18, and 22 years of age in a community birth cohort. We used the elastic net algorithm with 10‐fold cross‐validation to predict, from the data collected at each follow‐up visit before diagnosis (from birth up to 18 years), which individuals would develop BD at endpoint (22 years). Afterward, we used the best‐performing model to identify the subgroups of subjects at higher and lower risk of developing BD and analyzed the clinical differences between them.
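As a rough illustration of the two ingredients named above, the sketch below writes out an elastic net penalized logistic loss and a plain 10-fold split in standard-library Python. It is not the authors' pipeline: the abstract does not state the software, the hyperparameters, or whether folds were stratified, so the glmnet-style parameterization (`lam`, `alpha`), the function names, and the random seed are all assumptions made for illustration.

```python
import math
import random


def elastic_net_penalty(coefs, lam, alpha):
    """glmnet-style penalty: lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    alpha = 1 gives the lasso, alpha = 0 gives ridge (assumed parameterization).
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def penalized_log_loss(X, y, intercept, coefs, lam, alpha):
    """Mean logistic loss plus the elastic net penalty (intercept unpenalized)."""
    total = 0.0
    for row, label in zip(X, y):
        z = intercept + sum(b * x for b, x in zip(coefs, row))
        # log(1 + exp(z)) - label * z, written in a numerically stable form
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - label * z
    return total / len(y) + elastic_net_penalty(coefs, lam, alpha)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and deal it into k held-out folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# 3748 is the cohort size reported above; each fold is held out once for
# evaluation while the model is fitted on the other nine.
folds = k_fold_indices(3748, k=10)
```

In practice the penalized loss would be minimized by a dedicated solver, and `lam` and `alpha` would be chosen by comparing held-out performance across the folds.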
Results
A total of 107 (2.8%) individuals within the cohort presented with BD type I, 26 (0.6%) with BD type II, and 87 (2.3%) with BD not otherwise specified. The proportion of female individuals was 58.82% (n = 150) in the BD sample and 53.02% (n = 1868) among the unaffected population. The model with variables assessed at the 18‐year follow‐up visit achieved the best performance: AUC 0.82 (CI 0.75–0.88), balanced accuracy 0.75, sensitivity 0.72, and specificity 0.77. The most important variables for detecting BD at the 18‐year follow‐up visit were suicide risk, generalized anxiety disorder, parental physical abuse, and financial problems. Additionally, the subgroup at high risk of BD showed a high frequency of drug use and depressive symptoms.
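Balanced accuracy is the mean of sensitivity and specificity, so the three figures reported above can be checked against each other. The short sketch below does that arithmetic. The function names are illustrative, and because the reported rates are already rounded to two decimals, only approximate agreement should be expected.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity and specificity from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Rates reported for the 18-year model: sensitivity 0.72, specificity 0.77.
bacc = balanced_accuracy(0.72, 0.77)
print(round(bacc, 3))
```

The result, about 0.745, is consistent with the reported balanced accuracy of 0.75. Unlike raw accuracy, this metric is not dominated by the unaffected majority when the outcome is rare.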
Conclusions
We developed a risk calculator for BD incorporating both demographic and clinical variables from a 22‐year birth cohort. Our findings support previous studies in high‐risk samples showing the significance of suicide risk and generalized anxiety disorder prior to the onset of BD, and highlight the role of social factors and adverse life events.
Background
Depression is highly prevalent and marked by a chronic and recurrent course. Despite being a major cause of disability worldwide, little is known regarding the determinants of its heterogeneous course. Machine learning techniques present an opportunity to develop tools to predict diagnosis and prognosis at an individual level.
Methods
We examined baseline (2008–2010) and follow-up (2012–2014) data of the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil), a large occupational cohort study. We implemented an elastic net regularization analysis with a 10-fold cross-validation procedure using socioeconomic and clinical factors as predictors to distinguish at follow-up: (1) depressed from non-depressed participants, (2) participants with incident depression from those who did not develop depression, and (3) participants with chronic (persistent or recurrent) depression from those without depression.
Results
We assessed 15 105 and 13 922 participants at waves 1 and 2, respectively. The elastic net regularization model distinguished outcome levels in the test dataset with an area under the curve of 0.79 (95% CI 0.76–0.82), 0.71 (95% CI 0.66–0.77), and 0.90 (95% CI 0.86–0.95) for analyses 1, 2, and 3, respectively.
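The area under the ROC curve can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are toy values invented for illustration, not study data, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Ties in score count as half a correct pair. Quadratic in the sample
    size, which is fine for a sketch but not for large cohorts.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case and one non-case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example (hypothetical values): 3 of the 4 case/non-case pairs are
# ordered correctly, so the AUC is 0.75.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

On this reading, an AUC of 0.90 for analysis 3 means the model ranks a participant with chronic depression above a participant without depression in about nine out of ten such pairs.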
Conclusions
Diagnosis and prognosis related to depression can be predicted at an individual subject level by integrating low-cost variables, such as demographic and clinical data. Future studies should assess longer follow-up periods and combine biological predictors, such as genetics and blood biomarkers, to build more accurate tools to predict depression course.
Bipolar disorder (BD) is one of the most disabling diseases and is characterized by severe mood fluctuations. It is accompanied by cognitive and functional impairment in addition to high suicide rates. BD is often underdiagnosed and treated incorrectly because many of the reported symptoms are not exclusive to the disorder. Because the diagnosis is exclusively clinical, it cannot be established with precision. For this reason, proteomic approaches have been used to identify, on a large scale, the proteins involved in cellular or tissue processes. This review aggregates data from blood proteomes of subjects with BD and healthy controls, using protein association networks, to suggest dysfunctional molecular pathways involved in the disease. Original articles containing proteomic analyses were searched in PubMed. Seven studies were selected, and data were extracted for subsequent analysis. A protein-protein interaction network was created with the STRING database. A final set of proteins from this network was used as input to ClueGO, and the main biological process was visualized using the R package pathview. The analysis revealed proteins associated with many biological processes, including growth and endocrine regulation, iron transport, protease inhibition, protection against pathogens, and cholesterol transport. Moreover, pathway analysis indicated that the identified proteins are associated with two main metabolic pathways: the complement system and the coagulation cascade. Thus, a better understanding of the pathophysiology of psychiatric disorders and the identification of potential biomarker candidates are essential to improve diagnosis and prognosis and to design pharmacological strategies.