Objective
Limited research has evaluated the utility of machine learning (ML) models and longitudinal data from electronic health records (EHR) to forecast mental health outcomes following a traumatic brain injury (TBI). The objective of this study is to assess various data science and machine learning techniques and determine their efficacy in forecasting mental health (MH) conditions among active duty Service Members (SMs) following a first diagnosis of mild traumatic brain injury (mTBI).

Materials and Methods
Patient demographics and encounter metadata were obtained for 35,451 active duty SMs who had sustained an initial mTBI, as documented within the EHR. All encounter records from the year before and the year after the index mTBI date were collected. Patient demographics, ICD-9-CM and ICD-10 codes, enhanced diagnostic related groups, and other risk factors estimated from the year prior to the index mTBI were used to develop a feature vector representative of each patient. To embed temporal information into the feature vector, various window configurations were devised. Finally, the presence or absence of mental health conditions after the mTBI index date was used as the outcome variable for the models.

Results
Among the machine learning models evaluated, neural network techniques showed the best overall performance in identifying patients with new or persistent mental health conditions post mTBI. Various window configurations were tested, and the results show that dividing the observation window into three distinct date windows [−365:−30, −30:0, 0:14] provided the best performance. Overall, the models described in this paper identified the likelihood of developing MH conditions at [14:90] days post-mTBI with an accuracy of 88.2%, an AUC of 0.82, and an AUC-PR of 0.66.

Discussion
Through the development and evaluation of different machine learning models, we have validated the feasibility of designing algorithms to forecast the likelihood of developing mental health conditions after the first mTBI.
Patient attributes, including demographics, symptomatology, and other known risk factors, proved to be effective features for training ML models for mTBI patients. When patient attributes and features were estimated over different time windows, overall performance increased, illustrating the importance of embedding temporal information into the models. The addition of temporal information not only improved model performance but also increased interpretability and clinical utility.

Conclusion
Predictive analytics can be a valuable tool for understanding the effects of mTBI, particularly when identifying those individuals at risk of negative outcomes. Translating these models from retrospective study into real-world validation is imperative for mitigating negative outcomes through appropriate and timely interventions.
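
As an illustration of the windowing idea described in the Materials and Methods and Results, the following minimal Python sketch shows one way diagnosis codes could be counted per observation window and concatenated into a single feature vector. The window boundaries [−365:−30, −30:0, 0:14] come from the abstract. Everything else is an assumption made for illustration: the function name `windowed_code_counts`, the use of simple code counts as features, the half-open window boundaries, the two example codes, and the toy encounter dates. It is not the study's actual pipeline.

```python
from collections import Counter
from datetime import date

# Observation windows in days relative to the index mTBI date, as reported
# in the Results: [-365:-30, -30:0, 0:14]. Treating each window as
# half-open [start, end) is an assumption of this sketch.
WINDOWS = [(-365, -30), (-30, 0), (0, 14)]


def windowed_code_counts(encounters, index_date, vocabulary, windows=WINDOWS):
    """Count each code in each window; concatenate the counts window by window.

    encounters: iterable of (encounter_date, diagnosis_code) tuples
    vocabulary: ordered list of codes to count (hypothetical, for illustration)
    """
    counts = [Counter() for _ in windows]
    for encounter_date, code in encounters:
        offset = (encounter_date - index_date).days
        for i, (start, end) in enumerate(windows):
            if start <= offset < end:
                counts[i][code] += 1
                break  # windows do not overlap, so stop at the first match
    return [counts[i][code] for i in range(len(windows)) for code in vocabulary]


# Toy, made-up encounter history for a single hypothetical patient.
index_date = date(2015, 6, 1)
encounters = [
    (date(2014, 9, 10), "F43.10"),   # 264 days before index: first window
    (date(2015, 5, 20), "G44.309"),  # 12 days before index: second window
    (date(2015, 6, 3), "G44.309"),   # 2 days after index: third window
    (date(2015, 6, 5), "F43.10"),    # 4 days after index: third window
    (date(2015, 8, 1), "F43.10"),    # 61 days after index: outside all windows
]
vocabulary = ["F43.10", "G44.309"]

features = windowed_code_counts(encounters, index_date, vocabulary)
print(features)  # -> [1, 0, 0, 1, 1, 1]
```

Each block of `len(vocabulary)` entries in the output corresponds to one window, so the same code contributes a separate feature for each period. This is one simple way to let a model distinguish, for example, a long-standing condition from one first recorded shortly after the injury.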