Background: Image-based fracture prediction models using deep learning are lacking. We therefore aimed to develop an X-ray-based fracture prediction model using deep learning with longitudinal data. Methods: This study included 1,595 participants aged 50 to 75 years who had no fracture at baseline and underwent at least two lumbosacral radiographs from 2010 to 2015 at Seoul National University Hospital. Positive and negative cases were defined according to whether vertebral fractures developed during follow-up. The cases were divided into training (n=1,416) and test (n=179) sets. A convolutional neural network (CNN)-based prediction algorithm, DeepSurv, was trained with images and baseline clinical information (age, sex, body mass index, glucocorticoid use, and secondary osteoporosis). The concordance index (C-index) was used to compare the performance of DeepSurv with that of the Fracture Risk Assessment Tool (FRAX) and Cox proportional hazards (CoxPH) models. Results: Of the total participants, 1,188 (74.4%) were women, and the mean age was 60.5 years. During a mean follow-up period of 40.7 months, vertebral fractures occurred in 7.5% (120/1,595) of participants. In the test set, DeepSurv trained with images and clinical features achieved a higher C-index than FRAX and CoxPH (DeepSurv, 0.612; 95% confidence interval [CI], 0.571 to 0.653; FRAX, 0.547; CoxPH, 0.594; 95% CI, 0.552 to 0.555). Notably, in women, DeepSurv without clinical features had a higher C-index (0.614; 95% CI, 0.572 to 0.656) than FRAX. Conclusion: DeepSurv, a CNN-based prediction algorithm using baseline image and clinical information, outperformed the FRAX and CoxPH models in predicting osteoporotic fracture from spine radiographs in a longitudinal cohort.
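For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' evaluation code. The function name `harrell_c_index` and the toy follow-up data are hypothetical, and the sketch assumes that a higher risk score means an earlier expected fracture. For simplicity it skips pairs with tied follow-up times.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index (illustrative sketch, not the study's code).

    times  : follow-up time for each participant (e.g. months)
    events : 1 if a fracture was observed, 0 if censored
    risks  : predicted risk score; higher means earlier expected event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the participant with the shorter
        # follow-up actually had the event (was not censored).
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four participants, two observed fractures.
toy_times = [12, 30, 45, 60]
toy_events = [1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.3, 0.5]
print(harrell_c_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which values such as 0.612 and 0.547 are interpreted.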