Broken rail prevention is critical for ensuring the safety of track infrastructure. With the increasing availability of rail data, data-driven analysis has become a promising avenue for enhancing railroad safety. Previous research has concentrated predominantly on predicting broken rails on freight railroads, while commuter railroads have received limited attention. To address this research gap, this paper presents an analytical modeling framework based on machine learning (ML) algorithms (LightGBM, XGBoost, Random Forests, and Logistic Regression) to investigate the occurrence of broken rails on commuter rail segments. The framework uses features such as gradient, curvature, annual traffic, operational speed, and the history of prior rail defects. Because most commuter rail segments experienced no broken rails during the study period, the dataset contains few broken rail instances and is highly imbalanced. We address this imbalance with oversampling techniques, including ADASYN, random oversampling, and SMOTE. The findings indicate that, for the dataset employed in this study, LightGBM combined with random oversampling performs best among the model and oversampling combinations considered. Based on the feature importance results, the most important factors for predicting broken rail occurrences on this commuter railroad are gradient, operational speed, and prior rail defects.
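To make the random oversampling step concrete, the following is a minimal sketch in plain Python. It is not the implementation used in this study: the function name `random_oversample`, the toy segment values, and the three-feature layout (gradient, operational speed, prior defects) are assumptions made for illustration only. The idea is to duplicate randomly chosen minority-class segments until both classes are equally represented.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy segments (NOT data from the study):
# (gradient in %, operational speed in mph, count of prior rail defects)
segments = [
    (0.2, 60, 0), (0.5, 45, 1), (1.1, 30, 3),
    (0.0, 79, 0), (0.3, 50, 0), (0.9, 40, 2),
]
# Label 1 marks a segment with a broken rail; only one such segment here.
broken = [0, 0, 1, 0, 0, 0]

X_bal, y_bal = random_oversample(segments, broken)
print(Counter(y_bal))
```

In practice, oversampling is normally applied to the training split only, so that duplicated minority samples do not leak into the data used for evaluation. The balanced training set would then be passed to a classifier such as LightGBM.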