Background
MicroRNAs (miRNAs) have shown potential as diagnostic biomarkers for myocardial infarction (MI) because they are dysregulated early after MI and remain stable in circulation. They also play a crucial role in regulating adaptive and maladaptive responses in cardiovascular diseases, which makes them attractive biomarker candidates. However, their value as novel diagnostic biomarkers for cardiovascular diseases still requires systematic evaluation.
Methods
This study aimed to identify a miRNA biomarker panel for early-stage MI detection using bioinformatics and machine learning (ML) methods. miRNA expression data for early-stage MI patients and healthy controls were obtained from the Gene Expression Omnibus. Separate datasets were allocated for training and independent testing. Differential expression analysis was performed to identify dysregulated miRNAs in the training set. The least absolute shrinkage and selection operator (LASSO) was then applied for feature selection to prioritize miRNAs associated with MI. The selected miRNAs were used to develop ML models, including support vector machine, gradient boosting, XGBoost, and a hard voting ensemble (HVE).
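The hard voting step can be illustrated with a minimal sketch. It assumes each base model has already produced a class label per sample, and the ensemble takes the majority label. The function name `hard_vote` and the toy predictions are hypothetical and for illustration only; they are not the study's code or data.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority label per sample across several models.

    predictions_per_model: list of equal-length lists of class labels,
    one inner list per base model.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*...) groups the votes cast by every model for one sample
    for sample_votes in zip(*predictions_per_model):
        # With an odd number of binary classifiers a tie cannot occur
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Toy labels (1 = MI, 0 = control); illustrative values, not study data
svm_pred = [1, 0, 1, 1, 0]
gb_pred = [1, 1, 0, 1, 0]
xgb_pred = [0, 0, 1, 1, 0]

ensemble_pred = hard_vote([svm_pred, gb_pred, xgb_pred])
print(ensemble_pred)
```

With three binary classifiers, each sample receives the label that at least two models agree on, so one model's error on a sample can be outvoted by the other two.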
Results
Differential expression analysis identified 99 dysregulated miRNAs in the training set. LASSO feature selection prioritized 21 miRNAs. Ten miRNAs were common to the LASSO subset and the independent test set. The HVE model trained with the selected miRNAs achieved an accuracy of 0.86 and an area under the curve (AUC) of 0.83 on the independent test set.
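For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and AUC can be computed from true labels, predicted labels, and per-sample scores. It assumes the AUC is the area under the ROC curve, computed here in its pairwise-ranking form. How the study obtained scores from the ensemble is not stated here, and the helper names and toy values are hypothetical, not study data.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values (1 = MI, 0 = control); illustrative only
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.5, 0.2]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, auc)
```

Accuracy depends on a fixed decision threshold, whereas AUC summarizes how well the scores rank cases above controls across all thresholds, which is why the two values can differ.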
Conclusions
An integrated framework for robust miRNA selection from omics data shows promise for developing accurate diagnostic models for early-stage MI detection. The HVE model demonstrated good performance despite differences between training and test datasets.