In this study, we aimed to identify a core set of Yin Deficiency Pattern (YDP) genes for predicting colorectal
cancer (CRC) and to construct reliable machine learning models optimized by Optuna. Comprehensive analysis was performed
on nine datasets, totaling 1,680 samples. A CRC diagnostic prediction model was developed by comparing 21 machine
learning classification models with Optuna hyperparameter optimization and validated across six independent external
cohorts. Additionally, the expression patterns of the core diagnostic genes were experimentally validated.
Linear Discriminant Analysis (LDA) and four other machine learning models ranked as the top five performing models
across the six cohorts, with AUC values exceeding 0.99 and all other performance metrics surpassing 0.899. This study marks
the first use of four novel machine learning models in CRC diagnosis.
The robust performance of the top models across multiple external validation sets underscores the reliability and generalizability
of our diagnostic model. These results may inform the development of personalized medicine approaches
for CRC and offer a new avenue for early detection and improved prognosis.