Background: Because the cancer data in this study are multidimensional, multilayered, and chronologically ordered, extracting treatment paths from them was challenging. It was therefore necessary to design a new data mining scheme to effectively extract the treatment paths of breast cancer. This study aimed to determine whether the cSPADE algorithm combined with system clustering can effectively identify the treatment pathways for early breast cancer.

Methods: We applied data mining technology to the electronic medical records (EMR) of 6891 early breast cancer patients to mine treatment pathways. We provided a method of extracting data from the EMR and performed three-stage mining: determining the treatment stage with the cSPADE algorithm → system clustering for treatment plan extraction → cSPADE sequence pattern mining for treatment. The Kolmogorov-Smirnov test and correlation analysis were used to cross-validate the sequence rules of early breast cancer treatment pathways.

Results: We identified 55 sequence rules for early breast cancer treatment, 3 preoperative neoadjuvant chemotherapy regimens, 3 postoperative chemotherapy regimens, and 2 chemotherapy regimens for patients without surgery. Pearson and Spearman correlation tests were performed under 5-fold cross-validation. At the significance level of P < 0.05, all correlation coefficients of support, confidence, and lift were greater than 0.89. Using the Kolmogorov-Smirnov test, we found no significant differences between the sequence distributions.

Conclusions: The cSPADE algorithm combined with system clustering can achieve hierarchical and vertical mining of breast cancer treatment models. Uncovering the treatment pathways of early breast cancer patients with this method makes it possible to evaluate the real-world breast cancer treatment behavior model and provides a reference for the redesign and optimization of treatment pathways.

technology have appeared [8][9][10][11].
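The abstract reports support, confidence, and lift for the mined sequence rules without defining them. The following minimal Python sketch illustrates one common way to compute these measures for a sequence rule "X followed by Y". It is not the cSPADE implementation used in the study: the function names, the toy treatment sequences, and the choice of lift as confidence divided by the support of the right-hand side are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def contains(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `event in it` advances the iterator until the event is found,
    # so later pattern events must appear after earlier ones.
    return all(event in it for event in pattern)


def support(db: List[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of sequences in `db` that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in db) / len(db)


def rule_metrics(db: List[Sequence[str]],
                 lhs: Sequence[str],
                 rhs: Sequence[str]) -> Dict[str, float]:
    """Support, confidence, and lift of the sequence rule lhs => rhs."""
    supp_rule = support(db, list(lhs) + list(rhs))
    supp_lhs = support(db, lhs)
    supp_rhs = support(db, rhs)
    confidence = supp_rule / supp_lhs if supp_lhs else 0.0
    # Assumed definition: lift = confidence / support(rhs).
    lift = confidence / supp_rhs if supp_rhs else 0.0
    return {"support": supp_rule, "confidence": confidence, "lift": lift}


# Hypothetical toy data, NOT taken from the study's EMR records.
toy_db = [
    ["neoadjuvant_chemo", "surgery", "adjuvant_chemo"],
    ["surgery", "adjuvant_chemo", "radiotherapy"],
    ["surgery", "radiotherapy"],
    ["neoadjuvant_chemo", "surgery"],
]

metrics = rule_metrics(toy_db, ["surgery"], ["adjuvant_chemo"])
print(metrics)
```

In this simplification each element of a sequence is a single event; cSPADE also handles elements that are sets of items and supports constraints such as maximum gaps, which the sketch does not model.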
The development of electronic medical records (EMR) makes it possible to extract and optimize treatment paths [12][13][14]. Most process mining algorithms can automatically build process patterns that are easy to interpret and can be used for process optimization [15][16][17]. Recently, many studies have focused on developing sequential pattern mining methods to discover real-world treatment behavior patterns from clinical data, and this has become a research hotspot [18][19][20]. However, current research focuses mainly on the mining of drug treatment models [21][22][23].

Because the cancer data in this study are multidimensional, multilayered, and chronologically ordered, extracting treatment paths from them was challenging. It was therefore necessary to design a new data mining scheme to effectively extract the treatment paths of breast cancer. Sequence data consist of a series of ordered elements or events and may not include specific time concepts; examples include customer shopping sequences, website click streams, and biological sequences. This type of data does not process data at ...