Phone: +91 26743007
Fax: +91 26742316
Email: dinesh@icgeb.res.in (DG)
Early detection of breast cancer and correct determination of its stage are important for prognosis and for providing appropriate personalized clinical treatment to breast cancer patients. However, despite considerable efforts and progress, there remains a need to identify the specific genomic factors responsible for, or accompanying, the progression stages of Invasive Ductal Carcinoma (IDC), which can aid correct stage determination. We have developed two-class machine-learning classification models to differentiate the early and late stages of IDC. The prediction models are trained with RNA-seq gene expression profiles representing different IDC stages of 610 patients, obtained from The Cancer Genome Atlas (TCGA). Different supervised learning algorithms were trained and evaluated, with model learning enriched by different feature selection methods. We also developed a machine-learning classifier trained on the same datasets, with the training sets reduced to data corresponding to IDC driver genes. Based on these two classifiers, we have developed a web server, Duct-BRCA-CSP, to distinguish early from late stages of IDC based on input RNA-seq gene expression profiles. Our analysis also enables deeper insights into the stage-dependent molecular events accompanying breast ductal carcinoma progression. The server is publicly available at http://bioinfo.icgeb.res.in/duct-BRCA-CSP.
Key Points
Different supervised machine-learning algorithms, such as Random Forest, SVM and Naive Bayes, were trained with enriched features of the TCGA RNA-seq datasets selected for the study.
We have developed two-class classification models, trained with relevant gene expression profiles, to efficiently discriminate between the early and late IDC stages.
Finally, we developed a web server using Python scikit-learn to provide freely available, GUI-based access to the machine-learning models we developed. The server is publicly available at http://bioinfo.icgeb.res.in/duct-BRCA-CSP.
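The workflow summarized above (feature selection followed by two-class training of Random Forest, SVM and Naive Bayes models) can be sketched with scikit-learn as follows. This is only an illustrative sketch under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for an expression matrix, the univariate F-test selector and all parameter values are assumed for illustration, and the names `build_models` and `evaluate` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (samples x genes) with
# binary labels (0 = early stage, 1 = late stage). Not real TCGA data.
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=20, random_state=0
)


def build_models(k_features=50):
    """Return one pipeline per classifier (hypothetical helper).

    Feature selection sits inside each pipeline so that it is refit on
    every training fold and never sees the held-out samples.
    """
    def select():
        # Assumed selector: univariate ANOVA F-test, keep top k features.
        return SelectKBest(f_classif, k=k_features)

    return {
        "Random Forest": make_pipeline(
            select(), RandomForestClassifier(n_estimators=100, random_state=0)
        ),
        "SVM": make_pipeline(StandardScaler(), select(), SVC(kernel="rbf")),
        "Naive Bayes": make_pipeline(select(), GaussianNB()),
    }


def evaluate(X, y, k_features=50, n_splits=5):
    """Mean cross-validated ROC AUC for each classifier (hypothetical helper)."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in build_models(k_features).items()
    }


scores = evaluate(X, y)
for name, auc in scores.items():
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Keeping the selector inside the pipeline is a deliberate design choice: selecting features on the full dataset before cross-validation would leak information from the test folds and inflate the reported scores.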