The main purpose of this paper is to process key information in medical text records and to classify patients according to the levels of the Breast Imaging-Reporting and Data System (BI-RADS), a scheme for standardizing breast imaging reports. To this end, medical text mining is employed to classify mammography reports based on BI-RADS. A new method is proposed for automatically extracting BI-RADS classifications from textual reports, with the aim of improving therapeutic procedures. First, a mammography lexicon is used to select keywords from the medical text reports. Word2vec and Term Frequency-Inverse Document Frequency (TFIDF) techniques are then used to extract features, which are finally combined with Hospital Information System (HIS) reports; this configuration is called With-HIS. Several classifiers, namely multi-class Support Vector Machine (SVM), Naïve Bayesian (NB), Extreme Gradient Boosting (XGBoost), and Multi-Level Fuzzy Min-Max Neural Network (MLF), are used to compare the accuracy of the With-HIS configuration against the configuration without HIS (called Without-HIS). The results confirm that using HIS alongside the proposed approach (Word2vec + TFIDF) has a significant effect on the accuracy of medical text classification. With the MLF classifier, the proposed method reaches an accuracy of 0.89 With-HIS, compared with 0.85 Without-HIS.
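The feature-construction steps described above (lexicon-based keyword selection, TFIDF weighting, and combination with HIS data) can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, the two reports, and the HIS fields are invented for illustration, the helper names `lexicon_tokens`, `tfidf_vectors`, and `with_his` are hypothetical, and plain concatenation is only an assumption about how the text features and HIS data are "combined". The Word2vec part is omitted for brevity; under the same assumption, an embedding vector would be appended in the same way.

```python
import math
from collections import Counter

# Hypothetical mini-lexicon, invented for illustration only.
LEXICON = {"mass", "calcification", "benign", "spiculated", "asymmetry"}


def lexicon_tokens(report):
    """Lowercase the report, strip basic punctuation, keep lexicon keywords."""
    words = (w.strip(".,;:") for w in report.lower().split())
    return [w for w in words if w in LEXICON]


def tfidf_vectors(reports):
    """Return one TF-IDF vector per report, indexed by the sorted lexicon."""
    vocab = sorted(LEXICON)
    docs = [Counter(lexicon_tokens(r)) for r in reports]
    n = len(docs)
    # Smoothed IDF: log((1 + n) / (1 + df)) + 1
    idf = {
        t: math.log((1 + n) / (1 + sum(1 for d in docs if t in d))) + 1
        for t in vocab
    }
    vectors = []
    for d in docs:
        total = sum(d.values()) or 1  # avoid division by zero on empty reports
        vectors.append([d[t] / total * idf[t] for t in vocab])
    return vectors


def with_his(text_vec, his_vec):
    """Assumed combination step: concatenate text and HIS features."""
    return list(text_vec) + list(his_vec)


# Invented example data: two reports plus two HIS fields per patient
# (here: age and a family-history flag, both hypothetical).
reports = ["Spiculated mass in the left breast.", "Benign calcification, no mass."]
his = [[52, 1], [47, 0]]
features = [with_his(v, h) for v, h in zip(tfidf_vectors(reports), his)]
```

Each row of `features` could then be passed to any of the classifiers named above; dropping the `with_his` step would give the corresponding Without-HIS input.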