Objective: Early prediction of breast cancer is one of the most essential fields of medicine. Many studies have introduced prediction approaches to facilitate the early prediction and estimate the future occurrence based on mammography periodic tests. In the current research, we introduce a novel machine learning tool for the early prediction of breast cancer. Methods: Three basic resources are used to identify the most essential risk factors; including the BCSC (Breast Cancer Surveillance Consortium) dataset, a medical questionnaire, and multiple international breast cancer reports. The BCSC dataset has been normalized and balanced; consequently, the questionnaire and the medical reports are analyzed in order to define the degree of importance and a potential weight factor of each risk factor. These weights are used to scale risk factors and then the optimizable tree-based ML model is trained using the balanced weighted risk factors datasets. Results: Three balanced versions of the BCSC dataset are used; oversampled, down-sampled and mixed datasets. Each risk factor has a weight (1, 2 or 4) assigned based on a mathematical modelling of the questionnaire and the international breast cancer reports. The experiments are applied on the weighted and non-weighted versions of the database, and they indicate that the performance increases significantly by using the weighted version of the risk factors. The tests prove that the down-weighting of the non-essential risk factor increases the accuracy and reduces errors. The overall accuracy of the weighted balanced datasets reaches 100%, 95.8% and 95.9% for down-sampled, oversampled and mixed datasets respectively. Conclusion: Weighting the risk factors of the BCSC dataset improves the performance by increasing the accuracy and reducing the false rejection and false discovery rates for all versions of balanced datasets. The weighting approach can also be used to improve the estimation score of breast cancer by scaling the individual scores of risk factors.
A new approach of human recognition using ear images is introduced. It consists of two basic steps which are the ear segmentation and ear recognition. In the first one, Likelihood skin detector is used to determine the skin areas in the side face images. Then, some of the morphological operations are applied to determine the ear region. This region is extracted using image processing techniques. The ear recognition step depends on the segmented ear images as inputs. A hybrid PCA_Wavelet algorithm is used to extract the ear features from ear. Finally, the feed forwarding back propagation neural network is trained using the feature vectors. Tests which applied on 460 images, which have been taken during 4 months and under different illumination and pose variations, show that the system achieved a rate of 96.73% for ear extraction and 98.9% for recognition. More experiments are done to specify the best wavelet level, the best number of features, the best classification method, and the best threshold value. The study is also compared with other ones at the area of ear recognition.
Breast cancer prediction is essential for preventing and treating cancer. In this research, a novel breast cancer prediction model is introduced. In addition, this research aims to provide a range-based cancer score instead of binary classification results (yes or no). The Breast Cancer Surveillance Consortium dataset (BCSC) dataset is used and modified by applying a proposed probabilistic model to achieve the range-based cancer score. The suggested model analyses a sub dataset of the whole BCSC dataset, including 67632 records and 13 risk factors. Three types of statistics are acquired (general cancer and non-cancer probabilities, previous medical knowledge, and the likelihood of each risk factor given all prediction classes). The model also uses the weighting methodology to achieve the best fusion of the BCSC's risk factors. The computation of the final prediction score is done using the post probability of the weighted combination of risk factors and the three statistics acquired from the probabilistic model. This final prediction is added to the BCSC dataset, and the new version of the BCSC dataset is used to train an ensemble model consisting of 30 learners. The experiments are applied using the sub and the whole datasets (including 317880 medical records). The results indicate that the new range-based model is accurate and robust with an accuracy of 91.33%, a false rejection rate of 1.12%, and an AUC of 0.9795. The new version of the BCSC dataset can be used for further research and analysis.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.