Classifying benign and malignant masses in mammograms is one of the most difficult and important tasks in the development of Computer-Aided Diagnosis (CAD) systems. This classification has commonly been automated by extracting a set of handcrafted features from mammograms and relating the responses to breast cancer. Recently, the application of Deep Learning (DL) in medical imaging informatics has attracted extensive research interest. However, limited medical image datasets and feature expression often reduce the performance of DL-based schemes. Therefore, this study aims to develop a new combined-feature CAD method based on DL for classifying mammographic masses into three classes: normal, benign, and malignant (cancer) masses. A Deep Convolution Neural Network (DCNN) was used as a feature extractor to score the three kinds of breast masses. The scoring features were then combined with image texture features as input to the classifier. These features, comprising the DCNN scoring features, the Gray-Level Co-occurrence Matrix (GLCM), and the Histogram of Oriented Gradient (HOG), were used to extract breast mass information from mammograms, and Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost) classifiers were trained for the classification task. Accuracy (ACC), Precision (Pre), Recall (Rec), F1-score (F1), and Overall Accuracy (Overall ACC) were used to evaluate the proposed system, and the results show that the proposed multi-feature combination model performs best. The XGBoost classifier performed better than the SVM classifier. With XGBoost as the classifier, the Overall ACC was 92.80% and the correct identification rate for malignant tumors was 84%.
These results indicate that the proposed method may help in more accurately diagnosing cases that are difficult to classify on images.
INDEX TERMS Deep learning, computer-aided diagnosis, deep convolution neural network, mammograms classification.
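The feature-combination step described in this abstract can be illustrated with a minimal sketch. It assumes a single horizontal GLCM offset and three common GLCM statistics (contrast, angular second moment, homogeneity); the abstract does not state which offsets or statistics were used. The function names, the toy patch, and the DCNN and HOG values are hypothetical placeholders, not the authors' implementation.

```python
from typing import List, Sequence


def glcm(image: Sequence[Sequence[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Return a symmetric, normalized GLCM for the pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def glcm_features(p: Sequence[Sequence[float]]) -> List[float]:
    """Contrast, angular second moment, and homogeneity of a normalized GLCM."""
    n = len(p)
    contrast = sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    asm = sum(v * v for row in p for v in row)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2)
                      for i in range(n) for j in range(n))
    return [contrast, asm, homogeneity]


def combine_features(dcnn_scores: Sequence[float],
                     texture: Sequence[float],
                     hog: Sequence[float]) -> List[float]:
    """Concatenate the three feature groups into one classifier input vector."""
    return [*dcnn_scores, *texture, *hog]


# Toy 4x4 patch quantized to 4 gray levels (illustrative values only).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
texture = glcm_features(glcm(patch, levels=4))
dcnn_scores = [0.1, 0.2, 0.7]  # hypothetical scores: normal, benign, malignant
hog = [0.0, 0.5, 0.5]          # hypothetical stand-in for a HOG descriptor
x = combine_features(dcnn_scores, texture, hog)
print(len(x), x)
```

A vector such as `x` would then be passed to a classifier (SVM or XGBoost in the abstract). Plain concatenation is the simplest way to combine the groups, and it is an assumption here rather than something the abstract specifies.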
Featured Application: A support vector machine was used to achieve the best jackknife and 5-fold cross-validation outcomes for identifying piRNAs (Piwi-interacting RNAs) by combining multiple sequence features.
Abstract: Piwi-interacting RNA (piRNA) is a newly identified class of small non-coding RNAs. It can bind PIWI proteins to regulate transcriptional gene silencing and heterochromatin modifications, and to maintain germline and stem cell function in animals. To better understand the function of piRNA, it is imperative to improve the accuracy of identifying piRNAs. In this study, the feature vector was constructed from sequence information: the single nucleotide composition, the 16 dinucleotide compositions, six physicochemical properties of RNA, the position specificities of nucleotides at both the N-terminal and C-terminal, and the proportions of similar peptide sequences at both the N-terminal and C-terminal in positive and negative samples. Then, the F-score was applied to select the optimal features within each single feature type. By combining these selected features, we achieved the best results on the jackknife test and on 5-fold cross-validation repeated 10 times, using the support vector machine algorithm. Moreover, we further evaluated the stability and robustness of our new method.
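The F-score ranking mentioned in this abstract can be sketched as follows. The sketch assumes the F-score definition commonly used for feature selection with SVMs (between-class separation of the feature means divided by the sum of the within-class sample variances); the abstract does not give the formula. The helper names and toy sequences are hypothetical, and only the nucleotide and dinucleotide composition features are computed.

```python
from itertools import product
from typing import List, Sequence


def kmer_composition(seq: str, k: int, alphabet: str = "ACGU") -> List[float]:
    """Frequency of every k-mer over `alphabet`, in lexicographic product order."""
    kmers = ["".join(t) for t in product(alphabet, repeat=k)]
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    n = len(windows)
    return [windows.count(m) / n if n else 0.0 for m in kmers]


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def f_scores(pos: Sequence[Sequence[float]],
             neg: Sequence[Sequence[float]]) -> List[float]:
    """F-score per feature; each class needs at least two samples."""
    scores = []
    for i in range(len(pos[0])):
        p = [row[i] for row in pos]
        q = [row[i] for row in neg]
        mp, mq, m = _mean(p), _mean(q), _mean(p + q)
        between = (mp - m) ** 2 + (mq - m) ** 2
        within = (sum((v - mp) ** 2 for v in p) / (len(p) - 1)
                  + sum((v - mq) ** 2 for v in q) / (len(q) - 1))
        if within == 0:
            # No within-class spread: either perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def rank_features(scores: Sequence[float]) -> List[int]:
    """Feature indices sorted from highest to lowest F-score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def encode(seq: str) -> List[float]:
    """4 single-nucleotide + 16 dinucleotide composition features."""
    return kmer_composition(seq, 1) + kmer_composition(seq, 2)


# Hypothetical toy sequences, for illustration only.
pos_seqs = ["UGAGGUAGUAGGUUGUAUAGUU", "UAGCACCAUCUGAAAUCGGUUA"]
neg_seqs = ["ACGUACGUACGUACGUACGUAC", "GGGCCCAAAUUUGGGCCCAAAU"]
ranking = rank_features(f_scores([encode(s) for s in pos_seqs],
                                 [encode(s) for s in neg_seqs]))
print(ranking[:5])  # indices of the five highest-scoring features
```

In practice the top-ranked features of each type would be kept and concatenated before training the SVM. How many features to keep is a tuning choice that this sketch leaves open.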
Protein S-nitrosylation (SNO) is a typical reversible, redox-dependent post-translational modification that involves the covalent attachment of nitric oxide (NO) to the thiol group of cysteine residues. Numerous experiments have shown that SNO plays a major role in cell function and pathophysiology. To analyze large datasets rapidly, computational methods for identifying SNO sites are considered necessary auxiliary tools. In this study, multiple features were employed to extract the sequence information: Parallel correlation pseudo amino acid composition (PC-PseAAC), Basic kmer1 (kmer1), Basic kmer2 (kmer2), General parallel correlation pseudo amino acid composition (PC-PseAAC_G), Adapted Normal distribution Bi-Profile Bayes (ANBPB), Double Bi-Profile Bayes (DBPB), Bi-Profile Bayes (BPB), Incorporating Amino Acid Pairwise (IAAPair), and Position-specific Tri-Amino Acid Propensity (PSTAAP). To remove information redundancy, information gain (IG) was applied to evaluate the importance of amino acids; IG is the information entropy of the class minus the conditional entropy of the class given the amino acid. The best prediction performance for SNO sites was identified using cross-validation and independent tests. In addition, we calculated four commonly used performance measurements: Sensitivity (Sn), Specificity (Sp), Accuracy (Acc), and the Matthews Correlation Coefficient (MCC). For the training dataset, the overall Acc was 83.11% and the MCC was 0.6617. For an independent test dataset, the Acc was 73.17% and the MCC was 0.3788. The results indicate that our method is likely to complement existing prediction methods and is a useful tool for effective identification of SNO sites.
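The information gain definition and the four performance measurements named in this abstract can be written out as a short sketch. It assumes a discrete feature (for example, the amino acid observed at one window position) and binary class labels. The function names and toy inputs are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """IG = H(class) - H(class | feature) for one discrete feature."""
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def _ratio(num: float, den: float) -> float:
    """Division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sn, Sp, Acc, and MCC from the four confusion-matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sn": _ratio(tp, tp + fn),
        "Sp": _ratio(tn, tn + fp),
        "Acc": _ratio(tp + tn, tp + fn + tn + fp),
        "MCC": _ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy example: residue at one position versus site label (1 = SNO site).
residues = ["C", "C", "A", "A"]
site_labels = [1, 1, 0, 0]
print(information_gain(residues, site_labels))

# Hypothetical confusion-matrix counts, for illustration only.
print(binary_metrics(tp=40, fn=10, tn=45, fp=5))
```

A feature with higher information gain tells more about the class label, so low-gain features are candidates for removal. MCC is reported alongside Acc because it remains informative when positive and negative sites are imbalanced.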