Both class imbalance in datasets and the parameter selection of the Support Vector Machine (SVM) are crucial to predicting software defects. However, no existing work addresses these two problems simultaneously. To close this gap, a hybrid multi-objective cuckoo search under-sampled software defect prediction model based on SVM (HMOCS-US-SVM) is proposed that solves both problems at the same time. First, a hybrid multi-objective cuckoo search with dynamic local search (HMOCS) is used to simultaneously select the non-defective samples and optimize the SVM parameters. Then, three under-sampling methods for the decision region range are proposed to select the non-defective modules. In the simulation, three indicators, the false positive rate (pf), the probability of detection (pd), and G-mean, are used to measure the performance of the proposed algorithm. In addition, eight datasets from the Promise database are selected to verify the proposed software defect prediction model. Compared with the results of eight prediction models, the proposed method is effective at solving the software defect prediction problem.
KEYWORDS
class imbalance, hybrid multi-objective cuckoo search, software defect prediction, SVM, under-sampled
INTRODUCTION
With the advancement of the network society, software has been widely applied in many areas of life, such as banking systems, biopharmaceutical engineering, and traffic signal command. Therefore, increasing attention has been paid to the quality of software products.[1] Generally speaking, software quality mainly includes five aspects: reliability, understandability, availability, maintainability, and effectiveness.[2] Reliability in particular is said to be an important factor leading to software defects.[3] Software defects are errors introduced during software development, which can lead to faults, failures, and crashes, and can even endanger human life and property.[4] Therefore, finding as many defects as possible is particularly important. The core of software defect prediction (SDP)[5] is to extract characteristic attributes that indicate the defect tendency of historical software modules, so as to predict the type or number of defects in new software projects.
Class imbalance (CIB) in datasets is an unavoidable problem in SDP: 80% of the defects are concentrated in 20% of the modules.[6] However, traditional classification algorithms[7] assume relatively balanced datasets and are therefore not suitable for imbalanced ones. As a result, the classification algorithm is biased toward the non-defective modules.[8] Therefore, how to alleviate the imbalance of datasets is a major problem in SDP. Existing research on the CIB problem can be roughly divided into cost-sensitive methods,[9] ensemble methods,[10] and sampling methods.[11]
• Cost-sensitive algorithms[12] solve the imbalance problem by modifying the algorithm, which means that the method improves the accuracy of classificatio...
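The three indicators named in the abstract (pd, pf, and G-mean) and the general idea of under-sampling the non-defective class can be illustrated with a small sketch. It rests on assumptions that are not stated in the text above: labels are binary with 1 meaning defective, G-mean is taken as the commonly used sqrt(pd × (1 − pf)), and the sampler is plain random under-sampling rather than the paper's decision-region methods or HMOCS. The function names `defect_metrics` and `random_under_sample` are hypothetical.

```python
import math
import random


def defect_metrics(y_true, y_pred):
    """Return (pd, pf, g_mean) for binary labels, assuming 1 = defective."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defects found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defects missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, predicted clean
    pd_rate = tp / (tp + fn) if (tp + fn) else 0.0  # probability of detection
    pf_rate = fp / (fp + tn) if (fp + tn) else 0.0  # false positive rate
    # Assumed definition: geometric mean of pd and (1 - pf).
    g_mean = math.sqrt(pd_rate * (1.0 - pf_rate))
    return pd_rate, pf_rate, g_mean


def random_under_sample(labels, seed=0):
    """Keep every defective index plus an equal number of random clean ones.

    This is plain random under-sampling, shown only to illustrate the idea
    of shrinking the non-defective class.
    """
    rng = random.Random(seed)
    defective = [i for i, y in enumerate(labels) if y == 1]
    clean = [i for i, y in enumerate(labels) if y == 0]
    kept_clean = rng.sample(clean, min(len(defective), len(clean)))
    return sorted(defective + kept_clean)


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # 2 defective, 8 clean modules
    y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(defect_metrics(y_true, y_pred))
    print(random_under_sample(y_true))
```

With two defective and eight clean modules, a classifier that finds one defect and raises one false alarm gets pd = 0.5 and pf = 0.125. G-mean rewards a balance between the two, which is why it is often preferred over plain accuracy on imbalanced data.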
Protein-protein interactions (PPIs) are useful for understanding signaling cascades, predicting protein function, associating proteins with disease and fathoming drug mechanism of action. Currently, only ~10% of human PPIs may be known, and about one-third of human proteins have no known interactions. We introduce FpClass, a data mining-based method for proteome-wide PPI prediction. At an estimated false discovery rate of 60%, we predicted 250,498 PPIs among 10,531 human proteins; 10,647 PPIs involved 1,089 proteins without known interactions. We experimentally tested 233 high- and medium-confidence predictions and validated 137 interactions, including seven novel putative interactors of the tumor suppressor p53. Compared to previous PPI prediction methods, FpClass achieved better agreement with experimentally detected PPIs. We provide an online database of annotated PPI predictions (http://ophid.utoronto.ca/fpclass/) and the prediction software (http://www.cs.utoronto.ca/~juris/data/fpclass/).
Our data suggest that AR may provide another specific definition of breast cancer subtypes and reveal a potential role in DCIS progression. These findings may help develop new therapies.