Our microPred classifier yielded higher and, especially, much more reliable classification results in terms of both sensitivity (90.02%) and specificity (97.28%) than the existing pre-miRNA classification methods. When validated with 6095 non-human animal pre-miRNAs and 139 virus pre-miRNAs from miRBase, microPred achieved recognition rates of 92.71% (5651/6095) and 94.24% (131/139), respectively.
Quantum-behaved particle swarm optimization (QPSO), motivated by concepts from quantum mechanics and particle swarm optimization (PSO), is a probabilistic optimization algorithm belonging to the bare-bones PSO family. Although it has been shown to perform well in finding optimal solutions for many optimization problems, there has so far been little analysis of how it works in detail. This paper presents a comprehensive analysis of the QPSO algorithm. In the theoretical analysis, we analyze the behavior of a single particle in QPSO in terms of probability measure. Since the particle's behavior is influenced by the contraction-expansion (CE) coefficient, which is the most important parameter of the algorithm, the goal of the theoretical analysis is to find the upper bound of the CE coefficient below which the selected value guarantees the convergence or boundedness of the particle's position. In the experimental analysis, the theoretical results are first validated by stochastic simulations of the particle's behavior. Then, based on the derived upper bound of the CE coefficient, we perform empirical studies on a suite of well-known benchmark functions to show how to control and select the value of the CE coefficient in order to obtain generally good algorithmic performance in real-world applications. Finally, a further performance comparison between QPSO and other variants of PSO on the benchmarks is made to show the efficiency of the QPSO algorithm with the proposed parameter control and selection methods.
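To make the role of the CE coefficient concrete, the following is a minimal sketch of a QPSO loop. The abstract does not state the update rule, so the form used here (a local attractor between personal and global best, with a step scaled by the distance to the mean of the personal bests) is an assumption taken from the way QPSO is commonly described. The function name `qpso_minimize`, the fixed value `alpha=0.75`, the swarm size, and the clipping to a box are all illustrative choices, not the paper's parameter control method.

```python
import math
import random


def qpso_minimize(f, dim, bounds, n_particles=20, iters=200, alpha=0.75, seed=0):
    """Minimize f over a box with a basic QPSO loop (illustrative sketch).

    alpha is the contraction-expansion (CE) coefficient; it is held
    fixed here, which is an assumption made for simplicity.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [f(x) for x in xs]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # Mean of all personal best positions, per dimension.
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                # Local attractor: random point between personal and global best.
                phi = rng.random()
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                # u lies in (0, 1], so log(1/u) is finite and non-negative.
                u = 1.0 - rng.random()
                step = alpha * abs(mbest[d] - xs[i][d]) * math.log(1.0 / u)
                x_new = attractor + step if rng.random() < 0.5 else attractor - step
                # Keep the particle inside the search box.
                xs[i][d] = min(max(x_new, lo), hi)
            val = f(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(qpso_minimize(sphere, dim=2, bounds=(-5.0, 5.0)))
```

In this sketch a larger `alpha` widens the sampling around the attractor and a smaller one contracts it, which is why an upper bound on the coefficient matters for convergence.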
Support vector machines (SVMs) are a popular machine learning technique that works effectively with balanced datasets. However, when it comes to imbalanced datasets, SVMs produce suboptimal classification models. In addition, the SVM algorithm is sensitive to outliers and noise present in the datasets. Therefore, although the existing class imbalance learning (CIL) methods can make SVMs less sensitive to class imbalance, they can still suffer from the problem of outliers and noise. Fuzzy SVMs (FSVMs) are a variant of the SVM algorithm that has been proposed to handle the problem of outliers and noise. In FSVMs, training examples are assigned different fuzzy-membership values based on their importance, and these membership values are incorporated into the SVM learning algorithm to make it less sensitive to outliers and noise. However, like the normal SVM algorithm, FSVMs can also suffer from the problem of class imbalance. In this paper, we present a method to improve FSVMs for CIL (called FSVM-CIL), which can be used to handle the class imbalance problem in the presence of outliers and noise. We thoroughly evaluated the proposed FSVM-CIL method on ten real-world imbalanced datasets and compared its performance with five existing CIL methods that are available for normal SVM training. Based on the overall results, we conclude that the proposed FSVM-CIL method is very effective for CIL, especially in the presence of outliers and noise in datasets.
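The idea of membership values that reflect both class imbalance and within-class importance can be sketched as follows. The abstract does not give the membership formula, so everything here is an assumption for illustration: importance is taken as a linear decay of the distance to the example's own class centre, minority examples are scaled by 1, and majority examples are scaled by the minority-to-majority ratio. The function name `fuzzy_memberships` and the parameter `delta` are hypothetical.

```python
import math


def fuzzy_memberships(X, y, minority_label=1, delta=1e-6):
    """Return one membership value in (0, 1] per training example.

    Assumed scheme (not taken from the abstract):
      membership = class_scale * (1 - d / (d_max + delta))
    where d is the distance to the example's own class centre.
    """
    n_min = sum(1 for label in y if label == minority_label)
    n_maj = len(y) - n_min
    ratio = n_min / n_maj  # assumed to be below 1 for an imbalanced set

    # Centre of each class as the per-feature mean.
    centres = {}
    for c in set(y):
        pts = [x for x, label in zip(X, y) if label == c]
        centres[c] = [sum(col) / len(pts) for col in zip(*pts)]

    dists = [math.dist(x, centres[label]) for x, label in zip(X, y)]
    d_max = {c: max(d for d, label in zip(dists, y) if label == c) for c in centres}

    memberships = []
    for d, label in zip(dists, y):
        decay = 1.0 - d / (d_max[label] + delta)  # far from centre -> near 0
        scale = 1.0 if label == minority_label else ratio
        memberships.append(scale * decay)
    return memberships
```

Values like these could then be passed as per-example weights to an SVM trainer that supports them, so that likely outliers and majority-class examples contribute less to the penalty term.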
Multi-Classifier Systems (MCSs) have been rapidly gaining popularity among researchers for their ability to fuse multiple classification outputs for better accuracy. At present, there is a large body of literature covering many of the issues and concerns that MCS designers encounter. However, we found no single published paper that presents an overall picture of the basic principles behind the design of multi-classifier systems. Therefore, this paper presents a current overview of MCSs and provides a road-map for MCS designers. We identify the key decisions that a designer has to make during the design of an MCS and list the most useful options available at each decision-making step. We also present a case study of the MCS theoretical issues considered, and present informal guidelines for the selection of different paradigms based on the properties and distribution of the data. We also introduce a novel optimization of the standard majority voting combiner that uses a genetic algorithm.
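For reference, the standard majority voting combiner mentioned above can be sketched in a few lines. This is the plain, unweighted combiner only; the genetic-algorithm optimization proposed in the paper is not described in the abstract and is not attempted here. The function name `majority_vote` and the tie-breaking rule (the earliest classifier among the tied labels wins) are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers.

    predictions: list of per-classifier label lists, all of equal length.
    Ties go to the label voted by the earliest classifier (an assumption).
    """
    combined = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        counts = Counter(votes)
        top = max(counts.values())
        winner = next(v for v in votes if counts[v] == top)
        combined.append(winner)
    return combined


if __name__ == "__main__":
    preds = [["a", "b", "a"], ["a", "a", "b"], ["b", "b", "b"]]
    print(majority_vote(preds))
```

A weighted variant would replace the raw counts with per-classifier weights, which is the kind of quantity an optimizer could tune.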