2006
DOI: 10.1007/11941439_28

Voting Massive Collections of Bayesian Network Classifiers for Data Streams

Abstract: We present a new method for voting exponential (in the number of attributes) size sets of Bayesian classifiers in polynomial time with polynomial memory requirements. Training is linear in the number of instances in the dataset and can be performed incrementally, which allows the collection to learn from massive data streams. The method allows for flexibility in balancing computational complexity, memory requirements, and classification performance. Unlike many other incremental Bayesian methods, all s…


Cited by 27 publications (33 citation statements)
References 4 publications
“…Generally, all of the sequential and structural features are integrated to construct comprehensive feature representations of query proteins. To build the prediction engine, they construct an ensemble model that fuses five base classifiers (RF [30], NB [31], Bayes Net [32], LibSVM [33], and SMO (Sequential Minimal Optimization) [34]) with an average-probability strategy. Importantly, an online web server implementing the PFPA method has been developed and is freely available at .…”
Section: Recent Representative Methods for Protein Fold Recognition
confidence: 99%
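The average-probability fusion this statement describes can be sketched in a few lines: each base classifier emits a class-probability vector, the vectors are averaged, and the class with the highest mean probability wins. This is an illustrative sketch only; the probability values and the three-class setup below are invented, not taken from PFPA.

```python
def fuse_average_probability(prob_dists):
    """Average class-probability vectors from several base classifiers
    and return (averaged distribution, index of the winning class)."""
    n = len(prob_dists)          # number of base classifiers
    k = len(prob_dists[0])       # number of classes
    avg = [sum(d[c] for d in prob_dists) / n for c in range(k)]
    return avg, max(range(k), key=lambda c: avg[c])

# Hypothetical outputs of five base classifiers for one query instance,
# in the order RF, NB, Bayes Net, LibSVM, SMO (values are made up).
preds = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.8, 0.1, 0.1],
    [0.4, 0.4, 0.2],
]
avg, label = fuse_average_probability(preds)
# avg == [0.6, 0.26, 0.14]; the ensemble predicts class 0
```

Averaging probabilities (rather than majority-voting hard labels) lets a confident classifier outweigh several lukewarm ones, which is the usual motivation for this fusion rule.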
“…Each variable's conditional probability table specifies a distribution over its values for every possible combination of the values of its parent nodes [22]. The training process of a Bayesian network consists of two stages: learning the network structure and learning the probability tables [23]. Structure learning can proceed in several ways, such as local score metrics, conditional independence tests, global score metrics, or a fixed structure.…”
Section: Classification Algorithms
confidence: 99%
“…There are several different approaches to structure learning, such as local score metrics, conditional independence tests, global score metrics, and fixed structure. Based on these approaches, a number of search algorithms, such as hill climbing, simulated annealing, and tabu search, are implemented in Weka [23].…”
Section: Classification Algorithms
confidence: 99%
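The second of the two training stages described above, learning the probability tables once a structure is fixed, reduces to counting. A minimal sketch, assuming the simplest fixed structure (naive Bayes, where the class node is the sole parent of every attribute) and Laplace smoothing; all names and the toy data are illustrative, and this is not Weka's implementation:

```python
def learn_naive_bayes_cpts(instances, labels, alpha=1.0):
    """Estimate the class prior and the conditional probability tables
    for a fixed naive-Bayes structure by counting, with Laplace
    smoothing (pseudocount alpha)."""
    classes = sorted(set(labels))
    n_attrs = len(instances[0])
    # The set of observed values for each attribute.
    values = [sorted({row[i] for row in instances}) for i in range(n_attrs)]
    class_n = {c: labels.count(c) for c in classes}
    prior = {c: (class_n[c] + alpha) / (len(labels) + alpha * len(classes))
             for c in classes}
    cpt = {}  # (attr_index, value, class) -> P(value | class)
    for i in range(n_attrs):
        for c in classes:
            for v in values[i]:
                count = sum(1 for row, y in zip(instances, labels)
                            if y == c and row[i] == v)
                cpt[(i, v, c)] = (count + alpha) / (class_n[c] + alpha * len(values[i]))
    return prior, cpt

# Toy dataset: two attributes, labels 'p' and 'q'.
instances = [('a', 'x'), ('a', 'y'), ('b', 'x')]
labels = ['p', 'p', 'q']
prior, cpt = learn_naive_bayes_cpts(instances, labels)
# prior['p'] == 0.6, cpt[(0, 'a', 'p')] == 0.75
```

Stage one (structure search via hill climbing, tabu search, etc.) would replace the fixed naive-Bayes structure here with a learned parent set per variable; the counting stage stays the same, just conditioned on each variable's learned parents.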
“…Moreover, the SPODE model (e.g., AODE) has already been improved for the highly scalable attribute problem in [10]. SPODE models can also be trained incrementally [5]. For huge word histograms, a hashed correlated-feature approach has been proposed in [4] to rank the features.…”
Section: Text Categorization Task
confidence: 99%
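The incremental-training property cited here rests on the fact that the parameters of these Bayesian classifiers are just frequency counts, so absorbing one new instance costs O(number of attributes). A minimal sketch using plain naive Bayes as a stand-in (not the SPODE construction of [5]); class names and data are invented:

```python
import math
from collections import defaultdict

class IncrementalNB:
    """Naive Bayes trained incrementally from a stream: the model's
    sufficient statistics are counts, so update() is cheap and can be
    called once per arriving instance."""
    def __init__(self, alpha=1.0):
        self.alpha = alpha                    # Laplace pseudocount
        self.class_counts = defaultdict(int)  # class -> count
        self.feat_counts = defaultdict(int)   # (class, attr_idx, value) -> count
        self.values = defaultdict(set)        # attr_idx -> observed values
        self.n = 0

    def update(self, x, y):
        """Absorb one labeled instance (tuple of attribute values, class)."""
        self.n += 1
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.feat_counts[(y, i, v)] += 1
            self.values[i].add(v)

    def predict(self, x):
        """Return the class maximizing the smoothed log joint probability."""
        best, best_score = None, float("-inf")
        k = len(self.class_counts)
        for c, nc in self.class_counts.items():
            score = math.log((nc + self.alpha) / (self.n + self.alpha * k))
            for i, v in enumerate(x):
                score += math.log(
                    (self.feat_counts[(c, i, v)] + self.alpha)
                    / (nc + self.alpha * len(self.values[i])))
            if score > best_score:
                best, best_score = c, score
        return best

model = IncrementalNB()
for x, y in [(('a',), 'pos'), (('a',), 'pos'), (('b',), 'neg'), (('b',), 'neg')]:
    model.update(x, y)   # one pass, one instance at a time
```

A SPODE augments this with one extra "super-parent" attribute per model, but the streaming argument is identical: every parameter is still a count.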