Multidimensional Bayesian network classifiers have gained popularity over the last few years due to their expressive power and their intuitive graphical representation. A drawback of this approach is that using these classifiers for multidimensional classification, a generalization of multi-label classification, can be very computationally demanding when there is a large number of class variables. Thus, a key challenge in this field is to ensure the tractability of these models during the learning process. In this paper, we show how information about the most common queries of multidimensional Bayesian network classifiers affects the complexity of these models. We provide upper bounds for the complexity of computing the most probable explanations and marginals of class variables conditioned on an instantiation of all feature variables. We use these bounds to propose efficient strategies for bounding the complexity of multidimensional Bayesian network classifiers during the learning process, and provide a simple learning method with an order-based search that guarantees the tractability of the returned models. Experimental results show that our approach is competitive with other state-of-the-art methods and also ensures the tractability of the learned models.
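For readers unfamiliar with the two queries named in the abstract above, they can be written as follows. This is the standard formulation, not notation taken from the paper: the symbols $C_1,\dots,C_d$ (class variables), $\mathbf{x}$ (an instantiation of all feature variables), and $\mathbf{c}_{-i}$ (all class values except the $i$-th) are assumptions introduced here for illustration.

```latex
% Most probable explanation (MPE) of the class variables given the features
\mathbf{c}^{*} = \arg\max_{c_1,\dots,c_d} P(C_1 = c_1, \dots, C_d = c_d \mid \mathbf{x})

% Marginal of a single class variable given the features
P(C_i = c_i \mid \mathbf{x}) = \sum_{\mathbf{c}_{-i}} P(C_1 = c_1, \dots, C_d = c_d \mid \mathbf{x})
```

Both queries range over a joint space that grows exponentially with the number of class variables $d$, which is why the abstract treats bounding their complexity as the central problem.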
Objective: Drug‐resistant temporal lobe epilepsy (TLE) is the most common type of epilepsy for which patients undergo surgery. Despite the best clinical judgment and currently available prediction algorithms, surgical outcomes remain variable. We aimed to build and evaluate the performance of multidimensional Bayesian network classifiers (MBCs), a type of probabilistic graphical model, at predicting the probability of seizure freedom after TLE surgery. Methods: Clinical, neurophysiological, and imaging variables were collected from 231 TLE patients who underwent surgery at the University of California, San Francisco (UCSF) or the Montreal Neurological Institute (MNI) over a 15‐year period. Postsurgical Engel outcomes at year 1 (Y1), Y2, and Y5 were analyzed as primary end points. We trained an MBC model on combined data sets from both institutions. Bootstrap bias corrected cross‐validation (BBC‐CV) was used to evaluate the performance of the models. Results: The MBC was compared with logistic regression and Cox proportional hazards according to the area under the receiver‐operating characteristic curve (AUC). The MBC achieved an AUC of 0.67 at Y1, 0.72 at Y2, and 0.67 at Y5, which indicates modest performance yet superior to what has been reported in the state‐of‐the‐art studies to date. Significance: The MBC can more precisely encode probabilistic relationships between predictors and class variables (Engel outcomes), achieving promising experimental results compared to other well‐known statistical methods. Multisite application of the MBC could further optimize its classification accuracy with prospective data sets. Online access to the MBC is provided, paving the way for its use as an adjunct clinical tool in aiding pre‐operative TLE surgical counseling.
The computational complexity of inference is now one of the most relevant topics in the field of Bayesian networks. Although the literature contains approaches that learn Bayesian networks from high-dimensional datasets, traditional methods do not bound the inference complexity of the learned models, often producing models where exact inference is intractable. This paper focuses on learning tractable Bayesian networks from data. To address this problem, we propose strategies for learning Bayesian networks in the space of elimination orders. In this manner, we can efficiently bound the inference complexity of the networks during the learning process. Searching in the combined space of directed acyclic graphs and elimination orders can be extremely computationally demanding. We demonstrate that one type of elimination tree, which we define as valid, can be used as an equivalence class of directed acyclic graphs and elimination orders, removing redundancy. We propose methods for incrementally compiling local changes made to directed acyclic graphs in elimination trees and for searching for elimination trees of low width. Using these methods, we can move through the space of valid elimination trees in polynomial time with respect to the number of network variables and in linear time with respect to treewidth.
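To make the link between elimination orders and inference complexity concrete, the sketch below computes the width induced by a given elimination order on the moral graph of a small DAG. It is a textbook illustration, not the elimination-tree method of the paper; the function names `moral_graph` and `induced_width` and the `{node: [parents]}` input format are assumptions chosen for this example.

```python
from itertools import combinations


def moral_graph(parents):
    """Build the undirected moral graph of a DAG given as {node: [parents]}."""
    adj = {}
    for child, pas in parents.items():
        adj.setdefault(child, set())
        # Connect each parent to its child.
        for p in pas:
            adj.setdefault(p, set()).add(child)
            adj[child].add(p)
        # "Marry" every pair of parents that share this child.
        for a, b in combinations(pas, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def induced_width(adj, order):
    """Width induced by eliminating the nodes of `adj` in the given order.

    The width is the largest number of neighbors a node has at the moment
    it is eliminated; it upper-bounds the treewidth of the graph.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    width = 0
    for v in order:
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        # Eliminating v connects all of its remaining neighbors (fill-in edges).
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
    return width


# Hypothetical four-node network: A -> C <- B, C -> D.
dag = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
moral = moral_graph(dag)
good = induced_width(moral, ["D", "A", "B", "C"])
bad = induced_width(moral, ["C", "A", "B", "D"])
```

The two orders give different widths on the same graph, which illustrates why the choice of elimination order matters when bounding inference cost during learning.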
The majority of real-world problems require addressing incomplete data. The use of the structural expectation-maximization algorithm is the most common approach toward learning Bayesian networks from incomplete datasets. However, its main limitation is its demanding computational cost, caused mainly by the need to make an inference at each iteration of the algorithm. In this paper, we propose a new method with the purpose of guaranteeing the efficiency of the learning process while improving the performance of the structural expectation-maximization algorithm. We address the first objective by applying an upper bound to the treewidth of the models to limit the complexity of the inference. To achieve this, we use an efficient heuristic to search the space of the elimination orders. For the second objective, we study the advantages of directly computing the score with respect to the observed data rather than an expectation of the score, and provide a strategy to efficiently perform these computations in the proposed method. We perform exhaustive experiments on synthetic and real-world datasets of varied dimensionalities, including datasets with thousands of variables and hundreds of thousands of instances. The experimental results support our claims empirically.
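The abstract above does not say which heuristic is used to search the space of elimination orders. As a stand-in, the sketch below uses the well-known greedy min-degree heuristic to obtain an order and an upper bound on treewidth, then checks that bound against a limit. The names `min_degree_order` and `respects_bound` and the adjacency-dictionary input are assumptions made for this illustration only.

```python
from itertools import combinations


def min_degree_order(adj):
    """Greedy min-degree elimination order for an undirected graph.

    `adj` maps each node to the set of its neighbors. Returns the order
    and the width it induces, which upper-bounds the true treewidth.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    order, width = [], 0
    while adj:
        # Pick the node with the fewest neighbors; break ties by name.
        v = min(adj, key=lambda u: (len(adj[u]), str(u)))
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        # Add fill-in edges among the neighbors of the eliminated node.
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
        order.append(v)
    return order, width


def respects_bound(adj, max_width):
    """True if the heuristic finds an order whose width is within the bound."""
    _, width = min_degree_order(adj)
    return width <= max_width


# Hypothetical example: a four-node cycle A - B - C - D - A.
cycle = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C"},
}
order, width = min_degree_order(cycle)
```

Because the heuristic only yields an upper bound, a `False` result from `respects_bound` does not prove that the treewidth exceeds the limit; a learner using such a check would simply reject the candidate structure conservatively.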
One of the main research topics in machine learning nowadays is the improvement of the inference and learning processes in probabilistic graphical models. Traditionally, inference and learning have been treated separately, but because the structure of the model determines the inference complexity, most learning methods will sometimes produce models in which inference is inefficient. In this paper we propose a framework for learning Bayesian networks of low inference complexity. To this end, we use a representation of the network factorization that allows an upper bound on the inference complexity of each model to be evaluated efficiently during the learning process. Experimental results show that the proposed methods obtain tractable models that improve the accuracy of the predictions provided by approximate inference in models obtained with a well-known Bayesian network learner.