Previous algorithms for the recovery of Bayesian belief network structures from data have been either highly dependent on conditional independence (CI) tests, or have required on ordering on the nodes to be supplied by the user. We present an algorithm that integrates these two approaches: CI tests are used to generate an ordering on the nodes from the database, which is then used to recover the underlying Bayesian network structure using a non-Cl-test-based method. Results of the evaluation of the algorithm on a number of databases (e.g., ALARM, LED, and SOYBEAN) are presented. We also discuss some algorithm performance issues and open problems.
Previous algorithms for the recovery of Bayesian belief network structures from data have been either highly dependent on conditional independence (CI) tests, or have required on ordering on the nodes to be supplied by the user.We present an algorithm that integrates these two approaches: CI tests are used to generate an ordering on the nodes from the database, which is then used to recover the underlying Bayesian network structure using a non-Cl-test-based method. Results of the evaluation of the algorithm on a number of databases (e.g., ALARM, LED, and SOYBEAN) are presented. We also discuss some algorithm performance issues and open problems.
Identifiability of parameters is an essential property for a statistical model to be useful in most settings. However, establishing parameter identifiability for Bayesian networks with hidden variables remains challenging. In the context of finite state spaces, we give algebraic arguments establishing identifiability of some special models on small directed acyclic graphs (DAGs). We also establish that, for fixed state spaces, generic identifiability of parameters depends only on the Markov equivalence class of the DAG. To illustrate the use of these results, we investigate identifiability for all binary Bayesian networks with up to five variables, one of which is hidden and parental to all observable ones. Surprisingly, some of these models have parameterizations that are generically 4-to-one, and not 2-to-one as label swapping of the hidden states would suggest. This leads to interesting conflict in interpreting causal effects.
This paper addresses the problem of identifying causal effects from nonexperimental data in a causal Bayesian network, i.e., a directed acyclic graph that represents causal relationships. The identifiability question asks whether it is possible to compute the probability of some set of (effect) variables given intervention on another set of (intervention) variables, in the presence of nonobservable (i.e., hidden or latent) variables. It is well known that the answer to the question depends on the structure of the causal Bayesian network, the set of observable variables, the set of effect variables, and the set of intervention variables. Sound algorithms for identifiability have been proposed, but no complete algorithm is known. We show that the identify algorithm that Tian and Pearl defined for semi-Markovian models [1,2,3], an important special case of causal Bayesian networks, is both sound and complete. We believe that this result will prove useful to solve the identifiability question for general causal Bayesian networks.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.