Abstract. We analyze a Relational Neighbor (RN) classifier, a simple relational model that predicts class membership based only on the class labels of related neighbors, using no learning and no inherent attributes. We show that it performs surprisingly well by comparing it to more complex models such as Probabilistic Relational Models and Relational Probability Trees on three data sets from published work. We argue that a simple model such as this should be used as a baseline to assess the performance of relational learners.
Motivation

In recent years, we have seen remarkable advances in algorithms for relational learning, especially statistically based algorithms. These algorithms have been developed in a wide variety of research fields and problem settings. Relational data differ from traditional data in that they violate the instance-independence assumption: instances can be related, or linked, in various ways, and the label of an instance may depend on the instances it is related to, either directly or through arbitrarily long chains of relations. This relational structure further complicates matters, as it makes it harder, if not impossible, to separate the data cleanly into training and test sets without losing much relational information. Recent work has begun to investigate foundational issues within relational learning, such as the dimensions along which learners can be compared [11,14,25] as well as issues of link dependencies [13]. We broaden these investigations by describing a baseline method to which relational learners should be compared when assessing how well they have extracted a useful model from the given relational structure, beyond what can be achieved by looking only at the known class labels of related neighbors.

Recent probabilistic relational learning algorithms, e.g., Probabilistic Relational Models (PRMs) [16,10,27], Relational Probability Trees (RPTs) [22], and Relational Bayesian Classifiers (RBCs) [23], search the relational space for useful attributes and relational structure of neighbors (possibly more than one link away). While other relational learning algorithms are available [7,9,6], we focus in this paper on these three algorithms.

We know from classical machine learning that even very simple statistical methods such as naive Bayes can perform remarkably well when compared to more complex methods. However, a question that has yet to receive much attention is how much of the performance of relational learners is due to their complexity, and how much can be achieved simply by using the known class labels of related neighbors.
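To make the kind of baseline we have in mind concrete, the sketch below shows one plausible implementation of a relational neighbor classifier: an unlabeled node is assigned the class that receives the most (optionally link-weighted) votes from its labeled neighbors, with no attributes and no learned parameters. The graph representation, function name, and tie/abstention handling here are illustrative assumptions, not the exact formulation developed later in the paper.

```python
from collections import defaultdict

def rn_classify(node, neighbors, labels, weights=None):
    """Relational Neighbor baseline (illustrative sketch).

    Predicts the class of `node` from the class labels of its
    already-labeled neighbors; no attributes, no learned parameters.

    node      -- identifier of the node to classify
    neighbors -- dict mapping each node to an iterable of neighboring nodes
    labels    -- dict mapping labeled nodes to their known class
    weights   -- optional dict mapping (node, neighbor) pairs to link weights
    """
    votes = defaultdict(float)
    for nbr in neighbors.get(node, ()):
        if nbr in labels:                      # only labeled neighbors vote
            w = 1.0 if weights is None else weights.get((node, nbr), 1.0)
            votes[labels[nbr]] += w
    if not votes:                              # no labeled neighbors: abstain
        return None
    total = sum(votes.values())
    best = max(votes, key=votes.get)           # majority (weighted) class
    return best, votes[best] / total           # class and its vote share


# Hypothetical toy graph: node "a" has two "+" neighbors and one "-" neighbor.
neighbors = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
labels = {"b": "+", "c": "+", "d": "-"}
print(rn_classify("a", neighbors, labels))     # ('+', 0.666...)
```

The vote share returned alongside the predicted class can also be read as a rough estimate of class-membership probability, which is what allows such a baseline to be compared against probabilistic relational models on the same footing.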