Peter A. Flach scite author profile

Abstract. As most 'real-world' data is structured, research in kernel methods has begun investigating kernels for various kinds of structured data. One of the most widely used tools for modeling structured data are graphs. An interesting and important challenge is thus to investigate kernels on instances that are represented by graphs. So far, only very specific graphs such as trees and strings have been considered. This paper investigates kernels on labeled directed graphs with general structure. It is shown that computing a strictly positive definite graph kernel is at least as hard as solving the graph isomorphism problem. It is also shown that computing an inner product in a feature space indexed by all possible graphs, where each feature counts the number of subgraphs isomorphic to that graph, is NP-hard. On the other hand, inner products in an alternative feature space, based on walks in the graph, can be computed in polynomial time. Such kernels are defined in this paper.

show abstract

Rule Evaluation Measures: A Unifying View

Lavrač

1999

View full text Add to dashboard Cite

ROC ?n? Rule Learning?Towards a Better Understanding of Covering Algorithms

2005

View full text Add to dashboard Cite

Abstract. This paper provides an analysis of the behavior of separate-and-conquer or covering rule learning algorithms by visualizing their evaluation metrics and their dynamics in coverage space, a variant of ROC space. Our results show that most commonly used metrics, including accuracy, weighted relative accuracy, entropy, and Gini index, are equivalent to one of two fundamental prototypes: precision, which tries to optimize the area under the ROC curve for unknown costs, and a cost-weighted difference between covered positive and negative examples, which tries to find the optimal point under known or assumed costs. We also show that a straightforward generalization of the m-estimate trades off these two prototypes. Furthermore, our results show that stopping and filtering criteria like CN2's significance test focus on identifying significant deviations from random classification, which does not necessarily avoid overfitting. We also identify a problem with Foil's MDL-based encoding length restriction, which proves to be largely equivalent to a variable threshold on the recall of the rule. In general, we interpret these results as evidence that, contrary to common conception, pre-pruning heuristics are not very well understood and deserve more investigation.

show abstract

Bridging e-Health and the Internet of Things: The SPHERE Project

et al. 2015

View full text Add to dashboard Cite

Propositionalization Approaches to Relational Data Mining

2001

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Peter A. Flach

On Graph Kernels: Hardness Results and Efficient Alternatives

Rule Evaluation Measures: A Unifying View

ROC ?n? Rule Learning?Towards a Better Understanding of Covering Algorithms

Bridging e-Health and the Internet of Things: The SPHERE Project

Propositionalization Approaches to Relational Data Mining

Contact Info

Product

Resources

About