Ontology-Enhanced Association Mining

Svátek, Vojtěch; Rauch, Jan; Ralbovský, Martin

doi:10.1007/11908678_11

Cited by 23 publications

(12 citation statements)

References 10 publications


“…Another approach that uses ontologies in rule mining is the 4ft-Miner tool [192]. The tool is used in four stages of the KDD process: data understanding, data mining, result interpretation and result dissemination.…”

Section: Discussion (mentioning)

confidence: 99%

Semantic Web in data mining and knowledge discovery: A comprehensive survey

Ristoski

Paulheim

2016

Journal of Web Semantics

244


Abstract: Data Mining and Knowledge Discovery in Databases (KDD) is a research field concerned with deriving higher-level insights from data. The tasks performed in that field are knowledge intensive and can often benefit from using additional knowledge from various sources. Therefore, many approaches have been proposed in this area that combine Semantic Web data with the data mining and knowledge discovery process. This survey article gives a comprehensive overview of those approaches in different stages of the knowledge discovery process. As an example, we show how Linked Open Data can be used at various stages for building content-based recommender systems. The survey shows that, while there are numerous interesting research works performed, the full potential of the Semantic Web and Linked Open Data for data mining and KDD is still to be unlocked.


“…In the early work, Svatek and Rauch [71] designed association mining tool that can benefit from ontologies in all four stages of the mining process: data understanding, task design, result interpretation, and result dissemination over the Semantic Web. Bellandi et al [9] presented an ontology-based association rule mining method, which queries the ontology to filter the instances used in the association rule mining process.…”

Section: A Ontology-based Association Rule Mining (mentioning)

confidence: 99%

Semantic data mining: A survey of ontology-based approaches

Dou

Wang

Liu

2015

Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)

146


Semantic Data Mining refers to the data mining tasks that systematically incorporate domain knowledge, especially formal semantics, into the process. In the past, many research efforts have attested the benefits of incorporating domain knowledge in data mining. At the same time, the proliferation of knowledge engineering has enriched the family of domain knowledge, especially formal semantics and Semantic Web ontologies. Ontology is an explicit specification of conceptualization and a formal way to define the semantics of knowledge and data. The formal structure of ontology makes it a nature way to encode domain knowledge for the data mining use. In this survey paper, we introduce general concepts of semantic data mining. We investigate why ontology has the potential to help semantic data mining and how formal semantics in ontologies can be incorporated into the data mining process. We provide detail discussions for the advances and state of art of ontology-based approaches and an introduction of approaches that are based on other form of


“…There are several approaches to do it. One of them is based on using ontologies in applications of the 4ft-Miner procedure [106]. An other approach is based on storing relevant background knowledge in the special part of the LISp-Miner system, this part is called LISp-Miner Knowledge Base [100].…”

Section: Research Projects Related To LISp-Miner (mentioning)

confidence: 99%

The GUHA method and its meaning for data mining

Hájek

Holeňa

Rauch

2010

Journal of Computer and System Sciences

Self Cite


Abstract: The paper presents the history and present state of the GUHA method, its theoretical foundations and its relation and meaning for data mining.

A survey of development of the GUHA method

"GUHA" is the acronym for General Unary Hypotheses Automaton. The idea of the method is: given data, let the computer generate all (or as much as possible) interesting hypotheses of a given logical form that are supported by the data. This idea was elaborated by M. Chytil and P. Hájek in mid-sixties of the last century, the first paper in English being [16]. The approach was as follows: Data to be processed form a rectangular matrix of zeros and ones, rows corresponding to objects and columns to attributes (properties). Let P_1, ..., P_n be names of the attributes. For each attribute P_i, ¬P_i is the name of its negation. An elementary conjunction of length k (1 ≤ k ≤ n) is a conjunction of k literals in which each predicate occurs at most once, e.g. ¬P_3, P_1 & ¬P_3 & P_7; similarly an elementary disjunction (e.g. P_1 ∨ ¬P_3 ∨ P_7). An object satisfies an elementary conjunction if it satisfies all its members; it satisfies an elementary disjunction if it satisfies at least one of its members.

Let 0 < p ≤ 1. A formula A ⇒_p S where A is an elementary conjunction (antecedent) and S is an elementary disjunction (succedent) is true in the data if at least 100p percent of objects satisfying A satisfies S, i.e. a/r ≥ p where r is the number of objects satisfying A and a is the number of objects satisfying both A and S. The antecedent A is t-good (where t is a natural number) if at least t objects satisfy it. The version of GUHA described in [16] systematically generates "strongest" true formulas A ⇒_p S with a t-good antecedent, notation: A ⇒_{p,t} S. (Details omitted; "strongest" refers to a notion of a logical rule of immediate consequence among formulas of our form.) See also [2].

The reader easily recognizes similarity with the notion of an "associational rule with support and confidence" introduced by Agrawal [1] about 25 years later: his A and S are elementary conjunctions containing no negation, p is the confidence and support is t/m, where m is the number of all objects in the data. The formulas found by GUHA (i.e. by a computer program implementing it) have the form "almost all objects satisfying the antecedent satisfy the succedent (and the number of objects satisfying the antecedent is not too small)." It is stressed that the found results are formulas true in the data and they are hypotheses from the point of view of a universe from which the data are a sample. The slogan has been "GUHA offers everything interesting" (all hypotheses of the given form true in the data). The first implementation (by I. Havel) worked on a computer MINSK22.

In 1968 Hájek (in a paper in Czech) suggested a different version based on the statistical Fisher test. Given A and S (now two elementary conjunctions with no predicates in common), let a, b, c, d be the numbers of objects satisfying A & B, A & ¬B, ¬A & B and ¬...
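The truth condition described in the abstract above (a/r at least p, with a t-good antecedent) can be sketched in a few lines of Python. This is only an illustrative check of one given formula, not the GUHA procedure or the 4ft-Miner implementation, and it does not generate or rank "strongest" formulas. The names `rule_holds`, `satisfies_conjunction`, `satisfies_disjunction` and the sample `rows` are assumptions introduced here for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, bool]        # one object: attribute name -> 0/1 value
Literal = Tuple[str, bool]   # (attribute, required value); False encodes a negated literal


def satisfies_conjunction(row: Row, literals: List[Literal]) -> bool:
    # An object satisfies an elementary conjunction if it satisfies all its members.
    return all(row[name] == value for name, value in literals)


def satisfies_disjunction(row: Row, literals: List[Literal]) -> bool:
    # An object satisfies an elementary disjunction if it satisfies at least one member.
    return any(row[name] == value for name, value in literals)


def rule_holds(rows: List[Row], antecedent: List[Literal],
               succedent: List[Literal], p: float, t: int) -> Tuple[bool, int, int]:
    """Check A =>_{p,t} S on the data and return (holds, a, r)."""
    covered = [row for row in rows if satisfies_conjunction(row, antecedent)]
    r = len(covered)  # number of objects satisfying A
    a = sum(1 for row in covered if satisfies_disjunction(row, succedent))  # A and S
    holds = r >= t and r > 0 and a / r >= p  # t-good antecedent and a/r >= p
    return holds, a, r


# Hypothetical toy data: four objects, three attributes.
rows = [
    {"P1": True, "P3": False, "P7": True},
    {"P1": True, "P3": False, "P7": False},
    {"P1": True, "P3": True, "P7": True},
    {"P1": False, "P3": False, "P7": True},
]
antecedent = [("P1", True), ("P3", False)]  # P1 & not P3
succedent = [("P7", True)]                  # P7

print(rule_holds(rows, antecedent, succedent, p=0.5, t=2))
```

On this toy data two objects satisfy the antecedent and one of them also satisfies the succedent, so the formula holds for p = 0.5 and t = 2 but fails once p is raised above 0.5 or t above 2.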
