2016
DOI: 10.1007/978-3-319-46523-4_30

RDF2Vec: RDF Graph Embeddings for Data Mining

Abstract: Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsupervised feature extraction from sequences of words, and adapts them to RDF graphs. We generate sequences by…

Cited by 308 publications (280 citation statements)
References 25 publications (32 reference statements)
“…Our results in [36] have shown that random walks are a feasible and, in contrast to other techniques such as kernels, also a well scalable approach for extracting sequences.…”
Section: Introduction (mentioning)
confidence: 91%
“…We will introduce both briefly. A more elaborated discussion can be found from the original RDF2Vec paper [36].…”
Section: Preliminaries (mentioning)
confidence: 99%
“…RDF2Vec [29] is a method which generates feature vectors of a given size, and does so efficiently, even for large graphs. This means that, in principle, even when faced with a machine learning problem on the scale of the web, we can reduce the problem to a set of feature vectors of, say, 500 dimensions, after which we can solve the problem on commodity hardware.…”
Section: RDF2Vec (mentioning)
confidence: 99%
“…The corpora that the algorithms are trained on can contain either natural language text (e.g. Wikipedia or newswire articles) or artificially generated pseudo corpora, such as the output of the Random Walk on Graphs algorithm, when run to select sequences of nodes from a knowledge graph (KG) - see (Goikoetxea et al., 2015) and (Ristoski and Paulheim, 2016). We denote the pseudo corpus generated via Random Walk on Graphs algorithm as Pseudo Corpus RWG.…”
Section: Introduction (mentioning)
confidence: 99%
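
The citation statements above describe extracting sequences from a knowledge graph by random walks and treating them as a pseudo corpus. The following is a minimal sketch of that idea only, not the RDF2Vec authors' implementation: the toy graph, the `ex:` identifiers, and the function names `random_walks` and `build_pseudo_corpus` are all hypothetical and chosen purely for illustration.

```python
import random

# Toy knowledge graph (hypothetical data): subject -> list of (predicate, object) edges.
TOY_GRAPH = {
    "ex:Berlin": [("ex:capitalOf", "ex:Germany"), ("ex:locatedIn", "ex:Europe")],
    "ex:Germany": [("ex:partOf", "ex:Europe"), ("ex:hasCapital", "ex:Berlin")],
    "ex:Europe": [],
}


def random_walks(graph, start, num_walks, depth, seed=0):
    """Return `num_walks` walks from `start`, each following at most `depth` edges.

    A walk is a token list: entity, predicate, entity, predicate, entity, ...
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    walks = []
    for _ in range(num_walks):
        walk = [start]
        node = start
        for _ in range(depth):
            edges = graph.get(node, [])
            if not edges:
                break  # dead end: the walk stops early
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(walk)
    return walks


def build_pseudo_corpus(graph, num_walks=4, depth=2, seed=0):
    """Collect walks from every entity; each walk plays the role of a 'sentence'."""
    corpus = []
    for i, entity in enumerate(sorted(graph)):
        corpus.extend(random_walks(graph, entity, num_walks, depth, seed=seed + i))
    return corpus


if __name__ == "__main__":
    for sentence in build_pseudo_corpus(TOY_GRAPH):
        print(" ".join(sentence))
```

Each token list could then be handed to a word-embedding trainer in place of a natural-language sentence, which is the step that would yield fixed-size feature vectors for the entities. That step is left out here because it depends on the embedding library chosen.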