2020
DOI: 10.1007/978-3-030-49461-2_30

SemTab 2019: Resources to Benchmark Tabular Data to Knowledge Graph Matching Systems

Abstract: Tabular data to Knowledge Graph matching is the process of assigning semantic tags from knowledge graphs (e.g., Wikidata or DBpedia) to the elements of a table. This task is a challenging problem for various reasons, including the lack of metadata (e.g., table and column names) and the noisiness, heterogeneity, incompleteness, and ambiguity of the data. The results of this task provide significant insights about potentially highly valuable tabular data, as recent works have shown, enabling a new family of data ana…
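To make the task concrete, here is a minimal Python sketch of cell-to-entity lookup against Wikidata's public wbsearchentities endpoint. The take-the-first-hit ranking and the helper names (candidate_entities, annotate_column) are illustrative assumptions, not the method of any SemTab system.

# Minimal sketch: look up candidate Wikidata entities for the cells of one
# table column. The ranking heuristic (take the first search hit) is purely
# illustrative; real matching systems use much richer disambiguation signals.
import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def candidate_entities(cell_value, limit=5):
    """Return candidate (QID, label) pairs for a single cell value."""
    params = {
        "action": "wbsearchentities",
        "search": cell_value,
        "language": "en",
        "format": "json",
        "limit": limit,
    }
    response = requests.get(WIKIDATA_API, params=params, timeout=10)
    response.raise_for_status()
    return [(hit["id"], hit.get("label", "")) for hit in response.json()["search"]]

def annotate_column(cells):
    """Naively annotate each cell with its top-ranked candidate (or None)."""
    annotations = {}
    for cell in cells:
        candidates = candidate_entities(cell)
        annotations[cell] = candidates[0][0] if candidates else None
    return annotations

if __name__ == "__main__":
    # Example column from a hypothetical table of European capitals;
    # misspelled cells mimic the noise found in SemTab tables.
    print(annotate_column(["Berlin", "Pari", "Viena"]))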

Cited by 50 publications (50 citation statements)
References 17 publications
“…• if the tables with columns sorted by type are annotated in the wrong way, then the entity linking algorithm is constrained by the type inferred looking at the first n rows, with n too small;
• if homonyms or nicknames have been wrongly matched, that means that the algorithm employs popularity mechanisms (e.g., page rank), or it is based on a lookup service that returns the most popular entities first (e.g., DBpedia Lookup). Annotating nicknames requires the algorithms to cover aspects of semantics that go a bit beyond simple heuristics;
• if the tables with level-1 noise are not properly annotated, then the algorithm cannot deal with real-world noise (that can be trickier than the artificial level-2 noise);
• if the annotations are wrong for the tables containing nicknames, it might be the case the algorithm only focuses on the canonical names of the entities.
Tables 1 and 2 show statistics for 2T and existing benchmark datasets.…”
Section: The 2T Dataset
Confidence: 99%
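The failure modes quoted above revolve around inferring a column type from only the first n rows and around popularity-first candidate ranking. The sketch below, using entirely hypothetical candidate data and made-up helper names (infer_column_type, link_with_type_constraint), shows how a too-small n can lock the linker into the wrong type.

# Illustrative sketch with hypothetical candidate data: infer a column's type
# from the top candidates of its first n rows, then constrain linking to that
# type. If n is too small, one popularity-skewed row can fix the wrong type.
from collections import Counter

def infer_column_type(candidates_per_row, n):
    """Majority type among the top-ranked candidates of the first n rows."""
    top_types = [cands[0]["type"] for cands in candidates_per_row[:n] if cands]
    return Counter(top_types).most_common(1)[0][0] if top_types else None

def link_with_type_constraint(candidates_per_row, column_type):
    """Per row, pick the first candidate whose type matches the column type."""
    return [
        next((c["entity"] for c in cands if c["type"] == column_type), None)
        for cands in candidates_per_row
    ]

if __name__ == "__main__":
    # A column that actually lists cities; in the first row a popular film
    # outranks the city, mimicking a popularity-first lookup service.
    rows = [
        [{"entity": "Paris (film)", "type": "film"}, {"entity": "Paris", "type": "city"}],
        [{"entity": "Berlin", "type": "city"}],
        [{"entity": "Vienna", "type": "city"}],
    ]
    print(link_with_type_constraint(rows, infer_column_type(rows, n=1)))  # type locked to "film": only row 1 linked
    print(link_with_type_constraint(rows, infer_column_type(rows, n=3)))  # majority type "city": all rows linked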
“…It requires the introduction of semantic algorithms, namely semantic table interpreters, that link cells to elements in a KG. Recently, the SemTab 2019 [7] challenge was introduced to unify the community efforts towards the development of performing annotations. The challenge consists of different rounds in which tables of various difficulties have to be annotated.…”
Section: Introduction
Confidence: 99%
“…There is work on gathering and cleaning datasets of tables, amalgamating information from different tables, and returning tables as a result of keyword queries [1, 17, 19-21]. Recent literature has introduced benchmarks and shared tasks in table understanding, such as annotating columns or table cells with KG data [4, 7]. However, relation extraction on tables has not been studied as much.…”
Section: Related Work
Confidence: 99%