2021
DOI: 10.48550/arxiv.2107.00156
Preprint

A Study of the Quality of Wikidata

Abstract: Wikidata has been increasingly adopted by many communities for a wide variety of applications, which demand high-quality knowledge to deliver successful results. In this paper, we develop a framework to detect and analyze low-quality statements in Wikidata by shedding light on the current practices exercised by the community. We explore three indicators of data quality in Wikidata, based on: 1) community consensus on the currently recorded knowledge, assuming that statements that have been removed and not adde…

Cited by 1 publication (1 citation statement)
References 18 publications (26 reference statements)
“…Namely, when an editor introduces a new Qnode, it is useful to have metrics which can detect very similar existing entities and ask the editor to confirm that the new entity is different from the most similar existing ones [1]. This procedure would help to avoid introducing duplicates in Wikidata, which is a key challenge today, considering that millions of redirects have been introduced in Wikidata since its inception [9]. At the same time, similarity methods could be run over the current set of entities in Wikidata to detect potentially existing duplicates, which can be validated by an editor before their merging.…”
Section: Similarity in Downstream Tasks
confidence: 99%
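The citation statement above describes a duplicate-detection procedure: before a new Qnode is created, similarity metrics surface the most similar existing entities so an editor can confirm the new entity is genuinely distinct. A minimal sketch of that idea is shown below, assuming a simple label/alias string-similarity metric and an illustrative 0.85 threshold; the entity records, field names, and threshold are hypothetical and do not reflect the cited paper's actual method or Wikidata's editing pipeline.

```python
# Hypothetical sketch: flag existing entities similar to a proposed Qnode
# so an editor can confirm the new entity is not a duplicate.
# Entity records, field names, and the 0.85 threshold are illustrative assumptions.
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def entity_similarity(new_entity: dict, existing_entity: dict) -> float:
    """Best label/alias match between a proposed entity and an existing one."""
    new_names = [new_entity["label"]] + new_entity.get("aliases", [])
    old_names = [existing_entity["label"]] + existing_entity.get("aliases", [])
    return max(string_similarity(n, o) for n in new_names for o in old_names)


def likely_duplicates(new_entity: dict, existing_entities: list, threshold: float = 0.85):
    """Return existing entities similar enough to warrant editor confirmation,
    sorted from most to least similar."""
    scored = [(e, entity_similarity(new_entity, e)) for e in existing_entities]
    return sorted(
        [(e, s) for e, s in scored if s >= threshold],
        key=lambda pair: pair[1],
        reverse=True,
    )


if __name__ == "__main__":
    existing = [
        {"qid": "Q1", "label": "Douglas Adams", "aliases": ["Douglas Noel Adams"]},
        {"qid": "Q2", "label": "Ada Lovelace", "aliases": []},
    ]
    proposed = {"label": "Douglas N. Adams", "aliases": []}
    for entity, score in likely_duplicates(proposed, existing):
        print(f"Possible duplicate of {entity['qid']} ({entity['label']}): {score:.2f}")
```

The same scoring function could, in principle, be run pairwise over already-existing entities to surface merge candidates for editor review, as the citation statement suggests; string similarity is only one of several possible metrics for that task.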