Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data 2014
DOI: 10.1145/2588555.2610509

Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation

Abstract: In many applications, one can obtain descriptions about the same objects or events from a variety of sources. As a result, this will inevitably lead to data or information conflicts. One important problem is to identify the true information (i.e., the truths) among conflicting sources of data. It is intuitive to trust reliable sources more when deriving the truths, but it is usually unknown which one is more reliable a priori. Moreover, each source possesses a variety of properties with different data types. A…

Citations: cited by 398 publications (289 citation statements)
References: 22 publications
“…They are compared on both continuous and categorical data unless otherwise specified. The baseline methods include some state-of-the-art truth discovery methods: GTM [36], TruthFinder [34], AccuSim [8], Investment [23], 3-Estimates [16], and CRH [18]. More detailed summary of these methods can be found in Section 5.…”
Section: Baseline Methods (mentioning)
confidence: 99%
“…Recently, Dong et al. model source selection in the truth discovery tasks based on the idea of "gain" and "cost" [11,27]. Li et al. aim to minimize the weighted deviation of claims and truths, so an optimization framework is adopted and applied on heterogeneous data, in which different data types can be modeled jointly [18].…”
Section: Related Work (mentioning)
confidence: 99%
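The idea in the statement above, minimizing the weighted deviation between claims and truths across mixed data types, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than something stated on this page: the function name `discover_truths`, squared loss for continuous values, 0-1 loss for categorical values, the negative-log weight update, and the absence of any per-property loss normalization.

```python
import math
from collections import defaultdict


def discover_truths(claims, kinds, n_iter=10, eps=1e-9):
    """Alternate between estimating truths and source weights.

    claims: {source: {object: value}}
    kinds:  {object: "continuous" | "categorical"}
    All update rules below are illustrative assumptions.
    """
    sources = list(claims)
    if len(sources) < 2:
        raise ValueError("need at least two sources to compare reliability")
    weights = {s: 1.0 for s in sources}
    truths = {}
    for _ in range(n_iter):
        # Truth step: weighted mean (continuous) or weighted vote (categorical).
        for obj, kind in kinds.items():
            obs = [(s, claims[s][obj]) for s in sources if obj in claims[s]]
            if kind == "continuous":
                total = sum(weights[s] for s, _ in obs)
                truths[obj] = sum(weights[s] * v for s, v in obs) / total
            else:
                votes = defaultdict(float)
                for s, v in obs:
                    votes[v] += weights[s]
                truths[obj] = max(votes, key=votes.get)
        # Weight step: sources that deviate less from the truths get more weight.
        losses = {}
        for s in sources:
            loss = 0.0
            for obj, v in claims[s].items():
                if kinds[obj] == "continuous":
                    loss += (v - truths[obj]) ** 2      # squared loss (assumed)
                else:
                    loss += 0.0 if v == truths[obj] else 1.0  # 0-1 loss (assumed)
            losses[s] = loss + eps  # eps keeps the logarithm finite
        total_loss = sum(losses.values())
        weights = {s: -math.log(losses[s] / total_loss) for s in sources}
    return truths, weights
```

With three hypothetical sources where two roughly agree and one deviates on both a numeric and a categorical property, the deviating source should end up with the smallest weight, and the estimated truths should stay close to the agreeing pair.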
“…The length of visits, visiting rate, stay points, online searches, check-in data, and photos on social media platforms are attributes analyzed and included in the model based on the pheromone and digital trail analogy. This work provides a new perspective, inspired by research on the behavior of real ants and data fusion [34][35][36][37][38].…”
Section: Recreational Behavior Analysis and Digital Trace Fusion (mentioning)
confidence: 99%
“…Using a non-dimensional method, the two pheromones in online and offline space are integrated to calculate the aggregate pheromones (M_i) in urban recreational areas. The mathematical Equation (6) shows the normalization, and Equation (7) is the integration with a polynomial expression, inspired by Li et al [37]:…”
Section: Establishing Proper Weights For Online and Offline Data To B… (mentioning)
confidence: 99%
“…If true, the process diverges into two branches: i) adding the value to the existing group (lines 11-14) and ii) setting up a new group for the value (lines 15-18). In the first case, Algorithm 2 invokes itself, taking the updated solution (with the new value added to the last group) and the updated sublist (with the first value removed) as inputs (line 12).…”
Section: Content-based Grouping (mentioning)
confidence: 99%
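The two-branch recursion described in the statement above can be sketched roughly as follows. This is not the cited Algorithm 2, which is not reproduced on this page: the function name `enumerate_groupings`, the predicate `can_join`, and the behavior when the condition is false (only a new group is opened) are all hypothetical choices made for illustration.

```python
def enumerate_groupings(values, can_join, solution=None):
    """Enumerate ways to split `values` into consecutive groups.

    can_join(group, value) is a hypothetical predicate standing in for the
    condition tested in the cited algorithm, which is not given here.
    """
    if solution is None:
        solution = []
    if not values:
        return [solution]
    head, rest = values[0], values[1:]
    results = []
    if solution and can_join(solution[-1], head):
        # Branch i: add the value to the last existing group, then recurse
        # on the updated solution and the sublist without its first value.
        extended = solution[:-1] + [solution[-1] + [head]]
        results += enumerate_groupings(rest, can_join, extended)
    # Branch ii: set up a new group for the value (also the assumed fallback
    # when the condition does not hold).
    results += enumerate_groupings(rest, can_join, solution + [[head]])
    return results
```

For example, with a predicate that lets a value join a group only if it is within 1 of the group's last element, `enumerate_groupings([1, 2, 5], lambda g, v: v - g[-1] <= 1)` yields the groupings `[[1, 2], [5]]` and `[[1], [2], [5]]`.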