2013
DOI: 10.1016/j.websem.2013.05.005

Repeatable and reliable semantic search evaluation

Abstract: An increasing amount of structured data on the Web has attracted industry attention and renewed research interest in what is collectively referred to as semantic search. These solutions exploit the explicit semantics captured in structured data such as RDF for enhancing document representation and retrieval, or for finding answers by directly searching over the data. These data have been used for different tasks and a wide range of corresponding semantic search solutions have been proposed in the past. However…

Cited by 28 publications (17 citation statements)
References 42 publications
“…On one hand, Alonso and Mizzaro (2009) showed that crowdsourcing was a reliable way of providing relevance assessments, the same conclusion reached by a more recent study by Carvalho et al (2011). On the other hand, Clough et al (2012) and Blanco et al (2013) showed that, while crowdsourced assessments and expert judges' assessments produce similar rankings of evaluated systems, they do not produce the same assessment scores. Blanco et al (2013) found that, in contrast to experts, who are pessimistic in their scoring, non-expert judges accept more items as relevant.…”
Section: Repeatability and Reliability
confidence: 77%
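To make the distinction in this statement concrete, the sketch below illustrates how "similar rankings but different scores" is typically quantified: rank agreement between expert and crowdsourced assessments measured with Kendall's tau, alongside the gap in absolute scores. This is a minimal illustration only; the system names and score values are invented, not data from Blanco et al (2013).

```python
# Hypothetical sketch: rank agreement vs. score agreement between judge groups.
# All system names and scores below are invented for illustration.
from scipy.stats import kendalltau

# Mean relevance scores assigned to five hypothetical systems by each judge group.
expert_scores = {"sysA": 0.42, "sysB": 0.35, "sysC": 0.51, "sysD": 0.28, "sysE": 0.47}
worker_scores = {"sysA": 0.58, "sysB": 0.49, "sysC": 0.66, "sysD": 0.41, "sysE": 0.63}

systems = sorted(expert_scores)
expert = [expert_scores[s] for s in systems]
worker = [worker_scores[s] for s in systems]

# Rank agreement: tau close to 1 means both groups order the systems alike.
tau, p_value = kendalltau(expert, worker)

# Score gap: a positive mean difference means workers are more lenient overall.
mean_gap = sum(w - e for w, e in zip(worker, expert)) / len(systems)

print(f"Kendall's tau between rankings: {tau:.2f} (p={p_value:.3f})")
print(f"Mean score difference (worker - expert): {mean_gap:.2f}")
```

In this toy data the two groups rank the systems identically (tau = 1.0) even though the crowd workers' scores are uniformly higher, which is the pattern the quoted studies describe.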
“…It is indeed important to understand how this factor affects the reliability of an evaluation's results, since it has been acknowledged in the literature that the more knowledge of and familiarity with the subject area the judges have, the less leniency they show in accepting documents as relevant (Rees and Schultz, 1967; Cuadra, 1967; Katter, 1968). Interestingly, Blanco et al (2013) analysed the impact of this factor on the reliability of the SemSearch evaluations and concluded that 1) experts are more pessimistic in their scoring and thus accept fewer items as relevant than workers do (which agrees with the previous studies), and 2) crowdsourced judgements therefore cannot replace expert evaluations.…”
Section: Relevance Judgements
confidence: 99%