Improving test collection pools with machine learning

Jayasinghe, Gaya K.; Webber, William; Sanderson, Mark; Culpepper, J. Shane

doi:10.1145/2682862.2682864

Cited by 8 publications

(2 citation statements)

References 28 publications

“…In contrast, the automatic benchmark provides the opportunity to study the marginal relevance of the ranking, by the nature of its construction. Another advantage of our approach is that it is less influenced by a particular selection of systems from which the assessment pool is built, a problem pointed out by Jayasinghe et al (2014). Anecdotally, many participants reported that automatic collections are very effective for method development and training, since train and test performance is often nearly identical.…”

Section: Discussion (mentioning)

confidence: 99%

“…A system of Cormack et al (1998) aids manual assessors in determining the relevance through interactive searching and judging. Jayasinghe et al (2014) suggest a machine learning method to obtain more resilient assessment pools for manual assessment. Yilmaz et al (2008) reduce manual assessment costs by sampling assessment pools randomly from input rankings while preferring highly ranked documents.…”

Section: Automatic Support For Manual Test Collections (mentioning)

confidence: 99%

Humans Optional? Automatic Large-Scale Test Collections for Entity, Passage, and Entity-Passage Retrieval

Dietz; Dalton

Datenbank Spektrum, 2020

Manually creating test collections is a time-, effort-, and cost-intensive process. This paper describes a fully automatic alternative for deriving large-scale test collections, where no human assessments are needed. The empirical experiments confirm that automatic test collection and manual assessments agree on the best performing systems. The collection includes relevance judgments for both text passages and knowledge base entities. Since test collections with relevance data for both entity and text passages are rare, this approach provides a cost-efficient way for training and evaluating ad hoc passage retrieval, entity retrieval, and entity-aware text retrieval methods.
