2018
DOI: 10.1186/s12874-018-0606-7

Reliability in evaluator-based tests: using simulation-constructed models to determine contextually relevant agreement thresholds

Abstract: Background: Indices of inter-evaluator reliability are used in many fields such as computational linguistics, psychology, and medical science; however, the interpretation of resulting values and determination of appropriate thresholds lack context and are often guided only by arbitrary “rules of thumb” or simply not addressed at all. Our goal for this work was to develop a method for determining the relationship between inter-evaluator agreement and error to facilitate meaningful interpretation of values, thresh…

Cited by 24 publications (9 citation statements)
References 25 publications
“…We computed the inter-coder agreement using Krippendorff's α (Artstein and Poesio, 2008; Hayes and Krippendorff, 2007). The inter-coder agreement for our study is 0.87 which concludes that our qualitative analysis is reliable as prior studies use α 0.8 as an indicator of reliable agreement (Li et al, 2020; Beckler et al, 2018; Webb et al, 2020; Vassallo et al, 2020; Scoccia and Autili, 2020).…”
Section: Approach (supporting)
confidence: 75%
“…Organisms instinctively make strategic decisions about how to spend their resources in relationship to the payoff from their actions. Humans are subject to these same instincts (72–75), and PEP provides insight into resource allocation decisions made during sensory discrimination tasks (49, 50). With TMR-motor only, pSD made discrimination decisions slightly above chance.…”
Section: Discussion (mentioning)
confidence: 99%
“…The metrics focus on essential sensory-motor features of limb function such as visual attention, cognitive demand, fine motor dexterity, and ownership while also being clinically and real-world relevant. Each metric was validated in their foundational fields of psychophysics, mathematical theory, cognition/perception, visuomotor behavior, and psychometrics (23, 31, 37–54).…”
Section: Introduction (mentioning)
confidence: 99%
“…Krippendorff's alpha of the polarity of the PPPRs and the similarities between pre-publication peer reviews and PPPRs were calculated using a software developed by Freelon (2013), and the alpha values were 0.887 and 0.941. Both alpha values were greater than 0.8, which demonstrated acceptable interrater reliability between the judgments of the two coders (Beckler et al., 2018).…”
Section: Results (mentioning)
confidence: 87%
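
Several of the citation statements above report a Krippendorff's α value and compare it with a 0.8 cutoff. For readers unfamiliar with the index, the following is a minimal sketch of how α can be computed for nominal labels when every coder rates every unit. It is not the code used by any of the citing papers or by Freelon's tool; the function name `krippendorff_alpha_nominal` and the sample ratings are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data (illustrative sketch).

    `units` is a list of lists; each inner list holds the labels that
    the coders assigned to one unit. Units with fewer than two labels
    cannot be paired and are skipped.
    """
    # Coincidence matrix: every ordered pair of labels given to the same
    # unit by different coders contributes 1 / (m - 1), where m is the
    # number of labels for that unit.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable values.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Nominal distance: 0 for matching labels, 1 for mismatching labels.
    # The common factor 1/n in observed and expected disagreement cancels.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)

    if expected == 0:
        raise ValueError("alpha is undefined when only one label is used")
    return 1.0 - observed / expected


# Hypothetical example: two coders, four units, one disagreement.
ratings = [["a", "a"], ["a", "b"], ["b", "b"], ["b", "b"]]
alpha = krippendorff_alpha_nominal(ratings)
print(f"alpha = {alpha:.3f}")  # prints "alpha = 0.533"
```

The sketch handles only nominal labels and treats every listed label as a valid rating; ordinal or interval data need a different distance function. Whether a given α counts as acceptable depends on the threshold chosen, which is the question the cited article addresses.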