2017
DOI: 10.31234/osf.io/mwd5g
Preprint

Estimating the Reliability of Emotion Measures over Very Short Intervals: The Utility of Within-Session Retest Correlations

Abstract: Short measures are commonly used when conducting research involving emotions. However, obtaining appropriate estimates of reliability for short measures has traditionally been problematic and is a recurring concern in emotion research. To address this issue, we compare the within-session test-retest and factor analysis methods for estimating the reliability of items in the PANAS-X. Results indicate that within-session test-retest (r_tt) estimates outperform the factor analysis method by demonstrating stronger rela…
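The within-session retest correlation described in the abstract is simple to compute. Below is a minimal sketch, not the authors' code: it assumes a hypothetical wide-format data file in which each PANAS-X item was administered twice within the same session, stored under assumed column names "<item>_t1" and "<item>_t2".

# Sketch only: item-level within-session retest correlation (r_tt).
# File name and column naming scheme are assumptions, not from the paper.
import pandas as pd

def within_session_retest(df: pd.DataFrame, items: list[str]) -> pd.Series:
    # Pearson correlation between the first and second administration of each item
    return pd.Series(
        {item: df[f"{item}_t1"].corr(df[f"{item}_t2"]) for item in items},
        name="r_tt",
    )

# Example usage with hypothetical item names:
# df = pd.read_csv("panasx_within_session.csv")
# print(within_session_retest(df, ["afraid", "excited", "upset"]))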

Cited by 3 publications (5 citation statements)
References 10 publications
“…Conduct retest analyses of items with adequate qualitative and quantitative properties. Test-retest correlations over short spans are particularly good indicators of item quality: for an item to provide reliable and useful information, raters first have to answer it consistently in the short run -- that is, they have to be able to agree with themselves on the content of the item. The retest interval can be a couple of months (Watson, 2004), a couple of weeks (Mõttus, Sinick, et al., 2019; Soto & John, 2017), a couple of days (Wood et al., 2010), or even a couple of minutes (Lowman et al., 2018; Wood et al., 2018). What makes these estimates so valuable is that they are particularly good predictors of many standard indicators of item validity simultaneously, such as self-other agreement correlations and stability correlations over longer time periods (McCrae, Kurtz, Yamagata, & Terracciano, 2011; Henry & Mõttus, 2020), while also being estimable for single items.…”
Section: Step 2 Programmatic Evaluation and Documentation of Item Characteristics (mentioning)
confidence: 99%
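The claim that short-interval retest correlations predict other validity indicators can be examined at the item level. The snippet below is an illustration only, not an analysis from any cited paper; the file and column names ("r_tt", "self_other_agreement") are assumptions.

# Illustration only: do items with higher short-interval r_tt also show
# higher self-other agreement? File and column names are hypothetical.
import pandas as pd

item_stats = pd.read_csv("item_level_stats.csv")  # hypothetical item-level table
print(item_stats["r_tt"].corr(item_stats["self_other_agreement"]))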
“…Some may think that item-level findings are notoriously unreliable. But as was discussed before, items often have retest reliabilities of .65 or higher (Lowman, Wood, Armstrong, Harms, & Watson, 2018; Mõttus et al., 2019; Wood, Nye, & Saucier, 2010; Henry & Mõttus, 2020), which may be higher than many intuitively expect. Higher-than-assumed single-item reliability is also consistent with findings that items out-predict scales for outcomes and other variables (Achaa-Amankwaa, Olaru, & Schroeders, 2020; Elleman, McDougald, Condon, & Revelle, 2020; Mõttus & Rozgonjuk, 2019; Seeboth & Mõttus, 2018; Vainik, Mõttus, Allik, Esko, & Realo, 2015).…”
Section: Descriptive Personality Science (mentioning)
confidence: 96%
“…Retest correlations over shorter testing intervals can be higher still (Lowman, Wood, Armstrong, Harms, & Watson, 2018) and may provide even more accurate reliability estimates.…”
Section: Descriptive Personality Science (mentioning)
confidence: 99%
“…The r_tt does not rely on the assumption that all items measure nothing but a single unidimensional trait, and it is less distorted by state-like artifacts. Indeed, unlike internal consistency, scales' r_tt-s track their validities (Henry et al., 2022; McCrae, 2011), making it the preferred method of estimating reliability (Lowman et al., 2018; McCrae, 2015; Revelle & Condon, 2019). Besides, it can be calculated for individual test items, allowing researchers to select the most reliable items into their scales.…”
Section: Reliability in Personality Measurements (mentioning)
confidence: 99%
“…Besides, it can be calculated for individual test items, allowing researchers to select the most reliable items into their scales. Also, corrections of correlations between scale scores for measurement error that use internal consistencies often result in correlations above 1.00, whereas using r_tt rarely results in such off-limit correlations (Lowman et al., 2018).…”
Section: Reliability in Personality Measurements (mentioning)
confidence: 99%
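The point about off-limit corrections follows directly from the correction-for-attenuation formula, r_corrected = r_xy / sqrt(r_xx * r_yy). The numbers below are hypothetical and only illustrate how a denominator built from low internal consistencies can push the corrected correlation above 1.00, while typically higher retest reliabilities keep it within bounds.

# Hypothetical numbers, not results from the paper.
from math import sqrt

r_xy = 0.62                    # observed correlation between two scale scores (assumed)
alpha_x, alpha_y = 0.55, 0.60  # internal consistencies (assumed)
rtt_x, rtt_y = 0.80, 0.78      # retest reliabilities (assumed)

print(r_xy / sqrt(alpha_x * alpha_y))  # ~1.08: an "off-limit" corrected correlation
print(r_xy / sqrt(rtt_x * rtt_y))      # ~0.78: stays within bounds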