Creativity research often relies on human raters to judge the novelty of participants' responses on open-ended tasks, such as the alternate uses task (AUT). Although useful, manual ratings are subjective and labor-intensive. To address these limitations, researchers increasingly use automatic scoring methods based on a natural language processing technique for quantifying the semantic distance between words. However, many methodological choices remain open regarding how to obtain semantic distance scores for ideas, and these choices can significantly impact reliability and validity. In this project, we propose a new semantic distance-based method, maximum associative distance (MAD), for assessing response novelty in the AUT. Within a response, MAD uses the semantic distance of the word that is maximally remote from the prompt word to reflect response novelty. We compare the results from MAD with those of competing semantic distance-based methods, including element-wise multiplication (a commonly used compositional model), across three published datasets comprising a total of 447 participants. We found MAD to be more strongly correlated with human creativity ratings than the competing methods. In addition, MAD scores reliably predict external measures such as openness to experience. We further explored how idea elaboration affects the performance of various scoring methods and found that MAD is closely aligned with human raters in processing multiword responses. Thus, the MAD method improves the psychometrics of automatic creativity assessment while also providing insight into what human raters perceive as creative about ideas.
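The abstract describes MAD only at a high level, so the following is a minimal sketch of the core computation, not the authors' implementation. It assumes cosine distance as the semantic distance measure and a dictionary mapping words to pretrained vectors (e.g., from a GloVe-style semantic space); the actual semantic model, preprocessing, and word filtering used in the paper are not specified here, and all names in the example are hypothetical.

```python
import numpy as np

def cosine_distance(u, v):
    """Semantic distance as 1 minus the cosine similarity of two word vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def mad_score(prompt_word, response_words, embeddings):
    """Maximum associative distance (MAD) sketch: the largest semantic distance
    between the prompt word and any single word of the response.

    `embeddings` is assumed to be a dict of word -> numpy vector;
    response words missing from the semantic space are simply skipped.
    """
    prompt_vec = embeddings[prompt_word]
    distances = [
        cosine_distance(prompt_vec, embeddings[w])
        for w in response_words
        if w in embeddings
    ]
    return max(distances) if distances else None

# Hypothetical usage: scoring an AUT response to the prompt "brick".
# Toy 3-dimensional vectors stand in for a real pretrained space.
embeddings = {
    "brick": np.array([0.9, 0.1, 0.0]),
    "build": np.array([0.8, 0.2, 0.1]),
    "a":     np.array([0.5, 0.5, 0.5]),
    "tiny":  np.array([0.2, 0.7, 0.3]),
    "house": np.array([0.7, 0.3, 0.2]),
}
print(mad_score("brick", ["build", "a", "tiny", "house"], embeddings))
```

In contrast to compositional approaches such as element-wise multiplication, which first combine all response words into a single composite vector and then measure its distance from the prompt, this sketch scores each response word separately and keeps only the maximum distance.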