2015
DOI: 10.1177/0265532215587391

Construct validity in TOEFL iBT speaking tasks: Insights from natural language processing

Abstract: This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these tasks differ with regard to linguistic characteristics. The findings lend support to using a variety of speaking tasks to assess speaking proficiency. Namely, with reg…

Citations: cited by 35 publications (25 citation statements)
References: 28 publications
“…In addition, words integrated from the source texts into the responses were highly predictive of speaking quality. Finally, Kyle, Crossley, and McNamara (2016) examined differences between L2 speakers' responses to both independent and integrated speaking tasks and found that independent speaking tasks result in less given information (i.e., information available previously in the discourse). Although studies examining cohesion and spoken discourse in terms of comprehension and processing are infrequent, the existing studies above indicate that cohesion features in discourse, as well as cohesive links between the speaking prompt/source text and a response, are important components of speaking quality.…”
Section: Assessing Discourse Cohesion (mentioning)
confidence: 99%
“…Some previous research on rater behavior has demonstrated a considerable amount of rater variability, which is mostly related to raters' characteristics and not the test takers' performance (e.g., Carey, Mannell, & Dunn, 2011; Knoch, 2011). Accordingly, several research studies have been carried out to investigate the accuracy of oral performance assessment tests through research on both rater reliability, which is the degree of agreement among independent raters in assessing the test takers (Khabbazbashi, 2017; Winke & Gass, 2013), and the validity of performance-based tests through construct validity and concurrent validity (Kyle, Crossley & McNamara, 2016). Furthermore, rater training has demonstrated low impact in reducing this variability.…”
Section: Rater Behavior In Oral Performance Assessment (mentioning)
confidence: 99%
“…Second, it can help researchers administer rater training programs. Research has shown that rater consistency and rating validity can be increased through training (Kyle, Crossley, & McNamara, 2016). Third, MFRM can help reduce self-inconsistency and increase intra-rater reliability, which increases the fairness of a test, specifically in placement and summative evaluation tests (Gan, 2010).…”
Section: Rater Behavior In Oral Performance Assessment (mentioning)
confidence: 99%
“…(1) (e.g., Cumming, 2013) (Hirai & Koizumi, 2009;Swain, Huang, Barkaoui, Brooks, & Lapkin, 2009) 3 (Murray, et al, 2012) 3 Brown, Iwashita, & McNamara (2005) 4 TOEFL iBT (e.g., Kyle, Crossley, & McNamara, 2015) Hirai and Koizumi (2008,2013) EBB (Empirically derived, Binary-choice, Boundary-definition)…”
Section: 35 (mentioning)
confidence: 99%