Recent developments in artificial intelligence have improved performance on the automated assessment of extended-response items in mathematics, potentially allowing these items to be scored cheaply and at scale. This study details the grand prize-winning approach to developing large language models (LLMs) to automatically score the ten items in the National Assessment of Educational Progress (NAEP) Math Scoring Challenge. The approach uses extensive preprocessing to balance the class labels for each item: over-represented classes are identified and filtered using a classifier trained on document-term matrices, and under-represented classes are augmented using a generative pre-trained large language model (Grammarly’s Coedit-XL; Raheja et al., 2023). We also use input modification schemes hand-crafted for each item type that incorporate information from parts of the multi-step math problem students had to solve. Finally, we fine-tune several pre-trained large language models on the modified input for each individual item, with DeBERTa (He et al., 2021a) showing the best performance. This approach achieved human-like agreement, defined as a quadratic weighted kappa (QWK) within 0.05 of human–human agreement, on nine of the ten items in a held-out test set.
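To make the agreement criterion concrete, the following is a minimal sketch (not the authors' code) of how human-like agreement could be checked with quadratic weighted kappa using scikit-learn; the variable names and example score labels are illustrative assumptions.

```python
# Sketch of the human-like agreement criterion: a model is treated as human-like
# on an item if its QWK against a human rater is within 0.05 of the QWK between
# the two human raters. All labels below are hypothetical.
from sklearn.metrics import cohen_kappa_score

human_1 = [0, 1, 2, 1, 0, 2, 1]   # scores from human rater 1 (hypothetical)
human_2 = [0, 1, 2, 2, 0, 2, 1]   # scores from human rater 2 (hypothetical)
model   = [0, 1, 1, 1, 0, 2, 1]   # scores from the fine-tuned model (hypothetical)

# Quadratic weighted kappa penalizes larger score disagreements more heavily.
qwk_human_human = cohen_kappa_score(human_1, human_2, weights="quadratic")
qwk_human_model = cohen_kappa_score(human_1, model, weights="quadratic")

# Human-like agreement: model QWK within 0.05 of human-human QWK.
print(qwk_human_human - qwk_human_model < 0.05)
```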