Natural Language Processing of Student's Feedback to Instructors: A Systematic Review

Sunar, Ayse Saliha; Khalid, Md Saifuddin

doi:10.1109/tlt.2023.3330531

Cited by 7 publications

(1 citation statement)

References 51 publications

“…Despite explorations like those mentioned above, research to date has not focused on the feasibility and quality of LLMs' results in performing a broad array of common qualitative education survey analysis tasks, leaving a gap that we focus on in this study. For example, a review published in 2024 focusing on natural language processing of students' feedback to instructors makes no mention of studies using LLMs for this purpose (Sunar & Khalid, 2024). Prior work has primarily focused on the use of encoder models like BERT for their clustering and feature extraction capabilities and have not explored the current generation of decoder-only auto-regressive models like the GPT models mentioned above.…”

Section: LLM Background and Related Research (mentioning)

confidence: 99%

A Large Language Model Approach to Educational Survey Feedback Analysis

Parker, Anderson, Stone et al. 2024

Int J Artif Intell Educ

This paper assesses the potential for the large language models (LLMs) GPT-4 and GPT-3.5 to aid in deriving insight from education feedback surveys. Exploration of LLM use cases in education has focused on teaching and learning, with less exploration of capabilities in education feedback analysis. Survey analysis in education involves goals such as finding gaps in curricula or evaluating teachers, often requiring time-consuming manual processing of textual responses. LLMs have the potential to provide a flexible means of achieving these goals without specialized machine learning models or fine-tuning. We demonstrate a versatile approach to such goals by treating them as sequences of natural language processing (NLP) tasks including classification (multi-label, multi-class, and binary), extraction, thematic analysis, and sentiment analysis, each performed by LLM. We apply these workflows to a real-world dataset of 2500 end-of-course survey comments from biomedical science courses, and evaluate a zero-shot approach (i.e., requiring no examples or labeled training data) across all tasks, reflecting education settings, where labeled data is often scarce. By applying effective prompting practices, we achieve human-level performance on multiple tasks with GPT-4, enabling workflows necessary to achieve typical goals. We also show the potential of inspecting LLMs’ chain-of-thought (CoT) reasoning for providing insight that may foster confidence in practice. Moreover, this study features development of a versatile set of classification categories, suitable for various course types (online, hybrid, or in-person) and amenable to customization. Our results suggest that LLMs can be used to derive a range of insights from survey text.
