Science education researchers typically face a trade-off between more quantitatively oriented confirmatory testing of hypotheses, or more qualitatively oriented exploration of novel hypotheses. More recently, open-ended, constructed response items were used to combine both approaches and advance assessment of complex science-related skills and competencies. For example, research in assessing science teachers’ noticing and attention to classroom events benefitted from more open-ended response formats because teachers can present their own accounts. Then, open-ended responses are typically analyzed with some form of content analysis. However, language is noisy, ambiguous, and unsegmented and thus open-ended, constructed responses are complex to analyze. Uncovering patterns in these responses would benefit from more principled and systematic analysis tools. Consequently, computer-based methods with the help of machine learning and natural language processing were argued to be promising means to enhance assessment of noticing skills with constructed response formats. In particular, pretrained language models recently advanced the study of linguistic phenomena and thus could well advance assessment of complex constructs through constructed response items. This study examines potentials and challenges of a pretrained language model-based clustering approach to assess preservice physics teachers’ attention to classroom events as elicited through open-ended written descriptions. It was examined to what extent the clustering approach could identify meaningful patterns in the constructed responses, and in what ways textual organization of the responses could be analyzed with the clusters. Preservice physics teachers (N = 75) were instructed to describe a standardized, video-recorded teaching situation in physics. The clustering approach was used to group related sentences. Results indicate that the pretrained language model-based clustering approach yields well-interpretable, specific, and robust clusters, which could be mapped to physics-specific and more general contents. Furthermore, the clusters facilitate advanced analysis of the textual organization of the constructed responses. Hence, we argue that machine learning and natural language processing provide science education researchers means to combine exploratory capabilities of qualitative research methods with the systematicity of quantitative methods.
Science education researchers have developed a refined understanding of the structure of science teachers’ pedagogical content knowledge (PCK), but how to develop applicable and situation-adequate PCK remains largely unclear. A potential problem lies in the diverse conceptualisations of the PCK used in PCK research. This study sought to systematize existing science education research on PCK through the lens of the recently proposed refined consensus model (RCM) of PCK. In this review, the studies’ approaches to investigating PCK and selected findings were characterised and synthesised as an overview comparing research before and after the publication of the RCM. We found that the studies largely employed a qualitative case-study methodology that included specific PCK models and tools. However, in recent years, the studies focused increasingly on quantitative aspects. Furthermore, results of the reviewed studies can mostly be integrated into the RCM. We argue that the RCM can function as a meaningful theoretical lens for conceptualizing links between teaching practice and PCK development by proposing pedagogical reasoning as a mechanism and/or explanation for PCK development in the context of teaching practice.
Computer-based analysis of preservice teachers’ written reflections could enable educational scholars to design personalized and scalable intervention measures to support reflective writing. Algorithms and technologies in the domain of research related to artificial intelligence have been found to be useful in many tasks related to reflective writing analytics such as classification of text segments. However, mostly shallow learning algorithms have been employed so far. This study explores to what extent deep learning approaches can improve classification performance for segments of written reflections. To do so, a pretrained language model (BERT) was utilized to classify segments of preservice physics teachers’ written reflections according to elements in a reflection-supporting model. Since BERT has been found to advance performance in many tasks, it was hypothesized to enhance classification performance for written reflections as well. We also compared the performance of BERT with other deep learning architectures and examined conditions for best performance. We found that BERT outperformed the other deep learning architectures and previously reported performances with shallow learning algorithms for classification of segments of reflective writing. BERT starts to outperform the other models when trained on about 20 to 30% of the training data. Furthermore, attribution analyses for inputs yielded insights into important features for BERT’s classification decisions. Our study indicates that pretrained language models such as BERT can boost performance for language-related tasks in educational contexts such as classification.
IntroductionScience educators use writing assignments to assess competencies and facilitate learning processes such as conceptual understanding or reflective thinking. Writing assignments are typically scored with holistic, summative coding rubrics. This, however, is not very responsive to the more fine-grained features of text composition and represented knowledge in texts, which might be more relevant for adaptive guidance and writing-to-learn interventions. In this study we examine potentials of machine learning (ML) in combination with natural language processing (NLP) to provide means for analytic, formative assessment of written reflections in science teacher education.MethodsML and NLP are used to filter higher-level reasoning sentences in physics and non-physics teachers’ written reflections on a standardized teaching vignette. We particularly probe to what extent a previously trained ML model can facilitate the filtering, and to what extent further fine-tuning of the previously trained ML model can enhance performance. The filtered sentences are then clustered with ML and NLP to identify themes and represented knowledge in the teachers’ written reflections.ResultsResults indicate that ML and NLP can be used to filter higher-level reasoning elements in physics and non-physics preservice teachers’ written reflections. Furthermore, the applied clustering approach yields specific topics in the written reflections that indicate quality differences in physics and non-physics preservice teachers’ texts.DiscussionOverall, we argue that ML and NLP can enhance writing analytics in science education. For example, previously trained ML models can be utilized in further research to filter higher-level reasoning sentences, and thus provide science education researchers efficient mean to answer derived research questions.
Reflection is hypothesized to be a key component for teachers’ professional development and is often assessed and facilitated through written reflections in university-based teacher education. Empirical research shows that reflection-related competencies are domain-dependent and multi-faceted. However, assessing reflections is complex. Given this complexity, novel methodological tools such as non-linear, algorithmic models can help explore unseen relationships and better determine quality correlates for written reflections. Consequently, this study utilized machine learning methods to explore quality correlates for written reflections in physics on a standardized teaching situation. N = 110 pre- and in-service physics teachers were instructed to reflect upon a standardized teaching situation in physics displayed in a video vignette. The teachers’ written reflections were analyzed with a machine learning model which classified sentences in the written reflections according to elements in a reflection-supporting model. A quality indicator called level of structure (LOS) was devised and further used to validate machine learning classifications against experts’ judgements. Analyses show that LOS is positively correlated with experts’ judgements on reflection quality. We conclude that LOS of a written reflection is one important indicator for high-quality written reflections which is able to exclude typical quality correlates such as text length. With the help of the machine learning model, LOS can be useful to assess pre-service physics teachers written reflections.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.