Teachers' emotions, as manifested in teaching behaviour, significantly influence both teaching effectiveness and learning state. An affective recognition model can be applied to mine useful feedback from teaching behaviour data and thus help teachers raise the quality of their instruction. However, typical emotion recognition models cannot fully distinguish the intricate emotional features and cues in teaching behaviour, which limits the accuracy of emotion classification. To improve classification performance, this paper proposes a multi-modal emotion recognition model of teaching behaviour based on dynamic convolution and residual gating. The model enhances emotion classification by further mining high-level local features and designing an efficient interactive fusion strategy. First, low-level features, high-level local features, and context dependencies are extracted from text, audio, and images. Second, cross-modal dynamic convolution (CMDC) is employed to represent inter-modal and intra-modal interactions, model dependencies across long time series, capture the interaction properties of the different modalities, and prevent the loss of crucial information. Experimental results show that the model outperforms comparable models on a self-built data set, reaching 83.5% classification accuracy and an F1 score of 83.1%. This demonstrates that the emotion classification model offers teachers an objective framework for analysing teaching behaviour and can help them become more effective over time.
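The abstract names cross-modal dynamic convolution (CMDC) as the fusion mechanism but gives no implementation detail. The sketch below is one plausible reading, assuming an approach in the spirit of dynamic convolution (Wu et al., 2019): depthwise kernels are predicted per time step from one modality (e.g. text) and applied to another (e.g. audio), so the mixing weights adapt to the query modality's content. All class names, shapes, and hyper-parameters here are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical CMDC layer sketch (PyTorch). Per-position kernels are
# predicted from the "query" modality and convolved over the "context"
# modality; sequences are assumed time-aligned.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalDynamicConv(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.kernel_size = kernel_size
        # Predict one kernel of size K for every position of the query.
        self.kernel_proj = nn.Linear(dim, kernel_size)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, query: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # query, context: (batch, seq_len, dim)
        B, T, D = context.shape
        # Query-conditioned kernels, normalised to sum to one over K.
        kernels = F.softmax(self.kernel_proj(query), dim=-1)          # (B, T, K)
        # Unfold the context into sliding windows of size K.
        pad = self.kernel_size // 2
        ctx = F.pad(context.transpose(1, 2), (pad, pad))              # (B, D, T + 2*pad)
        windows = ctx.unfold(dimension=2, size=self.kernel_size, step=1)  # (B, D, T, K)
        # Weight each window by the query-derived kernel and sum over K.
        mixed = torch.einsum('bdtk,btk->btd', windows, kernels)       # (B, T, D)
        return self.out_proj(mixed)

# Usage: fuse audio features under text-conditioned kernels.
layer = CrossModalDynamicConv(dim=64)
text = torch.randn(2, 50, 64)   # (batch, time, dim)
audio = torch.randn(2, 50, 64)
fused = layer(text, audio)      # (2, 50, 64)
```

Because the kernels come from a different modality than the sequence they convolve, the same layer covers both cases the abstract mentions: passing a modality as its own query gives intra-modal interaction, while crossing modalities gives inter-modal interaction.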