This paper presents an end-to-end algorithm for solving circuit problems in secondary physics. A key challenge in solving circuit problems is to automatically understand problems expressed across two modalities, text and schematic. Existing methods have a limited capacity for problem understanding because they cannot deal with the many ways problems are expressed in natural language or with the variety of circuit diagrams. In this paper, a set of methods is proposed to address this challenge. Problem understanding is modeled as a relation extraction problem, and a scheme is proposed to extract relations from both text and schematic. A syntax–semantics model is adopted to extract explicit relations from text, whereas a unit-theorem-based method is proposed to extract implicit relations. A mesh search method is proposed to extract relations from the schematic. Based on the result of problem understanding, an algorithm is proposed to produce solutions to circuit problems, presented in a readable way. The experimental results demonstrate the effectiveness of the proposed algorithm in solving circuit problems. To the best of our knowledge, this paper is the first to report quantitative results on understanding and solving circuit problems.
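The abstract does not describe how the mesh search works, so the following is only a rough illustration of the general idea of finding closed loops in a schematic, from which Kirchhoff voltage relations could then be written. It uses a standard stand-in technique (fundamental cycles from a breadth-first spanning tree), not the paper's method. The function name `fundamental_loops`, the edge format, and the node and component names are all hypothetical, and the sketch assumes a connected circuit graph. Fundamental loops are independent but are not guaranteed to coincide with the meshes of a planar drawing.

```python
from collections import deque


def fundamental_loops(edges):
    """Return one independent loop per non-tree edge.

    edges: list of (node_a, node_b, component_name) tuples (hypothetical format).
    Each loop is returned as a sorted list of component names.
    Assumes the circuit graph is connected.
    """
    # Build an adjacency list that keeps edge indices, so parallel
    # components between the same two nodes stay distinct.
    adjacency = {}
    for idx, (a, b, _) in enumerate(edges):
        adjacency.setdefault(a, []).append((b, idx))
        adjacency.setdefault(b, []).append((a, idx))

    # Breadth-first search builds a spanning tree rooted at the first node.
    root = edges[0][0]
    parent = {root: (None, None)}  # node -> (parent node, edge index used)
    tree_edges = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor, idx in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = (node, idx)
                tree_edges.add(idx)
                queue.append(neighbor)

    def path_to_root(node):
        """Edge indices on the tree path from node up to the root."""
        path = []
        while parent[node][0] is not None:
            node, idx = parent[node]
            path.append(idx)
        return path

    loops = []
    for idx, (a, b, _) in enumerate(edges):
        if idx in tree_edges:
            continue
        path_a, path_b = path_to_root(a), path_to_root(b)
        # Edges shared by both paths lie above the common ancestor
        # and are not part of the loop.
        shared = set(path_a) & set(path_b)
        loop = [idx] + [e for e in path_a + path_b if e not in shared]
        loops.append(sorted(edges[e][2] for e in loop))
    return loops


# Hypothetical circuit: source V in series with R1, feeding R2 and R3 in parallel.
demo_edges = [
    ("n0", "n1", "V"),
    ("n1", "n2", "R1"),
    ("n2", "n0", "R2"),
    ("n2", "n0", "R3"),
]
demo_loops = fundamental_loops(demo_edges)
```

For a connected graph with E components and N nodes this yields E - N + 1 loops, which matches the number of independent voltage relations a loop-based analysis needs.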
This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data. A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding both single-modal and multimodal problems. Existing methods focus on either single-modal or multimodal problems, and methods built for one setting do not transfer to the other. A general geometry problem solver should be able to process problems of different modalities at the same time. In this paper, a shared feature-learning model of multimodal data is adopted to learn a unified feature representation of text and image, which addresses the heterogeneity between multimodal geometry problems. A contrastive learning model of multimodal data enhances the semantic relevance between multimodal features and maps them into a unified semantic space, which allows adaptation to both single-modal and multimodal downstream tasks. Based on the feature extraction and fusion of multimodal data, the proposed geometry problem solver uses relation extraction, theorem reasoning, and problem solving to present solutions in a readable way. Experimental results show the effectiveness of the method.
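The abstract does not specify the contrastive objective, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE-style loss over paired text and image embeddings, written in plain Python for readability. The function names, the `temperature` value, and the toy embeddings are assumptions made for illustration and are not taken from the paper.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        largest = max(row)  # subtract the max for numerical stability
        log_sum = largest + math.log(sum(math.exp(x - largest) for x in row))
        total += log_sum - row[i]
    return total / len(logits)


def contrastive_loss(text_emb, image_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss for a batch of paired embeddings.

    text_emb[i] and image_emb[i] are assumed to describe the same problem;
    every other pairing in the batch is treated as a negative.
    """
    text = [normalize(v) for v in text_emb]
    image = [normalize(v) for v in image_emb]
    # Similarity matrix: rows are texts, columns are images.
    logits = [
        [sum(a * b for a, b in zip(t, im)) / temperature for im in image]
        for t in text
    ]
    logits_transposed = [list(column) for column in zip(*logits)]
    text_to_image = cross_entropy_diagonal(logits)
    image_to_text = cross_entropy_diagonal(logits_transposed)
    return 0.5 * (text_to_image + image_to_text)


# Toy batch: aligned pairs should score a lower loss than swapped pairs.
toy_text = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = contrastive_loss(toy_text, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = contrastive_loss(toy_text, [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this kind pulls matching text and image features together and pushes mismatched ones apart, which is one way to obtain the kind of unified semantic space the abstract describes.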