2021
DOI: 10.48550/arxiv.2110.08464
Preprint

Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems

Abstract: Math Word Problem (MWP) solving needs to discover the quantitative relationships over natural language narratives. Recent work shows that existing models memorize procedures from context and rely on shallow heuristics to solve MWPs. In this paper, we look at this issue and argue that the cause is a lack of overall understanding of MWP patterns. We first investigate how a neural network understands patterns only from semantics, and observe that, if the prototype equations like n1 + n2 are the same, most probl…
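To make the pattern-seeking idea concrete, here is a minimal sketch of a supervised contrastive loss that pulls together encodings of problems sharing a prototype equation (e.g. n1 + n2) and pushes apart the rest. This is not the paper's implementation; the encoder choice, loss form, and all names (prototype_contrastive_loss, prototype_ids, temperature) are illustrative assumptions in the spirit of the approach the abstract and the citation statements below describe.

import torch
import torch.nn.functional as F

def prototype_contrastive_loss(embeddings, prototype_ids, temperature=0.1):
    # embeddings: (N, d) problem encodings from any sentence encoder
    # (the paper uses BERT); prototype_ids: (N,) integer labels, one per
    # prototype equation (e.g. 0 -> "n1 + n2", 1 -> "n1 * n2").
    z = F.normalize(embeddings, dim=1)           # unit-norm rows
    sim = z @ z.t() / temperature                # scaled cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))   # exclude self-pairs
    # Positives: distinct problems that share a prototype equation.
    pos_mask = (prototype_ids.unsqueeze(0) == prototype_ids.unsqueeze(1)) & ~self_mask
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0                     # skip anchors with no positive
    mean_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)[has_pos] / pos_counts[has_pos]
    return -mean_pos.mean()

# Toy usage: four problems, two prototype equations.
emb = torch.randn(4, 768)
labels = torch.tensor([0, 0, 1, 1])
print(prototype_contrastive_loss(emb, labels))

In training, a loss of this kind would typically be combined with the usual equation-generation objective, so the encoder both clusters problems by pattern and still learns to solve them.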

Cited by 3 publications (2 citation statements)
References 15 publications
“…TAGOP [224], MT2Net [214], and DeductReasoner [74] utilize BERT or RoBERTa to extract the fundamental arithmetic relationships between quantities, enabling mathematical reasoning and operations. BERT-TD [94] utilizes semantic encoding and contrastive learning to cluster problems with similar prototype equations, thereby enhancing the understanding of MWP patterns.…”
Section: Non-autoregression LMs (mentioning)
confidence: 99%
“…Besides the model architectures, there are also other interesting explorations, such as knowledge distillation (Zhang et al 2020a), situation model (Hong et al 2021b), syntax-semantics model (Lyu and Yu 2021), auxiliary training tasks (Qin et al 2021; Piekos, Michalewski, and Malinowski 2021; Liang and Zhang 2021), explicit value encoding (Wu et al 2021) and transfer learning (Alghamdi, Liang, and Zhang 2022). Recently, pre-trained language models (Yu et al 2021; Huang et al 2021; Shen et al 2021; Li et al 2021; Lan et al 2022) are widely applied to encode MWPs and have become the strongest baselines in terms of MWP solving accuracy. There are some other works (Ran et al 2019; Andor et al 2019; Chen et al 2020) considering the weak supervision environment in numerical understanding; however, the solution diversity in MWP is unique and under-explored.…”
Section: Related Work: Math Word Problem Solving (mentioning)
confidence: 99%