2023
DOI: 10.48550/arxiv.2303.05398
Preprint

MathPrompter: Mathematical Reasoning using Large Language Models

Abstract: Large Language Models (LLMs) have limited performance when solving arithmetic reasoning tasks and often provide incorrect answers. Unlike natural language understanding, math problems typically have a single correct answer, making the task of generating accurate solutions more challenging for LLMs. To the best of our knowledge, no LLMs indicate their level of confidence in their responses, which fuels a trust deficit in these models and impedes their adoption. To address this deficiency, …

Cited by 16 publications (17 citation statements)
References 17 publications
“…This approach successfully solves, explains, and generates math problems at the university level. MathPrompter [66] employs a zero-shot chain-of-thought prompting technique to generate multiple algebraic expressions or Python functions that solve the same math problem in varied ways, increasing confidence in the output results. PAL [44] introduces an innovative approach to bolster the performance of pre-trained language models (PLMs) in mathematical problem-solving.…”
Section: Tool-based Methods (mentioning)
confidence: 99%
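To make the multi-path idea in the statement above concrete, here is a minimal runnable sketch of the consensus check, not the authors' code. The two solution forms (an algebraic expression and a Python function) are hard-coded stand-ins for what MathPrompter would obtain from zero-shot chain-of-thought prompts, and the variable names are purely illustrative.

    import random

    # Stand-ins for LLM outputs: in MathPrompter both forms are generated by
    # the model from the same templated question; here they are canned so the
    # cross-checking logic runs on its own.
    expr = "price * quantity - discount"
    code = "def solve(price, quantity, discount):\n    return price * quantity - discount"

    namespace = {}
    exec(code, namespace)            # safe only because this code is our own stub
    solve = namespace["solve"]

    def consensus(variables, n_checks=5):
        # Evaluate both solution forms on random variable assignments;
        # repeated agreement is the proxy for confidence in the answer.
        for _ in range(n_checks):
            sample = {k: random.randint(1, 100) for k in variables}
            if eval(expr, {}, sample) != solve(**sample):
                return None          # the two derivations disagree: abstain
        return solve(**variables)    # answer on the actual values

    print(consensus({"price": 12, "quantity": 5, "discount": 7}))   # -> 53

Agreement across several random draws is what lets this style of method attach a confidence signal to its final answer rather than returning a single unverified completion.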
“…Recent research has highlighted the growing capabilities of LLMs in the field of mathematical word problem solving, emphasizing the trend toward more nuanced and sophisticated AI-driven mathematical analysis. MathPrompter [66] uses the GPT-3 DaVinci LLM to solve MWPs with excellent results, demonstrating the potential of LLMs not only to explain but also to generate complex mathematical reasoning, reflecting a human-like understanding of complex problem sets.…”
Section: Math Problem Solving (mentioning)
confidence: 99%
“…Previous work, such as that of Frieder et al. (2023), has shown that advanced LLMs, specifically ChatGPT, tend to be highly inconsistent on mathematics tasks. Similarly, Imani et al. (2023) found that hallucinations tend to be amplified when models attempt mathematical reasoning. We believe that equations and puzzles are useful testing grounds because the quality of the model's answers can be objectively evaluated.…”
Section: Introduction (mentioning)
confidence: 94%
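The "objectively evaluated" point in the statement above is simple to illustrate: a math answer can be scored by exact comparison rather than by judgment. The sketch below is illustrative only; the problems and model answers are made up, not data from any cited evaluation.

    from fractions import Fraction

    # Toy gold answers and model outputs, purely for illustration.
    gold = {"2*x = 10, x = ?": Fraction(5), "1/2 + 1/3 = ?": Fraction(5, 6)}
    model_answers = {"2*x = 10, x = ?": "5", "1/2 + 1/3 = ?": "0.83"}

    def is_correct(predicted: str, expected: Fraction) -> bool:
        try:
            return Fraction(predicted) == expected   # exact arithmetic, no float tolerance
        except ValueError:
            return False                             # unparseable output counts as wrong

    score = sum(is_correct(model_answers[q], a) for q, a in gold.items()) / len(gold)
    print(f"accuracy: {score:.0%}")                  # -> accuracy: 50%

Here the rounded "0.83" is marked wrong under exact comparison, which is precisely the kind of unambiguous verdict that makes equations a useful testing ground.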
“…CoT summarization is related to several techniques that ask the LLM to outline its "thinking" before arriving at a final implementation (Wei et al. 2022; Jiang et al. 2023; Zheng et al. 2023). A number of recent works also use programs as prompts (i.e., a structured chain of thought) in an attempt to help LLMs perform mathematical reasoning (Gao et al. 2022; Imani, Du, and Shrivastava 2023). Related to our automated debugging, Xia and Zhang (2023a) consider a related paradigm, but where feedback comes from humans rather than automated checks.…”
Section: Introduction (mentioning)
confidence: 99%
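For the "programs as prompts" idea mentioned in this statement (a PAL-style structured chain of thought), a minimal sketch follows. It is not code from any cited paper: generate(prompt) is a hypothetical stand-in for a real LLM call, replaced here by a canned completion so the execution step is runnable.

    # Few-shot exemplar that shows the model a worked problem *as a program*.
    EXEMPLAR = '''Q: Roger has 5 balls and buys 2 cans of 3 balls each. How many balls now?
    # solution as a program
    initial = 5
    bought = 2 * 3
    answer = initial + bought
    '''

    def generate(prompt: str) -> str:
        # Hypothetical LLM call; a canned completion is returned so the
        # sketch runs end to end without a model.
        return "loaves_start = 200\nloaves_sold = 93 + 39\nanswer = loaves_start - loaves_sold"

    def pal_answer(question: str):
        program = generate(EXEMPLAR + f"Q: {question}\n# solution as a program\n")
        scope = {}
        exec(program, scope)   # run the generated program instead of trusting free-text reasoning
        return scope["answer"]

    print(pal_answer("A baker had 200 loaves, sold 93 in the morning and 39 in the afternoon. How many remain?"))
    # -> 68

The design point is that the arithmetic is delegated to the interpreter: the model only has to produce a correct program, and the executed result is the answer.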