A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models

Stolfo, Alessandro; Jin, Zhijing; Shridhar, Kumar; Schoelkopf, Bernhard; Sachan, Mrinmaya

doi:10.18653/v1/2023.acl-long.32

Cited by 5 publications

(1 citation statement)

References 0 publications

“…A growing body of work has proposed methods to analyze the performance and robustness of large LMs on tasks involving mathematical reasoning (Pal and Baral, 2021; Piękos et al., 2021; Razeghi et al., 2022; Cobbe et al., 2021; Mishra et al., 2022). In this area, Stolfo et al. (2023) use a causally-grounded approach to quantify the robustness of large LMs. However, the proposed formulation is limited to behavioral investigation with no insights into the models' inner mechanisms.…”

Section: Related Work (mentioning)

confidence: 99%

A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

Stolfo, Belinkov, Sachan

2023

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture. In order to improve our understanding of this aspect of language models, we present a mechanistic interpretation of Transformer-based LMs on arithmetic questions using a causal mediation analysis framework. By intervening on the activations of specific model components and measuring the resulting changes in predicted probabilities, we identify the subset of parameters responsible for specific predictions. This provides insights into how information related to arithmetic is processed by LMs. Our experimental results indicate that LMs process the input by transmitting the information relevant to the query from mid-sequence early layers to the final token using the attention mechanism. Then, this information is processed by a set of MLP modules, which generate result-related information that is incorporated into the residual stream. To assess the specificity of the observed activation dynamics, we compare the effects of different model components on arithmetic queries with other tasks, including number retrieval from prompts and factual knowledge questions.
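The intervention described in the abstract above (replace one component's activation, then measure how the predicted probability changes) can be illustrated on a toy model. The following is a minimal sketch, not the authors' implementation: the two-component residual network, the names `forward`, `indirect_effect`, `comp_a`, `comp_b`, and `unembed`, and the random weights are all hypothetical and chosen only to show the mechanics.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def forward(x, params, patch=None):
    """Toy residual network (hypothetical): two components add their output
    to a residual stream. `patch` maps a component name to an activation
    that overrides the one computed in this run."""
    patch = patch or {}
    acts = {}
    resid = x.copy()
    for name in ("comp_a", "comp_b"):
        out = np.tanh(params[name] @ resid)
        out = patch.get(name, out)   # intervene on this component if requested
        acts[name] = out
        resid = resid + out          # later components see the patched stream
    probs = softmax(params["unembed"] @ resid)
    return probs, acts

def indirect_effect(x_base, x_alt, params, component, target):
    """Change in P(target) on the base input when `component` is given
    the activation it had on the alternative input."""
    p_base, _ = forward(x_base, params)
    _, acts_alt = forward(x_alt, params)
    p_patched, _ = forward(x_base, params, patch={component: acts_alt[component]})
    return p_patched[target] - p_base[target]

# Random weights and inputs stand in for a trained model and real prompts.
rng = np.random.default_rng(0)
d, vocab = 4, 3
params = {
    "comp_a": rng.normal(size=(d, d)),
    "comp_b": rng.normal(size=(d, d)),
    "unembed": rng.normal(size=(vocab, d)),
}
x_base, x_alt = rng.normal(size=d), rng.normal(size=d)
effects = {c: indirect_effect(x_base, x_alt, params, c, target=0)
           for c in ("comp_a", "comp_b")}
print(effects)
```

Components with a larger absolute effect mediate more of the change in the prediction in this toy setting. Patching a component with its own activation from the same input leaves the output unchanged, which is a useful sanity check.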

Pre‐trained language models: What do they know?

Guimarães, Campos, Jorge

2023

WIREs Data Min & Knowl

Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications in the last few years. They are currently able to achieve high effectiveness in different natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, or text summarization. Recently, significant attention has been drawn to OpenAI's GPT models' capabilities and extremely accessible interface. LLMs are nowadays routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf without further training. In this paper, we aim to study the behavior of pre‐trained language models (PLMs) in some inference tasks they were not initially trained for. Therefore, we focus our attention on very recent research works related to the inference capabilities of PLMs in some selected tasks such as factual probing and common‐sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations that open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining Technologies > Artificial Intelligence

Counterfactual Thinking for Machines

Vallverdú

2024

Causality for Artificial Intelligence
