2023
DOI: 10.3390/math11112451

A Mathematical Interpretation of Autoregressive Generative Pre-Trained Transformer and Self-Supervised Learning

Abstract: In this paper, we present a rigorous mathematical examination of generative pre-trained transformer (GPT) models and their autoregressive self-supervised learning mechanisms. We begin by defining natural language space and knowledge space, which are two key concepts for understanding the dimensionality reduction process in GPT-based large language models (LLMs). By exploring projection functions and their inverses, we establish a framework for analyzing the language generation capabilities of these models. We …
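A worked sketch may make the abstract's framing concrete: if \mathcal{L} denotes the natural language space and \mathcal{K} the lower-dimensional knowledge space, with p a projection and p^{-1} its approximate inverse, then generation can be read as mapping into \mathcal{K} and back while the token sequence is factorized autoregressively. The symbols below are illustrative assumptions, not the paper's own notation; only the autoregressive factorization is the standard GPT formulation.

p : \mathcal{L} \to \mathcal{K}, \qquad \dim(\mathcal{K}) \ll \dim(\mathcal{L})
p^{-1} : \mathcal{K} \to \mathcal{L}, \qquad p^{-1}(p(x)) \approx x
P(x_1, \ldots, x_T) = \prod_{t=1}^{T} P\big(x_t \mid x_1, \ldots, x_{t-1}\big)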

Cited by 19 publications (6 citation statements)
References 24 publications
“…The emergence of deep learning, a subset of machine learning, has been particularly transformative in this regard. In the domain of deep learning, a network learns from the data by adjusting its internal parameters [91–94]. In this way, the model is able to perform complex tasks such as complex games with reinforcement learning [95–97] and, as we have explored in this review, gene expression data analysis.…”
Section: Discussion
confidence: 99%
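As a concrete illustration of "adjusting its internal parameters", the following minimal Python sketch performs gradient-descent updates on a toy one-parameter model; the data, learning rate, and loss are invented for illustration and come from none of the cited works.

import numpy as np

# Toy data: the model should learn y = 2x (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x

w = 0.0    # the internal parameter, initialized arbitrarily
lr = 0.1   # learning rate

for step in range(100):
    y_hat = w * x                          # forward pass
    grad = np.mean(2 * (y_hat - y) * x)    # gradient of the mean squared error w.r.t. w
    w -= lr * grad                         # adjust the internal parameter

print(round(w, 3))  # approaches 2.0 as the loss decreases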
“…The GPT model is based on the Transformer architecture, which involves several key components, like Input Embedding and Positional Encoding, Transformer Blocks, Feed-Forward Neural Network, Normalization and Residual Connections, and Output layer [71].…”
Section: Methods
confidence: 99%
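A compact PyTorch sketch of the components that statement lists (token embedding plus positional encoding, a transformer block with multi-head attention, a feed-forward network, layer normalization with residual connections, and an output layer) is given below. The class names, hyperparameters, and single-block depth are illustrative assumptions, not the configuration of any model in the cited study.

import torch
import torch.nn as nn

class TinyGPTBlock(nn.Module):
    """One decoder-style block: attention and feed-forward, each with a residual connection and layer norm."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Causal mask: each position may not attend to future tokens.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        a, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.norm1(x + a)             # attention sub-layer + residual + normalization
        x = self.norm2(x + self.ff(x))    # feed-forward sub-layer + residual + normalization
        return x

class TinyGPT(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64, max_len=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)   # input embedding
        self.pos = nn.Embedding(max_len, d_model)      # learned positional encoding
        self.block = TinyGPTBlock(d_model)             # a single transformer block
        self.out = nn.Linear(d_model, vocab_size)      # output layer: logits over the vocabulary

    def forward(self, ids):
        positions = torch.arange(ids.size(1), device=ids.device)
        x = self.tok(ids) + self.pos(positions)
        return self.out(self.block(x))

A real GPT stacks many such blocks and uses a much larger vocabulary and model width; the structure of each block, however, follows the same pattern.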
“…This mathematical framework enables GPT to capture complex patterns and relationships in sequential data [71] and is used in this study to generate synthetic patient discharge messages and even perform analysis on those discharge messages for assessing severity and chances of hospital readmission.…”
Section: Output Layer
confidence: 99%
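The output layer feeds autoregressive decoding: at each step the logits over the vocabulary are turned into a prediction, and the chosen token is appended to the context before the next step. A minimal greedy-decoding loop, reusing the assumed TinyGPT sketch above and not the generation procedure of the cited study, might look like:

import torch

def generate(model, ids, steps=20):
    """Greedy autoregressive decoding: repeatedly predict and append the most likely next token."""
    model.eval()
    with torch.no_grad():
        for _ in range(steps):
            logits = model(ids)                                       # (batch, seq_len, vocab_size)
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)   # most likely next token
            ids = torch.cat([ids, next_id], dim=1)                    # extend the context
    return ids

# Example (untrained weights, so the output is arbitrary):
# model = TinyGPT()
# print(generate(model, torch.zeros(1, 1, dtype=torch.long)))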
“…On the same path, Jiang et al. [19] investigated the ability of GPTs to express personality traits and gender differences. Additionally, there are studies that discuss the potential implications of GPTs in intellectual property and plagiarism [20], as well as the limitations and challenges of GPT models and their learning mechanisms [21]. Other studies focused on the use of advanced techniques in art conservation [22], on-site interpretation and presentation planning for cultural heritage sites [23], and the development of a thesaurus in an educational web platform on optical and laser-based investigation methods for cultural heritage analysis and diagnosis [24].…”
Section: Related Work
confidence: 99%