2021
DOI: 10.48550/arxiv.2112.09174
Preprint

Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations

Abstract: Long Short-Term Memory (LSTM) and Transformers are two popular neural architectures used for natural language processing tasks. Theoretical results show that both are Turing-complete and can represent any context-free language (CFL). In practice, however, Transformer models are often observed to have greater representational power than LSTMs, and the reasons for this are barely understood. We study these practical differences between the LSTM and the Transformer and propose an explanation based on their latent space decomposition patterns.…
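
To make the comparison concrete, below is a minimal, illustrative sketch of the kind of experiment the abstract alludes to: training an LSTM and a Transformer encoder to recognize a bounded context-free language (here Dyck-1 with a capped nesting depth). It assumes PyTorch; the data generator, model sizes, and training loop are illustrative assumptions, not the paper's actual protocol or its latent-space decomposition analysis.

```python
# Illustrative sketch (assumed setup, not the paper's): compare an LSTM and a
# Transformer encoder on recognizing a bounded CFL -- Dyck-1 with depth <= MAX_DEPTH.
import random
import torch
import torch.nn as nn

MAX_DEPTH, SEQ_LEN, VOCAB = 4, 20, 2   # tokens: 0 = '(', 1 = ')'

def balanced_string():
    # Random walk over bracket depth, constrained to [0, MAX_DEPTH] and forced
    # back to depth 0 by the end, so the result is a bounded Dyck-1 string.
    toks, depth = [], 0
    for i in range(SEQ_LEN):
        remaining = SEQ_LEN - i
        if depth == 0:
            tok = 0
        elif depth == remaining or depth >= MAX_DEPTH:
            tok = 1
        else:
            tok = random.choice([0, 1])
        toks.append(tok)
        depth += 1 if tok == 0 else -1
    return toks

def is_bounded_dyck(toks):
    # Label: 1 iff the string is balanced and never exceeds the depth bound.
    depth = 0
    for t in toks:
        depth += 1 if t == 0 else -1
        if depth < 0 or depth > MAX_DEPTH:
            return False
    return depth == 0

def make_batch(n=64):
    xs, ys = [], []
    for _ in range(n):
        toks = balanced_string() if random.random() < 0.5 else \
               [random.choice([0, 1]) for _ in range(SEQ_LEN)]
        xs.append(toks)
        ys.append(int(is_bounded_dyck(toks)))
    return torch.tensor(xs), torch.tensor(ys)

class LSTMClassifier(nn.Module):
    def __init__(self, d=32):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        self.lstm = nn.LSTM(d, d, batch_first=True)
        self.head = nn.Linear(d, 2)
    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.head(h[:, -1])          # classify from the final hidden state

class TransformerClassifier(nn.Module):
    def __init__(self, d=32):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        self.pos = nn.Parameter(torch.randn(SEQ_LEN, d) * 0.02)  # learned positions
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4,
                                           dim_feedforward=64, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, 2)
    def forward(self, x):
        h = self.enc(self.emb(x) + self.pos)
        return self.head(h.mean(dim=1))      # mean-pool over positions

for model in (LSTMClassifier(), TransformerClassifier()):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):
        x, y = make_batch()
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    model.eval()
    with torch.no_grad():
        x, y = make_batch(512)
        acc = (model(x).argmax(-1) == y).float().mean().item()
    print(type(model).__name__, "held-out accuracy:", round(acc, 3))
```

Capping the nesting depth at MAX_DEPTH makes the language a bounded CFL, and comparing the two architectures' held-out accuracy on such a task is one simple proxy for the practical representation gap the abstract refers to.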

Cited by 0 publications
References 18 publications