2021
DOI: 10.20944/preprints202107.0252.v1
Preprint

A Review of Recurrent Neural Network Architecture for Sequence Learning: Comparison between LSTM and GRU

Abstract: Deep neural networks (DNNs) have made a huge impact in the field of machine learning by providing state-of-the-art, human-like performance on real-world problems such as image processing and natural language processing (NLP). The convolutional neural network (CNN) and the recurrent neural network (RNN) are two typical architectures widely used to solve such problems. Time-sequence-dependent problems are generally very challenging, and RNN architectures have made an enormous improvement in a wide range of machine…

Cited by 33 publications (13 citation statements)
References 19 publications
“…A gated recurrent unit (GRU) is an RNN variant that was originally designed to solve the problem of disappearing gradients in standard RNNs [ 15 ]. The structure is shown in Figure 2 .…”
Section: Methods
Mentioning confidence: 99%
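For context on this statement, the commonly cited GRU formulation is sketched below in generic notation (the weight matrices W, U and biases b are our labels, not necessarily those of reference [15], and the update-gate convention varies between papers). The final line's interpolation between the previous state and the candidate state is what helps gradients flow across long sequences.

```latex
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z)                    && \text{(update gate)}\\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r)                    && \text{(reset gate)}\\
\tilde{h}_t &= \tanh\!\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big) && \text{(candidate state)}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t        && \text{(new hidden state)}
\end{aligned}
```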
“…The gated recurrent unit (GRU) was introduced in 2014 as a solution both to the LSTM's complexity and to the vanishing gradient problem [41,42]. Moreover, by implementing gating mechanisms within their networks, GRU and LSTM can capture and propagate information over long sequences.…”
Section: Gated Neural Network (GRU)
Mentioning confidence: 99%
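A minimal sketch of the complexity point, assuming PyTorch (neither the library nor the layer sizes come from the cited works): a GRU layer of the same width carries roughly three quarters of an LSTM layer's parameters, because it has one fewer gate and no separate cell state, while both consume sequences in the same gated fashion.

```python
# Illustrative comparison of GRU and LSTM layers of equal width (PyTorch assumed).
import torch
import torch.nn as nn

input_size, hidden_size = 64, 128
gru = nn.GRU(input_size, hidden_size, batch_first=True)
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print("GRU params: ", n_params(gru))    # 3 gates' worth of weights
print("LSTM params:", n_params(lstm))   # 4 gates' worth of weights

# Both process a long sequence the same way: (batch, time, features) in,
# gated hidden states out.
x = torch.randn(8, 500, input_size)      # toy batch of long sequences
gru_out, h_n = gru(x)                    # GRU returns only a hidden state
lstm_out, (h_last, c_last) = lstm(x)     # LSTM also returns a cell state
```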
“…The Bidirectional Encoder Representations from Transformers (BERT) model has been under development since Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova first published "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" at Google in 2018 [42]. The model is an embedding layer of pre-trained bidirectional representations learned from large unsupervised text corpora, including Wikipedia and BookCorpus [42]. As shown in Figure 4, the first step in building the BERT-based embedding method is to import the required HuggingFace library and define the pre-trained BERT model to be used. In our case we took two considerations into account: first, the majority of the resumes are written in French, and second, the average length of the sequences.…”
Section: The Bidirectional Encoder Representations From Transformers ...
Mentioning confidence: 99%
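A minimal sketch of such an embedding step with the HuggingFace transformers library; the checkpoint name (camembert-base, a French model), the max_length value, and the mean-pooling step are illustrative assumptions, not the cited paper's exact configuration.

```python
# Sketch: obtain fixed-size BERT-style embeddings for French resume text.
import torch
from transformers import AutoTokenizer, AutoModel

checkpoint = "camembert-base"                 # assumed French-capable model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

texts = ["Ingénieur logiciel avec 5 ans d'expérience en NLP."]
inputs = tokenizer(texts,
                   padding=True,
                   truncation=True,
                   max_length=256,            # chosen from the corpus' average length
                   return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Token-level embeddings: (batch, seq_len, hidden_size)
token_embeddings = outputs.last_hidden_state
# One fixed-size vector per document via mean pooling over real (non-pad) tokens
mask = inputs["attention_mask"].unsqueeze(-1)
sentence_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1)
```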
“…The shortcomings of such a method are the short-memory and vanishing-gradient problems [35][36][37]. Moreover, several types of recurrent neural networks have shown success with seizure prediction, such as the bidirectional long short-term memory (Bi-LSTM) network, which addresses the short-memory problem by retaining the necessary sequence information and discarding unneeded data [38][39][40]. Additionally, raw EEG signals can be converted into images and fed to CNNs, which act as classifiers [41,42]; this method is closest to the practice of medical practitioners, where visual features of seizures can be extracted using various image classifiers such as ImageNet [43] and DenseNet [44].…”
Section: Related Work
Mentioning confidence: 99%
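As an illustration of the Bi-LSTM approach described in this statement (a sketch with assumed shapes and layer sizes, not the cited papers' architectures), a bidirectional LSTM that maps a window of multi-channel EEG samples to a seizure/no-seizure logit might look like this:

```python
# Sketch of a Bi-LSTM seizure classifier over raw EEG windows (PyTorch assumed).
import torch
import torch.nn as nn

class BiLSTMSeizureClassifier(nn.Module):
    def __init__(self, n_channels=23, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels,
                            hidden_size=hidden_size,
                            num_layers=2,
                            batch_first=True,
                            bidirectional=True)        # forward + backward passes
        self.head = nn.Linear(2 * hidden_size, 1)      # 2x for both directions

    def forward(self, x):                              # x: (batch, time, channels)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])                # logit from the last time step

model = BiLSTMSeizureClassifier()
window = torch.randn(4, 1024, 23)                      # 4 windows of 1024 EEG samples
logits = model(window)                                 # (4, 1) seizure logits
```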