A Literature Review on Bidirectional Encoder Representations from Transformers

Shreyashree, S.; Sunagar, Pramod; Rajarajeswari, S.; Kanavalli, Anita

doi:10.1007/978-981-16-6723-7_23

Cited by 12 publications

(6 citation statements)

References 7 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…As another big trend, techniques that have been successfully used in natural language processing have been introduced. One of them is the use of (multi-head) self-attention mechanism, which was first introduced in Transformer (reviewed in Shreyashree et al, 2022 ). Both Jiang et al and Cong et al report the improvement of prediction performance with the use of the self-attention mechanism ( Jiang, Wang, Yao, et al, 2021 ; Cong et al, 2022 ).…”

Section: Deep Learning and Language Model-based Methodsmentioning

confidence: 99%

Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics

Nakai

Wei

2022

Front. Bioinform.

View full text Add to dashboard Cite

Prediction of subcellular localization of proteins from their amino acid sequences has a long history in bioinformatics and is still actively developing, incorporating the latest advances in machine learning and proteomics. Notably, deep learning-based methods for natural language processing have made great contributions. Here, we review recent advances in the field as well as its related fields, such as subcellular proteomics and the prediction/recognition of subcellular localization from image data.

show abstract

Section: Deep Learning and Language Model-based Methodsmentioning

confidence: 99%

Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics

Nakai

Wei

2022

Front. Bioinform.

View full text Add to dashboard Cite

show abstract

“…These and many other pre-trained models have one disadvantage in common, which is the unidirectional that restricts the power of those models. This led to the discovery of a specific type of transformer networks [34], BERT [35][36][37]. It is a pre-trained model that considers the left and right context of a word in all layers, while generating embeddings.…”

Section: A Architecture Of Proposed and Related Modelsmentioning

confidence: 99%

BERT-Based Hybrid RNN Model for Multi-class Text Classification to Study the Effect of Pre-trained Word Embeddings

Shreyashree¹,

Sunagar²,

Rajarajeswari³

et al. 2022

IJACSA

Self Cite

View full text Add to dashboard Cite

Due to the Covid-19 pandemic which started in the year 2020, many nations had imposed lockdown to curb the spread of this virus. People have been sharing their experiences and perspectives on social media on the lockdown situation. This has given rise to increased number of tweets or posts on social media. Multi-class text classification, a method of classifying a text into one of the pre-defined categories, is one of the effective ways to analyze such data that is implemented in this paper. A Covid-19 dataset is used in this work consisting of fifteen predefined categories. This paper presents a multi-layered hybrid model, LSTM followed by GRU, to integrate the benefits of both the techniques. The advantages of word embeddings techniques like GloVe and BERT have been implemented and found that, for three epochs, the transfer learning based pre-trained BERThybrid model performs one percent better than GloVe-hybrid model but the state-of-the-art, fine-tuned BERT-base model outperforms the BERT-hybrid model by three percent, in terms of validation loss. It is expected that, over a larger number of epochs, the hybrid model might outperform the fine-tuned model.

show abstract

“…Shreyashree et al [17] presented "transfer learning" which is a method of creating a model for a specific problem and then utilizing it to create a model for a different problem. It has been proven to be quite successful.…”

Section: Introductionmentioning

confidence: 99%

Exploratory analysis on the natural language processing models for task specific purposes

Shidaganti,

Shetty,

Edara

et al. 2024

Bulletin EEI

View full text Add to dashboard Cite

Natural language processing (NLP) is a technology that has become widespread in the area of human language understanding and analysis. A range of text processing tasks such as summarisation, semantic analysis, classification, question-answering, and natural language inference are commonly performed using it. The dilemma of picking a model to help us in our task is still there. It’s becoming an impediment. This is where we are trying to determine which modern NLP models are better suited for the tasks set out above in order to compare them with datasets like SQuAD and GLUE. For comparison, BERT, RoBERTa, distilBERT, BART, ALBERT, and text-to-text transfer transformer (T5) models have been used in this study. The aim is to understand the underlying architecture, its effects on the use case and also to understand where it falls short. Thus, we were able to observe that RoBERTa was more effective against the models ALBERT, distilBERT, and BERT in terms of tasks related to semantic analysis, natural language inference, and question-answering. The reason is due to the dynamic masking present in RoBERTa. For summarisation, even though BART and T5 models have very similar architecture the BART model has performed slightly better than the T5 model.

show abstract

A Literature Review on Bidirectional Encoder Representations from Transformers

Cited by 12 publications

References 7 publications

Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics

Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics

BERT-Based Hybrid RNN Model for Multi-class Text Classification to Study the Effect of Pre-trained Word Embeddings

Exploratory analysis on the natural language processing models for task specific purposes

Contact Info

Product

Resources

About