Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
Preprint (2021). DOI: 10.48550/arxiv.2102.08036

Cited by 18 publications (19 citation statements). References 0 publications.
“…This allows BERT to take the context of each word into account. Similar in its construction, XLNet [32] improved its masking mechanism with peculiar assumptions during its pre-training stage, and improved over the work done by BERT. Despite these advances, studies [8] have shown that even these approaches have still struggled with negation.…”
Section: Word Negation and Sequence Labeling
Confidence: 99%
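The passage above refers to BERT's bidirectional masked-language-model objective and to negation failures reported in follow-up studies. The snippet below is a minimal illustrative sketch of how such behaviour can be probed; it assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint, and the example sentences are hypothetical rather than taken from the cited work.

```python
# Probe BERT's masked-language-model head on a plain vs. negated sentence.
# Assumes: Hugging Face `transformers` installed, `bert-base-uncased` available.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT conditions on tokens both left and right of [MASK] (bidirectional context).
for sentence in [
    "The service was [MASK], so we left a big tip.",
    "The service was not [MASK], so we left a big tip.",
]:
    print(sentence)
    for p in fill_mask(sentence, top_k=3):
        print(f"  {p['token_str']:>10s}  score={p['score']:.3f}")
    # If both variants yield similar fillers, the model is effectively ignoring
    # "not" -- the kind of negation failure the cited studies describe.
```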
“…The growing complexity of text data has made NLP applications increasingly vital in analyzing large volumes of data [10,14]. NLP solutions are one of the backbones of some intelligent models like BERT, XLNet, and GPT, which have advanced the cause of sentiment analysis [21,32], and machine language translation [27]. NLP-enabled models propel the understanding of language structure and interpretation.…”
Section: Natural Language Processing (NLP)
Confidence: 99%
“…The pre-trained language models (PrLM) [13] have reached remarkable achievements in learning universal natural language representations by pre-training large language models on massive general corpus and fine-tuning them on downstream tasks. BERT [14], which is derived from the Transformer's encoder, is the most representative among PrLMs [15], the multi-head self-attention in the Transformer is a vital mechanism, it is essentially a variant of the graph attention network [16] (GAT).…”
Section: Related Work — Pre-trained Language Models
Confidence: 99%
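The statement above relates Transformer self-attention to graph attention: every token attends to every other token, i.e. attention over a fully connected token graph. The following is a minimal single-head sketch of scaled dot-product self-attention; shapes and variable names are illustrative only, and a full multi-head layer would run several such heads in parallel and concatenate their outputs.

```python
# Minimal single-head scaled dot-product self-attention (illustrative only).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_head = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_head)               # pairwise token-token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # context-mixed token representations

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 16, 8
X = rng.normal(size=(seq_len, d_model))
out = self_attention(X, *(rng.normal(size=(d_model, d_head)) for _ in range(3)))
print(out.shape)  # (5, 8)
```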
“…Transformer (Vaswani et al, 2017), an alternative to convolutional neural networks, has dominated the field of natural language processing (NLP), including speech recognition (Dong et al, 2018), synthesis (Li et al, 2019b), text to speech translation (Vila et al, 2018), and natural language generation (Topal et al, 2021). As a example of deep learning architectures, Transformer was first introduced to handle sequential inference tasks in NLP.…”
Section: Introduction
Confidence: 99%