2019 International Conference on Asian Language Processing (IALP)
DOI: 10.1109/ialp48816.2019.9037668

Using Convolutional Neural Network with BERT for Intent Determination

Cited by 23 publications (15 citation statements)
References 12 publications
“…For some NLP applications, however, a language model by itself is not sufficient to accomplish a given downstream task, and it becomes necessary to extend the language model's overall architecture by stacking another form of neural network on top of it, for example a convolutional neural network for classification tasks. For such application scenarios, the combination of the BERT language model with deep learning models such as recurrent neural networks or convolutional neural networks was shown in recent studies to be effective at capturing meaningful features from the available data [8, 38-40]. We utilize a similar approach for our ClaimsBERT classifier.…”
Section: Related Work
confidence: 99%
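
The BERT-plus-CNN stacking described in the excerpt above can be sketched concretely: BERT's final-layer token embeddings pass through parallel 1-D convolutions and max-pooling before a linear classifier. The following PyTorch / Hugging Face sketch only illustrates that architecture; the checkpoint name, filter sizes, filter count, and class count are assumptions, not the configuration used in the cited paper or in ClaimsBERT.

```python
# Illustrative sketch: a CNN stacked on BERT token embeddings for text
# classification. All hyperparameters here are assumed, not taken from
# the cited paper.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertCnnClassifier(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_classes=2,
                 filter_sizes=(2, 3, 4), num_filters=64):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        # One 1-D convolution per n-gram width over the token dimension.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, num_filters, kernel_size=k) for k in filter_sizes
        )
        self.classifier = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden): contextual token embeddings from BERT.
        tokens = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        x = tokens.transpose(1, 2)                     # (batch, hidden, seq_len)
        # Convolve, then max-pool over time: each filter keeps its single
        # strongest response, yielding a fixed-size feature vector.
        feats = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(feats, dim=1))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertCnnClassifier()
batch = tokenizer(["set an alarm for seven am"], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])  # (1, num_classes)
```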
“…There has been a large body of research on conventional neural network methods over the last few decades (Xu and Sarikaya 2013; Liu and Lane 2016; Haihong et al. 2019; Wang et al. 2020a; Gerz et al. 2021). In recent years, with the rapid development of computing power, pre-trained models such as BERT (Devlin et al. 2018) have been frequently employed for intent detection (Castellucci et al. 2019; He et al. 2019; Zhang, Zhang, and Chen 2019; Athiwaratkun et al. 2020; Gong et al. 2021).…”
Section: Related Work
confidence: 99%
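
For reference, the plain fine-tuned-BERT intent detector that the works cited in this excerpt build on reduces to BERT's pooled sentence representation feeding a linear softmax head, which transformers exposes directly through its sequence-classification wrapper. A minimal sketch follows; the checkpoint and intent labels are hypothetical, and the classification head is freshly initialized rather than trained.

```python
# Minimal sketch of BERT-based intent detection via a sequence-classification
# head. The label set is hypothetical; the head is untrained, so outputs are
# meaningless until fine-tuned on intent-labeled utterances.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

intents = ["play_music", "set_alarm", "get_weather"]   # assumed label set
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(intents)
)

batch = tokenizer("will it rain tomorrow in Boston", return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits                     # (1, num_labels)
print(intents[logits.argmax(dim=-1).item()])
```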
“…CNN is used instead of other typical deep neural networks such as LSTM [32], Bi-LSTM [33], and GRU [34] since it is currently the most successful model for addressing short text classification tasks [35]. The convolution and pooling techniques of CNN aid in the extraction of the main concepts and keywords of the text as features, resulting in a significant improvement in the performance of the classification model.…”
Section: Existing HSD Models
confidence: 99%
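
The "main concepts and keywords" effect described in this excerpt is the standard TextCNN mechanism: each convolutional filter responds to one n-gram pattern, and max-pooling over time keeps only that filter's strongest match in the sentence. A minimal sketch follows; the vocabulary size, embedding dimension, filter widths, and class count are all assumed for illustration.

```python
# Minimal TextCNN sketch for short-text classification. Convolutions scan
# n-grams; max-pooling keeps each filter's strongest activation, i.e. the
# most salient keyword/phrase evidence. All sizes here are assumptions.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, num_classes=2,
                 filter_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, kernel_size=k)
            for k in filter_sizes
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        # Max over time discards position and keeps per-filter peak response.
        feats = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(feats, dim=1))

model = TextCNN()
dummy = torch.randint(0, 10000, (4, 20))   # 4 short texts, 20 token ids each
logits = model(dummy)                      # (4, num_classes)
```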