2020
DOI: 10.15622/ia.2020.19.6.5

Vietnamese Text Classification Algorithm using Long Short Term Memory and Word2Vec

Abstract: In the context of the ongoing fourth industrial revolution and the rapid development of computer science, the amount of textual information has become huge. Before applying seemingly appropriate methodologies and techniques to process this data, its nature and characteristics should be thoroughly analyzed and understood. Automatic text processing incorporated into existing systems may facilitate many procedures. Text classification is one of the basic applications of natural language pr…

Cited by 7 publications (2 citation statements)
References: 22 publications
“…LSTM with the Word2Vec model achieves an F1-score of 98.03% for word segmentation in the Arabic language (Almuhareb et al 2019). Neural network-based word embedding efficiently models a word and its context and has become one of the most widely used methods of word distribution representation (N.H. Phat and Anh 2020) (Alharthi et al 2021).…”
Section: Review On Text Analytics Word Embedding Application and Deep... (mentioning)
confidence: 99%
“… Malla and Alphonse (2021): Twitter tweet analysis for the disease information collection; COVID-19 labeled English dataset from Twitter; majority voting based ensemble deep learning model; RoBERT, BERTweet, CT-BERT; RoBERT achieves an accuracy of 90.30%.
38. Phat and Anh (2020): Vietnamese text classification; Vietnamese news articles; LSTM, CNN, SVM, NB; Word2Vec; LSTM + Word2Vec achieves an F1-score of 95.74%.
39. Grzeça et al (2020): Social networking site tweets analysis for identification of alcohol-related tweets; datasets DS1-Q1, Q2, Q3; SVM, XGBoost, CNN, BiLSTM; DSWE (Drink2Vec), BERT; CNN + Drink2Vec achieves an F1-score of 94.45%.
SANAD Single-label Arabic news articles datasets, NADiA News articles datasets in Arabic with multi-labels, HAN Hierarchical attention network, HDBSCAN Hierarchical Density-Based Spatial Clustering of Applications with Noise, LDA Logistic regression, linear discriminant analysis, QDA Quadratic discriminant analysis, NB Naïve Bayes, SVM Support vector machine, KNN k-nearest neighbor, DT Decision tree, RF Random forest, XGBoost MLP Multilayer perceptron, LIWC Linguistic Inquiry and Word Count features, NER Named entity recognition, PMMC Process model matching contest dataset, DLMF Digital Library of Mathematical Functions, GB Gradient Boosting, SGC Stochastic Gradient Descent, HAN Hierarchical attention network, DFFNN Deep feed-forward neural network.…”
Section: Appendix A (mentioning)
confidence: 99%