A novel COVID‐19 sentiment analysis in Turkish based on the combination of convolutional neural network and bidirectional long‐short term memory on Twitter

Kabakuş, Abdullah Talha

doi:10.1002/cpe.6883

Cited by 10 publications

(11 citation statements)

References 44 publications


“…However, our study differentiates with using different algorithms and datasets with different topics. On the other hand, another study [ 28 ] has very high accuracy over 15k tweets and CNN and bidirectional LSTM, which used a lexicon to label big number of data. As for us, in our study, we did not use lexicons; however, we used manually labeled dataset and SentimentSet that is created in the scope of the study.…”

Section: Discussion (mentioning)

confidence: 99%

“…Aydogan and Kocaman [ 26 ] offered a new dataset since there are limited Turkish datasets to work on. Lately, some COVID-19 related studies [ 28 – 31 ] can be found in the literature.…”

Section: Literature Review (mentioning)

confidence: 99%

“…Furthermore, while SentimentSet has limited amount of data, these custom datasets may contain more data in the future with applying lexicon-based approach to prevent manually labelling. In addition to data and preprocessing, studies [ 28 , 46 ] that are conducted with LSTM and CNN have higher accuracy results; these techniques provide good results on sentiment analysis in English language and may also be used for Turkish language studies to get higher accuracy results.…”

Section: Limitations and Future Research (mentioning)

confidence: 99%


Sentimental Analysis of Twitter Users from Turkish Content with Natural Language Processing

Balli, Güzel, Bostancı et al. 2022

Computational Intelligence and Neuroscience


Artificial Intelligence has guided technological progress in recent years; it has shown significant development with increased academic studies on Machine Learning and the high demand for this field in the sector. In addition to the day-by-day advancement of technology, the pandemic, which has been a part of our lives since early 2020, has led to social media occupying a larger place in the lives of individuals. Therefore, social media posts have become an excellent data source for the field of sentiment analysis. The main contribution of this study is based on the Natural Language Processing method, which is one of the machine learning topics in the literature. Sentiment analysis classification is a solid example of a machine learning task that belongs to human-machine interaction. It is essential to make the computer understand people's emotional state with classifiers. There are a limited number of Turkish language studies in the literature. Turkish has different linguistic features from English. Since Turkish is an agglutinative language, it is challenging to perform sentiment analysis on it. This paper aims to evaluate several machine learning algorithms for sentiment analysis on Turkish language datasets collected from Twitter. In this research, besides using a public dataset that belongs to Beyaz (2021) to get more general results, another dataset is created to understand the impact of the pandemic on people and to learn about public opinions. Therefore, a custom dataset, namely SentimentSet (Balli 2021), was created, consisting of Turkish tweets that were filtered with words such as pandemic and corona and manually labeled as positive, negative, or neutral. Besides, SentimentSet could be used in future research as a benchmark dataset.
Results show classification accuracy of not only up to ∼87% with test data from both datasets and trained models, but also up to ∼84% with small “Sample Test Data” generated by the same methods as the SentimentSet dataset. These results help characterize Turkish-specific sentiment analysis, which depends on the characteristics of the language.


“…A novel sentiment analysis model based on the combination of convolutional neural network and bidirectional long short-term memory was proposed in this study. 29 proposed deep neural network model using 15,000 COVID-19 related Turkish tweets to classify into positive, negative, and neutral sentiment and obtained 97.9% accuracy. 30 identified trends, sentiment and emotions in nurses' COVID-19 related tweets from March to December 2020 using AFINN, Bing, and NRC lexicon and 31 also performed analysis on Turkish nurses tweets to identify public perspective at the time of COVID-19 in Turkey.…”

Section: Related Work (mentioning)

confidence: 99%

Reliable Knowledge Discovery from Social Media Data in Health Crisis in Developing Countries

Naseem, Hassan, Naseer et al. 2022

Preprint


Social media is a barometer to anticipate sentiment of the public about the state of affairs and ongoing pandemic engaged an additional user base who are confined to their stations. COVID-19 startled the world and the crisis exacerbates in the absence of sufficient data for policy making. The data from social media and a timely analysis can provide sufficient statistics for decision-making. This study explores Twitter data to discover knowledgeable statistics on public sentiments about COVID-19 vaccination in developing countries. The study inspects data collected from two extremely populated developing countries: India and Pakistan. Support Vector Machine (SVM) classifier achieves 74.3% accuracy on the manually labeled dataset. Furthermore, the sentiment analysis is correlated with other indigenous factors like regional literacy rate and COVID-19 calamities in the time interval. It is observed that the negative to positive sentiments correlates with a lower to higher regional literacy rate and a higher COVID-19 intensity causes positive sentiments towards vaccination. The correlations of results with indigenous factors may help to advocate the devised strategies to the right audience and social media knowledge discovery with machine learning techniques may help to recover from data scarcity challenges in a medical emergency like COVID-19 in developing countries.


“…Automatic text annotations can detect hate speech by applying machine learning methods with a semi-supervised learning approach [4,5]. Hate speech data are annotated using two categories (hate and not hate) [6][7][8][9][10], and using sentiment analysis methods, in which data are labeled using two or three categories, namely (positive and negative) [11,12], or (positive, negative, and neutral) [13][14][15][16]. We develop automatic annotations by utilizing a dataset with minimal labeled training data and incorporate self-learning for labels.…”

Section: Introduction (mentioning)

confidence: 99%

Automated Text Annotation Using a Semi-Supervised Approach with Meta Vectorizer and Machine Learning Algorithms for Hate Speech Detection

Saifullah, Dreżewski, Dwiyanto et al. 2024

Applied Sciences


Text annotation is an essential element of the natural language processing approaches. The manual annotation process performed by humans has various drawbacks, such as subjectivity, slowness, fatigue, and possibly carelessness. In addition, annotators may annotate ambiguous data. Therefore, we have developed the concept of automated annotation to get the best annotations using several machine-learning approaches. The proposed approach is based on an ensemble algorithm of meta-learners and meta-vectorizer techniques. The approach employs a semi-supervised learning technique for automated annotation to detect hate speech. This involves leveraging various machine learning algorithms, including Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbors (KNN), and Naive Bayes (NB), in conjunction with Word2Vec and TF-IDF text extraction methods. The annotation process is performed using 13,169 Indonesian YouTube comments data. The proposed model used a Stemming approach using data from Sastrawi and new data of 2245 words. Semi-supervised learning uses 5%, 10%, and 20% of labeled data compared to performing labeling based on 80% of the datasets. In semi-supervised learning, the model learns from the labeled data, which provides explicit information, and the unlabeled data, which offers implicit insights. This hybrid approach enables the model to generalize and make informed predictions even when limited labeled data is available (based on self-learning). Ultimately, this enhances its ability to handle real-world scenarios with scarce annotated information. In addition, the proposed method uses a variety of thresholds for matching words labeled with hate speech ranging from 0.6, 0.7, 0.8, to 0.9. The experiments indicated that the DT-TF-IDF model has the best accuracy value of 97.1% with a scenario of 5%:80%:0.9. 
However, several other methods have accuracy above 90%, such as SVM (TF-IDF and Word2Vec) and KNN (Word2Vec), based on both text extraction methods in several test scenarios.
