2021
DOI: 10.1007/s42979-021-00807-1

Specialists, Scientists, and Sentiments: Word2Vec and Doc2Vec in Analysis of Scientific and Medical Texts

Abstract: We analyze the performance of unsupervised embedding algorithms in sentiment analysis of knowledge-rich data sets. We apply the state-of-the-art embedding algorithms Word2Vec and Doc2Vec as the learning techniques. The algorithms build word and document embeddings in an unsupervised manner. To assess the algorithms' performance, we define sentiment metrics and use the semantic lexicon SentiWordNet (SWN) to establish benchmark measures. Our empirical results are obtained on the Obesity data set from i2b2 clinical discha…
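The abstract mentions defining sentiment metrics over SentiWordNet scores. A minimal sketch of one such metric, assuming a toy lexicon and a simple mean-polarity score (both illustrative assumptions, not the paper's actual metrics):

```python
# Toy stand-in for SentiWordNet: word -> (positivity, negativity) scores.
# These entries and the mean-polarity metric are illustrative assumptions,
# not the metrics defined in the paper.
TOY_SWN = {
    "good": (0.75, 0.0),
    "effective": (0.5, 0.0),
    "poor": (0.0, 0.625),
    "risk": (0.0, 0.25),
}

def doc_sentiment(tokens):
    """Mean polarity (pos - neg) over the tokens found in the lexicon."""
    scores = [TOY_SWN[t][0] - TOY_SWN[t][1] for t in tokens if t in TOY_SWN]
    return sum(scores) / len(scores) if scores else 0.0

print(doc_sentiment(["good", "risk"]))  # (0.75 - 0.25) / 2 = 0.25
```

A document with no lexicon hits scores 0.0, so the metric stays defined on out-of-vocabulary clinical text.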

Cited by 17 publications (10 citation statements).
References 36 publications.
“…However, the performance of doc2vec on our medical chats dataset was significantly lower than that of word2vec. Previous studies also reported similar results, demonstrating the better performance of word2vec over doc2vec [31][32][33][34]. Accordingly, we proceed with the weighted word2vec embeddings in our numerical study. For XGBoost, while we include the message length in the triggering phase, we exclude it in the response generation phase.…”
Section: Machine Learning Models (supporting)
confidence: 58%
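The "weighted word2vec embeddings" above can be sketched as an IDF-weighted average of a message's word vectors — a common construction, though the cited study's exact weighting scheme is not specified here. The 2-d vectors and IDF values below are toy assumptions, not trained values:

```python
# Sketch of weighted word2vec document embeddings: average each message's
# word vectors, weighted by IDF. Vectors and IDF values are toy assumptions.
TOY_VEC = {"fever": [1.0, 0.0], "mild": [0.0, 1.0], "the": [0.5, 0.5]}
TOY_IDF = {"fever": 2.0, "mild": 1.5, "the": 0.1}

def weighted_doc_vector(tokens):
    acc, total_w = [0.0, 0.0], 0.0
    for t in tokens:
        if t in TOY_VEC:
            w = TOY_IDF.get(t, 1.0)
            acc = [a + w * v for a, v in zip(acc, TOY_VEC[t])]
            total_w += w
    return [a / total_w for a in acc] if total_w else acc

print(weighted_doc_vector(["mild", "fever"]))
```

Rare, informative words ("fever") pull the document vector harder than frequent function words ("the"), which is the usual motivation for IDF weighting over a plain average.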
“…Therefore, we use the state-of-the-art embedding algorithm Doc2Vec as the learning technique. The algorithm builds word and document embeddings in an unsupervised manner (Chen & Sokolova, 2021).…”
Section: Methods (mentioning)
confidence: 99%
“…This study uses the word2vec model and the GloVe model, two of the most popular algorithms for word embeddings. The first, the Word2Vec model, was introduced by [ 34 ] and is popular and widely used in learning word embeddings from raw text. Based on the idea of distributed representation of words, word2vec (word embeddings) uses a shallow neural network to learn word embeddings and predict the relation between every word and its context words.…”
Section: Proposed System (mentioning)
confidence: 99%
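The shallow-network idea described above can be sketched as a single CBOW-style forward pass: average the context words' input vectors, then score every vocabulary word. Random toy weights stand in for trained embeddings:

```python
import numpy as np

# Minimal forward pass of the shallow word2vec network (CBOW flavour).
# The tiny vocabulary and random weights are illustrative assumptions.
rng = np.random.default_rng(0)
vocab = ["patient", "has", "mild", "fever"]
idx = {w: i for i, w in enumerate(vocab)}
dim = 3
W_in = rng.normal(size=(len(vocab), dim))   # input embeddings (one row per word)
W_out = rng.normal(size=(dim, len(vocab)))  # output weights

def predict_center(context):
    """Probability distribution over the vocabulary for the center word."""
    h = W_in[[idx[w] for w in context]].mean(axis=0)  # averaged context = hidden layer
    logits = h @ W_out
    p = np.exp(logits - logits.max())
    return p / p.sum()  # softmax over the vocabulary

probs = predict_center(["patient", "mild"])
print(probs.shape, float(probs.sum()))
```

Training adjusts W_in and W_out so that true center words get high probability; the rows of W_in then serve as the word embeddings.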
“…In word2vec, SG (skip-gram) and CBOW (Continuous Bag-of-Words) algorithms are used to produce word vectors [ 34 ]. The SG model is used to store semantic and syntactic information about sentences.…”
Section: Proposed System (mentioning)
confidence: 99%
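The difference between SG and CBOW can be seen in how they slice the same sentence into training examples: SG predicts each context word from the center word, while CBOW predicts the center word from its full context (window size 1 here, for illustration):

```python
def skipgram_pairs(tokens, window=1):
    """SG: one (center, context_word) example per context position."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=1):
    """CBOW: one (context_tuple, center) example per position."""
    pairs = []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        if context:
            pairs.append((tuple(context), center))
    return pairs

sent = ["skip", "gram", "model"]
print(skipgram_pairs(sent))  # [('skip', 'gram'), ('gram', 'skip'), ('gram', 'model'), ('model', 'gram')]
print(cbow_pairs(sent))
```

SG produces more (and sparser) examples, which is often cited as the reason it captures rarer words and semantic regularities better than CBOW.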