2017
DOI: 10.3390/app7080846

Learning Word Embeddings with Chi-Square Weights for Healthcare Tweet Classification

Abstract: Twitter is a popular source for the monitoring of healthcare information and public disease. However, there exists much noise in the tweets. Even though appropriate keywords appear in the tweets, they do not guarantee the identification of a truly health-related tweet. Thus, the traditional keyword-based classification task is largely ineffective. Algorithms for word embeddings have proved to be useful in many natural language processing (NLP) tasks. We introduce two algorithms based on an existing word embedding…
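The title pairs chi-square term weighting with word embeddings. As a rough illustration of the chi-square side only (not the paper's actual algorithm; the toy tweets, labels, and the use of scikit-learn's chi2 scorer are assumptions made for this sketch), terms can be scored by how strongly their occurrence associates with the health-related class:

```python
# Minimal sketch: chi-square scores as term weights for tweet classification.
# The toy corpus and labels are illustrative; how the paper folds these
# weights into embedding training is not reproduced here.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

tweets = [
    "flu season is hitting our hospital hard",   # health-related
    "got my flu shot at the clinic today",       # health-related
    "the new flu game update just dropped",      # noise: keyword, not health
    "concert tonight feeling great",             # noise
]
labels = [1, 1, 0, 0]  # 1 = truly health-related, 0 = noise

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(tweets)

# chi2 returns one score per vocabulary term: higher means the term's
# occurrence is more strongly associated with the class labels.
scores, _ = chi2(X, labels)
weights = dict(zip(vectorizer.get_feature_names_out(), scores))
for term, w in sorted(weights.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{term}: {w:.3f}")
```

Note that the keyword "flu" appears in both classes, echoing the abstract's point that keyword matching alone is a weak signal; the chi-square score instead captures how discriminative each term actually is.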

Cited by 28 publications (14 citation statements)
References 28 publications
“…Distributed representations of words are capable of successfully capturing meaningful syntactic and semantic properties of the language and it has been shown [33] that using word embeddings as features could improve many NLP tasks, such as information retrieval [34,35], part-of-speech tagging [36] or named entity recognition (NER) [37]; Kuang and Davidson [38] learned specific word embeddings from Twitter for classifying healthcare-related tweets. Since learning those word representations is a slow and non-trivial task, already trained models can be found in literature; state-of-the-art embeddings are mainly based on deep-learning [31,39], but other techniques have been previously explored, for instance spectral methods [40,41].…”
Section: Related Work
confidence: 99%
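The statement above notes that already-trained embedding models are available off the shelf. A minimal sketch of that reuse pattern with gensim's model downloader (the choice of the glove-twitter-25 model and mean pooling over tokens are illustrative assumptions, not details taken from the cited works):

```python
# Minimal sketch: pre-trained word embeddings as document features.
# "glove-twitter-25" is one of gensim's downloadable models; the model
# choice and the mean-pooling strategy are assumptions for illustration.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-twitter-25")  # KeyedVectors, 25-dimensional

def embed(text: str) -> np.ndarray:
    """Average the vectors of in-vocabulary tokens; zeros if none match."""
    tokens = [t for t in text.lower().split() if t in vectors]
    if not tokens:
        return np.zeros(vectors.vector_size)
    return np.mean([vectors[t] for t in tokens], axis=0)

features = embed("flu outbreak reported at the local hospital")
print(features.shape)  # (25,)
```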
“…Xu et al [23] designed a document classification framework based on word embedding and conducted a series of experiments on a biomedical documents classification task, which leveraged the semantic features generated by the word embedding approach, achieving highly competitive results. Kuang [15] proposed two algorithms based on the CBOW model and evaluated word embeddings learned from these proposed algorithms for two healthcare-related datasets. The results showed that the proposed algorithms improved accuracy by more than 9% compared to existing techniques.…”
Section: Feature Extraction From Text
confidence: 99%
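Kuang's two algorithm variants are not reproduced here, but the CBOW baseline they build on can be sketched with gensim (the toy corpus and hyperparameters are assumptions; sg=0 selects CBOW rather than skip-gram):

```python
# Minimal sketch: training baseline CBOW embeddings with gensim.
# CBOW predicts a word from its averaged context window; the corpus
# and hyperparameters below are toy assumptions.
from gensim.models import Word2Vec

corpus = [
    ["flu", "vaccine", "available", "at", "clinic"],
    ["hospital", "reports", "rise", "in", "flu", "cases"],
    ["new", "game", "flu", "update", "released"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=50,   # embedding dimensionality
    window=2,         # context words considered on each side
    min_count=1,      # keep every token in this tiny corpus
    sg=0,             # 0 = CBOW, 1 = skip-gram
)
print(model.wv.most_similar("flu", topn=3))
```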
“…The CBOW algorithm is capable of learning the contexts of words and is commonly applied to text classifiers, as [30] used it for classifying healthcare tweets.…”
Section: Encoding-based Wave2vec Time Series Classifier
confidence: 99%
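As a hedged sketch of that common pattern, CBOW vectors can be mean-pooled per tweet and fed to an off-the-shelf classifier (all data and settings below are illustrative assumptions, not the cited paper's setup):

```python
# Minimal sketch: CBOW vectors as classifier features.
# Mean-pools each tweet's word vectors and fits logistic regression;
# data, model settings, and pooling are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

tweets = [["flu", "shot", "at", "clinic"],
          ["hospital", "flu", "ward", "full"],
          ["flu", "game", "patch", "notes"],
          ["weekend", "concert", "plans"]]
labels = [1, 1, 0, 0]  # 1 = health-related

w2v = Word2Vec(tweets, vector_size=25, window=2, min_count=1, sg=0)

def pool(tokens):
    # Every token is in-vocabulary here because min_count=1 and the
    # classifier is trained on the same toy corpus.
    return np.mean([w2v.wv[t] for t in tokens], axis=0)

X = np.stack([pool(t) for t in tweets])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```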