2015
DOI: 10.1007/978-3-319-18038-0_23
Centroid-Means-Embedding: An Approach to Infusing Word Embeddings into Features for Text Classification

Cited by 6 publications (5 citation statements)
References 13 publications
“…Studies addressing multi-class multi-label classification are summarized in [18], covering three smaller data sets with at most 27 labels, compared with the current front line of the multi-label classification task. Sohrab [14,16] proposed a semantically augmented statistical vector space model (SAS-VSM) by introducing word embeddings into the features for single- and multi-label text classification (TC). In this work, the SAS-VSM is introduced in FC and outperforms the VSM.…”
Section: Related Work
confidence: 99%
“…where in (11)-(13), tf(t_i, d) is the number of occurrences of term t_i in document d, D denotes the total number of documents in the training corpus, #t_i is the number of documents in the training corpus in which term t_i occurs at least once, D/#t_i is the inverse document frequency (IDF) of term t_i, C denotes the total number of predefined categories in the training corpus, c(t_i) is the number of categories in the training corpus in which term t_i occurs at least once, and C/CS_δ(t_i) is the inverse class space density frequency (ICS_δF) of term t_i. Please refer to [13,14] for more details.…”
Section: Base Algorithms
confidence: 99%
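The quantities named in the excerpt above (term frequency, D/#t_i, C/c(t_i)) can be sketched in a few lines of Python. This is a minimal illustration over a hypothetical toy corpus, not the cited papers' implementation; the class space density CS_δ(t_i) is not fully defined in the excerpt, so only the simpler count-based inverse class frequency C/c(t_i) is shown.

```python
from collections import Counter

# Hypothetical toy corpus: (document tokens, category label)
corpus = [
    (["cat", "sat", "mat"], "pets"),
    (["dog", "sat"], "pets"),
    (["stock", "market", "cat"], "finance"),
]

D = len(corpus)                           # total documents in the corpus
C = len({label for _, label in corpus})   # total predefined categories

# #t_i: number of documents in which term t_i occurs at least once
doc_freq = Counter()
for tokens, _ in corpus:
    for t in set(tokens):
        doc_freq[t] += 1

# c(t_i): number of categories in which term t_i occurs at least once
cats_of_term = {}
for tokens, label in corpus:
    for t in set(tokens):
        cats_of_term.setdefault(t, set()).add(label)

def tf(term, tokens):
    """tf(t_i, d): occurrences of term t_i in document d."""
    return tokens.count(term)

def idf(term):
    """D / #t_i: inverse document frequency of term t_i."""
    return D / doc_freq[term]

def icf(term):
    """C / c(t_i): inverse class frequency of term t_i."""
    return C / len(cats_of_term[term])

# "cat" occurs in 2 of 3 documents and in both categories
print(tf("cat", ["cat", "sat", "mat"]))  # 1
print(idf("cat"))                        # 1.5
print(icf("cat"))                        # 1.0
```

A rare term confined to one category gets a large IDF and the maximal ICF of C, which is what makes these factors useful as discriminative term weights.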
“…where C denotes the total number of predefined categories in the training corpus, c(t_i) is the number of categories in the training corpus in which term t_i occurs at least once, C/c(t_i) is the inverse class frequency (ICF) of term t_i, and C/CS_δ(t_i) is the inverse class space density frequency (ICS_δF) of term t_i. Please refer to [14], [16] for more details.…”
Section: B. Term Weighting Approaches
confidence: 99%
“…Text representation approaches based on word embeddings have been studied extensively in the field of Text Mining (SOHRAB; MIWA; SASAKI, 2015; WANG et al., 2016; REZENDE, 2017), standing out from the traditional approaches because they represent each word by a single vector capable of aggregating a large volume of latent information (COLLOBERT; WESTON, 2008). In essence, word embedding models make it possible to capture word occurrences according to their context. Whereas most traditional text representation methods treat documents as combinations of independent terms, word embedding techniques enable the semantic enrichment of text representations.…”
Section: Word-Embedding-Based Context Extraction Technique (unclassified)
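The idea in the excerpt above, infusing word embeddings into a document representation, is often realized in its simplest form by taking the mean (centroid) of a document's word vectors. The sketch below uses a hypothetical hand-built embedding table for illustration; in practice pretrained vectors such as word2vec or GloVe would be used, and the cited paper's centroid-means-embedding method is more elaborate than this plain average.

```python
import numpy as np

# Hypothetical toy embedding lookup (stand-in for pretrained vectors)
embeddings = {
    "cat": np.array([1.0, 0.0]),
    "sat": np.array([0.0, 1.0]),
    "mat": np.array([1.0, 1.0]),
}

def doc_vector(tokens, emb, dim=2):
    """Represent a document as the centroid (mean) of its word vectors.

    Out-of-vocabulary tokens are skipped; an all-OOV document maps to
    the zero vector.
    """
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

print(doc_vector(["cat", "sat", "mat"], embeddings))  # [0.6667 0.6667]
```

Because every document lands in the same dense vector space regardless of which terms it contains, such centroid features can be concatenated with, or substituted for, sparse term-weight features in a standard classifier.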