2016
DOI: 10.1016/j.neucom.2015.11.028

A novel negative sampling based on TFIDF for learning word representation

Cited by 31 publications (12 citation statements). References 18 publications. Citing publications span 2017 to 2024.

“…The titles of papers published in the SIGMOD, VLDB, SIGIR, CIKM, ICDE and EDBT conferences from 2004 to 2013 are selected. From this dataset, we choose the titles of 8,884 papers to test our algorithm and the comparison methods; these contain 8,572 terms after stop-word removal, and the entries of the term-document matrix are assigned by the TF*IDF model [53, 54].…”
Section: Results (mentioning)
confidence: 99%
“…For each document d_i ∈ D, transform it into the vector form V_i = (v_1, v_2, …, v_M), where v_j is the weight of term t_j as described in Definition 1, computed by counting the number of times that term t_j occurs in document d_i. Precisely, v_j is usually defined by the normalized TF*IDF (term frequency * inverse document frequency) model [53, 54], which is widely used for measuring term weights in a document set [55, 56]. Specifically, the entry is assigned the TF*IDF of term t_j as it occurs in document d_i.…”
Section: Methods (mentioning)
confidence: 99%
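As a concrete illustration of the weighting this excerpt describes, here is a minimal sketch of building a normalized TF*IDF term-document matrix. The tokenized corpus, the L2 normalization, and the plain log IDF are illustrative assumptions, not details taken from the cited paper.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Build a normalized TF*IDF term-document matrix.

    docs: list of documents, each a list of tokens (stop words removed).
    Returns (vocab, matrix) where matrix[i][j] is the weight of
    vocab[j] in docs[i].
    """
    vocab = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    matrix = []
    for d in docs:
        tf = Counter(d)
        # Raw TF*IDF weight: term count times log inverse document frequency.
        row = [tf[t] * math.log(n_docs / df[t]) if tf[t] else 0.0
               for t in vocab]
        # L2-normalize so each document vector has unit length.
        norm = math.sqrt(sum(w * w for w in row)) or 1.0
        matrix.append([w / norm for w in row])
    return vocab, matrix

# Tiny hypothetical corpus of paper titles, already tokenized.
docs = [["negative", "sampling", "word", "representation"],
        ["word", "embedding", "sampling"],
        ["graph", "clustering"]]
vocab, M = tfidf_matrix(docs)
```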
“…Hierarchical softmax [44], which builds a Huffman tree over the vocabulary, reduces the per-update cost from T to log₂(T). Negative sampling [11], [45] approximates the softmax by sampling several negative examples instead of iterating over the entire vocabulary.…”
Section: A. Word2vec (mentioning)
confidence: 99%
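Since negative sampling is the mechanism the cited paper modifies, a minimal sketch of the standard word2vec-style negative sampler may help. The 0.75 smoothing exponent follows the usual word2vec convention; the table size, corpus, and function names are illustrative assumptions.

```python
import random
from collections import Counter

def build_sampling_table(tokens, table_size=100_000, power=0.75):
    """Unigram table for negative sampling: a word is drawn with
    probability proportional to count(word) ** power."""
    counts = Counter(tokens)
    total = sum(c ** power for c in counts.values())
    table = []
    for word, c in counts.items():
        # Allocate table slots proportional to the smoothed frequency.
        slots = max(1, int(round(table_size * (c ** power) / total)))
        table.extend([word] * slots)
    return table

def sample_negatives(table, target, k=5):
    """Draw k negative words, skipping the positive target word."""
    negatives = []
    while len(negatives) < k:
        w = random.choice(table)
        if w != target:
            negatives.append(w)
    return negatives

tokens = "the cat sat on the mat the dog sat".split()
table = build_sampling_table(tokens, table_size=100)
print(sample_negatives(table, target="cat", k=3))
```

Drawing from this precomputed table is O(1) per negative example, which is what makes the approximation cheap compared with iterating over the full vocabulary.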
“…It exploited the label information, the label relationships, the data distribution, and the correlation among different kinds of features simultaneously. A more extensive review can be found in [12][13][14].…”
Section: Introduction (mentioning)
confidence: 99%
“…As such, there are primarily two classes of sample-reduction methods: sampling and vector quantization [12][13][14][15]. Sampling is generally used to quickly reduce the data in an investigated dataset; it is popular for its simplicity and ease of execution, with random sampling (RS) [16] being the most common example.…”
Section: Introduction (mentioning)
confidence: 99%
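As a concrete instance of the sampling class this excerpt mentions, here is a minimal reservoir-sampling sketch for drawing a fixed-size uniform random subset in a single pass. The sample size and the data stream are illustrative assumptions.

```python
import random

def reservoir_sample(stream, k):
    """Uniform random sample of k items from an iterable of unknown
    length, using a single pass (reservoir sampling)."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Replace an existing element with decreasing probability,
            # so every item is kept with probability k / (i + 1).
            j = random.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Reduce a dataset of 10,000 points to a random subset of 100.
subset = reservoir_sample(range(10_000), k=100)
```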