2018 IEEE International Conference on Data Mining (ICDM)
DOI: 10.1109/icdm.2018.00169
Doc2Cube: Allocating Documents to Text Cube Without Labeled Data

Cited by 20 publications (21 citation statements) | References 12 publications
“…WeSTClass (Meng et al., 2018, 2019c) models class semantics as vMF distributions in the word embedding space and applies a pretrain-refine neural approach to perform text classification under weak supervision. Doc2Cube (Tao et al., 2018) leverages word-document co-occurrences to embed class labels, words, and documents in the same space and performs classification by comparing embedding similarity. We adopt the two frameworks and replace the original embedding with the embedding trained by our Joint Skip-Gram and Joint CBOW models.…”
Section: Weakly-supervised Text Classification
confidence: 99%
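The embedding-similarity classification this statement attributes to Doc2Cube can be sketched as follows: label names, words, and documents are placed in one vector space, and each document is assigned to the label whose embedding it is closest to. The snippet below is a minimal sketch, assuming pre-trained word vectors are available as a Python dict; it is not the authors' released implementation, and the joint-embedding training step is omitted.

import numpy as np

def embed_tokens(tokens, word_vecs):
    """Average the available word vectors of a token sequence; None if no token is known."""
    vecs = [word_vecs[t] for t in tokens if t in word_vecs]
    return np.mean(vecs, axis=0) if vecs else None

def classify_by_similarity(doc_tokens, label_seed_words, word_vecs):
    """Assign the document to the label whose seed-word embedding has the highest cosine similarity."""
    d = embed_tokens(doc_tokens, word_vecs)
    if d is None:
        return None
    best_label, best_sim = None, float("-inf")
    for label, seeds in label_seed_words.items():
        l = embed_tokens(seeds, word_vecs)
        if l is None:
            continue
        sim = float(d @ l / (np.linalg.norm(d) * np.linalg.norm(l) + 1e-12))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label

# Hypothetical usage: label = classify_by_similarity(doc.split(),
#     {"sports": ["game", "team"], "politics": ["election", "senate"]}, word_vecs)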
“…Some studies along the embedding line learn word embeddings based on global contexts implicitly. HSMN (Huang et al., 2012), PTE (Tang et al., 2015), and Doc2Cube (Tao et al., 2018) take the average of the word embeddings in a document as the document representation and encourage similarity between the word and document embeddings of co-occurring words and documents. However, these methods do not model global contexts explicitly because the document representations are essentially aggregated word representations and thus are not tailored for contextual representations.…”
Section: Introduction
confidence: 99%
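The "average of word embeddings as the document representation" idea described in this statement can be illustrated with a toy training loop: the document vector is the mean of its words' vectors, co-occurring (word, document) pairs are pulled together, and randomly sampled negative words are pushed away. Everything below (hyperparameters, the update rule, and the simplification of only updating word vectors directly) is an illustrative assumption, not the actual setup of HSMN, PTE, or Doc2Cube.

import numpy as np

rng = np.random.default_rng(0)

def train_word_doc_embeddings(docs, dim=50, epochs=5, lr=0.05, negatives=3):
    """Toy negative-sampling trainer where a document is represented by the mean of its word vectors."""
    vocab = sorted({w for doc in docs for w in doc})
    idx = {w: i for i, w in enumerate(vocab)}
    W = rng.normal(scale=0.1, size=(len(vocab), dim))  # one vector per word

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for _ in range(epochs):
        for doc in docs:
            ids = [idx[w] for w in doc]
            if not ids:
                continue
            d = W[ids].mean(axis=0)  # document vector = average of its word vectors
            for wid in ids:
                W[wid] -= lr * (sigmoid(W[wid] @ d) - 1.0) * d    # pull co-occurring word toward the document
                for nid in rng.integers(0, len(vocab), size=negatives):
                    W[nid] -= lr * sigmoid(W[nid] @ d) * d        # push sampled negative words away
    return vocab, W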
“…A. Preliminaries 1) Data Cube Basics: The data cube is widely used to organize multi-dimensional data, such as records in relational databases and documents in text collections [13], [9], [11], [12]. A well-designed cube structure can greatly facilitate various downstream data analytics, mining, and summarization tasks [8].…”
Section: Problem Formulation
confidence: 99%
“…Following the design of [12], we can create three cube dimensions based on paper attributes in DBLP: L_decade derived from the numerical attribute publication year, L_venue derived from the categorical attribute publication venue, and L_topic derived from textual attributes such as paper titles and abstracts. Labels of the decade dimension are assigned by publication year; venue labels are assigned by conference or journal names such as KDD and TKDE; topic labels, such as Neural networks and Feature selection, are assigned according to a latent phrase-based topic model.…”
Section: Problem Formulation
confidence: 99%
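The three cube dimensions sketched in this statement can be derived mechanically from paper records: a decade label from the numerical year, a venue label from the venue name, and a topic label from a topic model over titles and abstracts. The snippet below is a hypothetical illustration of that bookkeeping; the record field names and the plug-in topic_labeler are assumptions for the example, not the schema or topic model used in the cited work.

from collections import defaultdict

def decade_label(year):
    """Derive the decade-dimension label from the numerical publication year (e.g. 1998 -> '1990s')."""
    return f"{(year // 10) * 10}s"

def build_text_cube(papers, topic_labeler):
    """Map each paper record to a (decade, venue, topic) cube cell; each cell holds paper ids."""
    cube = defaultdict(list)
    for p in papers:
        cell = (
            decade_label(p["year"]),                          # L_decade
            p["venue"],                                       # L_venue, e.g. "KDD", "TKDE"
            topic_labeler(p["title"] + " " + p["abstract"]),  # L_topic, e.g. "Neural networks"
        )
        cube[cell].append(p["id"])
    return cube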
“…• Introduction: Motivation & Overview • Phrase Mining [2, 5, 10, 11, 19] • Named Entity Recognition [3, 6, 9, 12, 15, 18, 20] • Taxonomy Construction [1, 7, 13, 16, 17, 23-25, 29, 30] • Mining Constructed Networks [4, 8, 14, 21, 26-28] • System Demos [22, 27] • Summary and Future Directions • Question Answering and Discussions…”
Section: Detailed Tutorial Outline
confidence: 99%