Proceedings of the 2018 SIAM International Conference on Data Mining
DOI: 10.1137/1.9781611975321.5
Unsupervised Neural Categorization for Scientific Publications

Cited by 15 publications (10 citation statements). References 0 publications.

“…The research on concept learning [19,20] focuses on obtaining and exploiting semantically meaningful information from noisy structured data under the set expansion [21] or hierarchical clustering [22,23] paradigms, or from unstructured text data [24][25][26][27] by leveraging distributed semantics [28,29]. We build on top of previous work [1] and leverage the shared representation between concepts and tasks to improve downstream applications.…”
Section: Concept Learning
Citation type: mentioning (confidence: 99%)
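The set-expansion view of concept learning in the statement above is straightforward to sketch: given a few seed terms for a concept, rank the remaining vocabulary by embedding similarity to the seed centroid and admit the closest terms. The snippet below is a minimal illustration with toy vectors; the term names and embedding values are placeholders, not taken from any of the cited works.

```python
import numpy as np

# Toy embedding table. In practice these vectors would come from a model
# trained on the corpus (e.g., word2vec-style distributed semantics).
embeddings = {
    "neural_network": np.array([0.9, 0.1, 0.0]),
    "deep_learning":  np.array([0.8, 0.2, 0.1]),
    "svm":            np.array([0.7, 0.3, 0.0]),
    "protein":        np.array([0.0, 0.1, 0.9]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def expand_set(seeds, k=1):
    """Return the k non-seed terms closest to the centroid of the seeds."""
    centroid = np.mean([embeddings[s] for s in seeds], axis=0)
    candidates = [t for t in embeddings if t not in seeds]
    return sorted(candidates, key=lambda t: cosine(embeddings[t], centroid),
                  reverse=True)[:k]

print(expand_set({"neural_network", "deep_learning"}))  # -> ['svm']
```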
“…The final document class is assigned based on the vector similarity between labels and documents. • UNEC [15]: this method takes label surface names as its weak supervision. It categorizes documents by learning the semantics and category attribution of concepts inside the corpus.…”
Section: Baselines
Citation type: mentioning (confidence: 99%)
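As a concrete illustration of the similarity-based assignment step mentioned above, the sketch below picks, for each document embedding, the label whose surface-name embedding is most cosine-similar. This covers only the assignment step, not UNEC's actual model (which jointly learns concept semantics and category attribution); all vectors and names here are hypothetical.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def assign_class(doc_vec, label_vecs):
    """Assign the label whose embedding is most similar to the document's.

    label_vecs maps a label surface name to its embedding; label and
    document vectors are assumed to live in the same semantic space.
    """
    return max(label_vecs, key=lambda name: cosine(doc_vec, label_vecs[name]))

# Hypothetical 2-D embeddings for illustration only.
label_vecs = {
    "machine_learning": np.array([1.0, 0.1]),
    "databases":        np.array([0.1, 1.0]),
}
doc_vec = np.array([0.9, 0.3])
print(assign_class(doc_vec, label_vecs))  # -> machine_learning
```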
“…Weak supervision has been studied for building document classifiers in various forms, including hundreds of labeled training documents (Tang et al., 2015; Miyato et al., 2016; Xu et al., 2017), class/category names (Song and Roth, 2014; Tao et al., 2015; Li et al., 2018), and user-provided seed words (Tao et al., 2015). In this paper, we focus on user-provided seed words as the source of weak supervision. Along this line, Doc2Cube (Tao et al., 2015) expands label keywords from label surface names and performs multidimensional document classification by learning dimension-aware embeddings; PTE (Tang et al., 2015) utilizes both labeled and unlabeled documents to learn task-specific text embeddings, which are later fed to logistic regression classifiers for classification; other work leverages seed information to generate pseudo documents and introduces a self-training module that bootstraps on real unlabeled data for model refinement.…”
Section: Weak Supervision for Text Classification
Citation type: mentioning (confidence: 99%)
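The pseudo-document-plus-self-training idea at the end of this statement can be sketched generically: train on the seed-derived labeled set, then repeatedly fold confidently predicted unlabeled documents back in as pseudo-labeled training data. The snippet below is a schematic self-training loop over a scikit-learn classifier, not the refinement module of any specific cited paper; the confidence threshold and round count are assumed knobs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_seed, y_seed, X_unlabeled, threshold=0.9, rounds=5):
    """Bootstrap a classifier by iteratively adding pseudo-labeled documents.

    Documents whose top predicted class probability exceeds `threshold` are
    moved from the unlabeled pool into the training set with their predicted
    (pseudo) label; training then repeats on the enlarged set.
    """
    clf = LogisticRegression(max_iter=1000)
    X_train, y_train = X_seed, y_seed
    for _ in range(rounds):
        clf.fit(X_train, y_train)
        if len(X_unlabeled) == 0:
            break
        proba = clf.predict_proba(X_unlabeled)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing left that the model is sure about
        pseudo_labels = clf.classes_[proba[confident].argmax(axis=1)]
        X_train = np.vstack([X_train, X_unlabeled[confident]])
        y_train = np.concatenate([y_train, pseudo_labels])
        X_unlabeled = X_unlabeled[~confident]
    return clf
```

A high threshold trades coverage for label purity: early rounds only absorb documents the seed-trained model is very sure about, which limits the error amplification that naive self-training is prone to.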