Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval 2017
DOI: 10.1145/3077136.3080834
Deep Learning for Extreme Multi-label Text Classification

Cited by 528 publications (376 citation statements). References 22 publications.
“…• Sequence-, graph-, and N-gram-based models: these models first transform the text dataset into sequences of words, graphs of words, or N-gram features, and then apply different types of deep learning models to those features, including CNN (Kim, 2014b), CNN-RNN (Chen et al., 2017), RCNN (Lai et al., 2015), DCNN (Schwenk et al., 2017), XML-CNN (Liu et al., 2017), HR-DGCNN (Peng et al., 2018), Hierarchical LSTM (HLSTM) (Chen et al., 2016), a multi-label classification approach based on a conditional cyclic directed graphical model (CDN-SVM) (Guo and Gu, 2011), Hierarchical Attention Network (HAN) (Yang et al., 2016), and Bi-directional Block Self-Attention Network (Bi-BloSAN) (Shen et al., 2018). For example, the Hierarchical Attention Network (HAN) uses a GRU gating mechanism to encode the sequences and applies word- and sentence-level attention to those sequences for document classification.…”
Section: Comparison of Methods
confidence: 99%
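To make the hierarchical attention pattern described in the excerpt above concrete, here is a minimal PyTorch sketch of the word-level half of a HAN-style encoder: a bidirectional GRU followed by attention pooling. The sentence level repeats the same pattern over the resulting sentence vectors. All class names and dimensions here are illustrative assumptions, not the published HAN configuration.

```python
import torch
import torch.nn as nn

class WordAttentionEncoder(nn.Module):
    """Word-level half of a HAN-style encoder: BiGRU + attention pooling."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          bidirectional=True, batch_first=True)
        # One-layer MLP scores each time step against a learned context vector.
        self.proj = nn.Linear(2 * hidden_dim, 2 * hidden_dim)
        self.context = nn.Parameter(torch.randn(2 * hidden_dim))

    def forward(self, token_ids):                   # (batch, words)
        h, _ = self.gru(self.embed(token_ids))      # (batch, words, 2*hidden)
        u = torch.tanh(self.proj(h))
        alpha = torch.softmax(u @ self.context, 1)  # per-word attention weights
        return (alpha.unsqueeze(-1) * h).sum(1)     # weighted sentence vector
```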
“…Many CNN-based models have been proposed to solve the MLTC task, including RCNN (Lai et al., 2015), the CNN-RNN ensemble of Chen et al. (2017), XML-CNN (Liu et al., 2017), CNN (Kim, 2014a), and TextCNN (Kim, 2014a). However, they neglect the correlations between labels.…”
Section: Neural Network Models
confidence: 99%
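Since XML-CNN is the paper under review here, a minimal PyTorch sketch of an XML-CNN-style classifier may help: parallel 1-D convolutions, dynamic max pooling, a low-dimensional bottleneck layer, and one sigmoid output per label. Filter sizes and dimensions are illustrative assumptions, not the paper's settings; the per-label independence in the last layer is exactly where the "neglects label correlations" critique applies.

```python
import torch
import torch.nn as nn

class XMLCNNSketch(nn.Module):
    """XML-CNN-style text classifier: parallel 1-D convolutions, dynamic
    max pooling, a bottleneck layer, and independent per-label outputs."""
    def __init__(self, vocab_size, num_labels, embed_dim=300,
                 num_filters=32, filter_sizes=(2, 4, 8),
                 pool_chunks=4, bottleneck_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
            for k in filter_sizes)
        # Dynamic max pooling: one max per chunk, fixed-size output
        # regardless of document length.
        self.pool = nn.AdaptiveMaxPool1d(pool_chunks)
        self.bottleneck = nn.Linear(
            len(filter_sizes) * num_filters * pool_chunks, bottleneck_dim)
        self.out = nn.Linear(bottleneck_dim, num_labels)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed, seq_len)
        pooled = [self.pool(torch.relu(conv(x))).flatten(1)
                  for conv in self.convs]
        h = torch.relu(self.bottleneck(torch.cat(pooled, dim=1)))
        # Logits; train with sigmoid + binary cross-entropy. Each label is
        # scored independently, so label correlations are not modeled.
        return self.out(h)
```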
“…An intuitively reasonable objective for multi-label classification is the rank loss [50], which minimizes the number of mis-ordered pairs of relevant and irrelevant labels; the aim is to score relevant labels higher than irrelevant ones [23]. However, in feed-forward neural network architectures, the rank loss has been shown to be inferior to binary cross-entropy over sigmoid activation when applied to multi-label classification datasets [32], especially datasets in the textual domain.…”
Section: Hiepar: Hierarchical and Transparent Representation
confidence: 99%
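The excerpt contrasts two training objectives. Below is a minimal PyTorch sketch of both: a hinge-style pairwise rank loss and the binary cross-entropy-over-sigmoid alternative the excerpt favors. The margin value and the exact hinge form are assumptions; the cited rank loss [50] may differ in detail.

```python
import torch
import torch.nn.functional as F

def pairwise_rank_loss(logits, targets, margin=1.0):
    """Hinge-style rank loss: penalize every (relevant, irrelevant) label
    pair whose scores are ordered wrongly. targets is a 0/1 matrix of
    shape (batch, num_labels), matching logits."""
    losses = []
    for scores, pos in zip(logits, targets.bool()):
        if pos.any() and (~pos).any():
            # (n_relevant, n_irrelevant) grid of margin violations
            diff = scores[~pos].unsqueeze(0) - scores[pos].unsqueeze(1) + margin
            losses.append(torch.clamp(diff, min=0).mean())
    # Zero if no sample has both relevant and irrelevant labels.
    return torch.stack(losses).mean() if losses else logits.new_zeros(())

def bce_over_sigmoid(logits, targets):
    # Independent per-label binary cross-entropy on sigmoid outputs,
    # in its numerically stable logits form.
    return F.binary_cross_entropy_with_logits(logits, targets.float())
```

In practice the BCE variant treats each label as its own binary problem, which is the formulation XML-CNN-style models train with.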
“…However, this mechanism can be problematic when the utterance is long. We experimented with dynamic k-max pooling [12] to pool the strongest features from p sub-sequences of an utterance of m words. This pooling scheme naturally handles variable utterance length.…”
Section: Dynamic k-Max Pooling
confidence: 99%
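A minimal sketch of the pooling idea described in the excerpt: split the feature map of an m-word utterance into p sub-sequences and keep the k strongest activations from each, so the output size is fixed regardless of utterance length. The values of p and k are illustrative assumptions, and this simplification returns the top activations sorted by magnitude, whereas classical k-max pooling preserves their original temporal order.

```python
import torch

def dynamic_k_max_pool(features, p=4, k=2):
    """features: (batch, channels, m) feature map of an m-word utterance.
    Splits the time axis into p sub-sequences and keeps the k strongest
    activations from each, giving a fixed (batch, channels, p * k) output."""
    pooled = [chunk.topk(k, dim=2).values  # assumes each chunk has >= k steps
              for chunk in torch.tensor_split(features, p, dim=2)]
    return torch.cat(pooled, dim=2)
```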