Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)
DOI: 10.18653/v1/n19-1408

Rethinking Complex Neural Network Architectures for Document Classification

Abstract: Neural network models for many NLP tasks have grown increasingly complex in recent years, making training and deployment more difficult. A number of recent papers have questioned the necessity of such architectures and found that well-executed, simpler models are quite effective. We show that this is also the case for document classification: in a large-scale reproducibility study of several recent neural models, we find that a simple BiLSTM architecture with appropriate regularization yields accuracy and F1 …
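
For context, the simple architecture the abstract refers to is, at a high level, a bidirectional LSTM over word embeddings followed by pooling and a linear classifier. Below is a minimal sketch assuming a PyTorch-style setup; the layer sizes, dropout placement, and max-pooling choice are illustrative assumptions rather than the authors' exact configuration, and the paper's "appropriate regularization" goes beyond the plain dropout shown here.

# Minimal sketch of a regularized BiLSTM document classifier (PyTorch).
# Hyperparameters and the pooling choice are illustrative assumptions.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 hidden_dim=256, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.dropout = nn.Dropout(dropout)  # simple regularization
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        embedded = self.dropout(self.embedding(token_ids))
        outputs, _ = self.lstm(embedded)      # (batch, seq_len, 2 * hidden_dim)
        pooled, _ = outputs.max(dim=1)        # max-pool over time
        return self.fc(self.dropout(pooled))  # class logits

A forward pass on a padded batch of word indices returns one logit per class; cross-entropy loss and a standard optimizer complete the training loop.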

Cited by 85 publications (57 citation statements)
References 16 publications

“…For our experiments, we used the IMDB dataset (135,669 documents) [28], the Yelp-hotel dataset (34,961 documents) [29], the Yelp-rest dataset (178,239 documents) [29], and the Amazon dataset (83,159 documents) [29]. The IMDB dataset is a movie review dataset annotated with 10-scale polarities.…”
Section: Results | Citation type: mentioning | Confidence: 99%
“…In Table 4, Kim-CNN [8] is a sentence classification model that shows good performance despite using simple CNNs. Adhikari-logistic regression [28] and Adhikari-support vector machine [28] are text classification models based on logistic regression and a support vector machine, respectively, in which term frequency and inverse document frequency scores are used as features. HAN [32] extracts meaningful features by modeling the hierarchical structure of a document and classifies the document into predefined classes using two levels of attention mechanisms: word-level attention and sentence-level attention.…”
Section: Results | Citation type: mentioning | Confidence: 99%
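
As a concrete point of reference for the TF-IDF baselines described in the excerpt above, the following is a minimal sketch of such a pipeline using scikit-learn; the toy corpus and hyperparameters are illustrative assumptions, and swapping LogisticRegression for sklearn.svm.LinearSVC yields the support vector machine variant.

# Minimal sketch of a TF-IDF + logistic regression document classifier
# (scikit-learn). The toy corpus and settings are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["the movie was wonderful", "terrible plot and weak acting"]  # toy corpus
labels = [1, 0]                                                      # toy labels

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # TF-IDF features
    LogisticRegression(max_iter=1000),              # linear classifier
)
model.fit(docs, labels)
print(model.predict(["a wonderful storyline"]))
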
“…By nature, a movie is hard to categorize cleanly, due to its length, complex storyline and plot turns, and the lack of evaluative criteria. Prior work in document classification (Yang et al., 2016; Liu et al., 2017; Adhikari et al., 2019; Johnson and Zhang, 2015) evaluated on datasets with small document sizes (Reuters, IMDB, Yelp, etc.). However, our documents are on average at least 65 times longer, which may be challenging for NN-based models to train on due to long sequences and the associated computational burden.…”
Section: Related Work | Citation type: mentioning | Confidence: 99%