Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1076
Do LSTMs really work so well for PoS tagging? – A replication study

Abstract: A recent study by Plank et al. (2016) found that LSTM-based PoS taggers considerably improve over the current state of the art when evaluated on the corpora of the Universal Dependencies project that use a coarse-grained tagset. We replicate this study using a fresh collection of 27 corpora of 21 languages that are annotated with fine-grained tagsets of varying size. Our replication confirms the result in general, and we additionally find that the advantage of LSTMs is even bigger for larger tagsets. However, w…

Cited by 17 publications (7 citation statements)
References 17 publications
“…Some works are based on linear statistic models, such as Conditional Random Fields (CRF) [13] and Hidden Markov [14]. These statistic models perform relatively well on the corpora tagged with a coarse-grained tagset, but they do not perform as well as the Bi-LSTM on the corpora tagged with a fine-grained tagset [15].…”
Section: PoS Tagging (mentioning, confidence: 99%)
“…Different tagging models have been compared in Plank et al. (2016) for part-of-speech tagging, and in Horsmann and Zesch (2017) for finer-grained tag sets that lie somewhere between part-of-speech tagging and full morphological tagging. These comparisons show (1) that neural-network-based taggers generally surpass traditional methods, but only if the training corpora are large enough, and (2) that neural-network-based taggers are especially well suited to tagging problems with large tag sets.…”
Section: Morphosyntactic Tagging (mentioning, confidence: 99%)
“…We decided to focus on two approaches that have been shown to be successful in POS tagging: Conditional Random Fields (CRF) (Gahbiche-Braham et al, 2012) and Recurrent Neural Networks (RNN) (Shao et al, 2017). RNNs are considered state of the art, but it is well known that they work best when they have access to large amounts of training data (Horsmann and Zesch, 2017). CRFs may be more amenable to small training data sets, but they may not scale up to a large label set (Horsmann and Zesch, 2017).…”
Section: Choice of Classifier (mentioning, confidence: 99%)
“…RNNs are considered state of the art, but it is well known that they work best when they have access to large amounts of training data (Horsmann and Zesch, 2017). CRFs may be more amenable to small training data sets, but they may not scale up to a large label set (Horsmann and Zesch, 2017). Additionally, neural models can be pretrained on additional data from other domains and then optimized on our small training set.…”
Section: Choice of Classifier (mentioning, confidence: 99%)