Proceedings - Natural Language Processing in a Deep Learning World 2019
DOI: 10.26615/978-954-452-056-4_133

Augmenting a BiLSTM Tagger with a Morphological Lexicon and a Lexical Category Identification Step

Abstract: Previous work on using BiLSTM models for PoS tagging has primarily focused on small tagsets. We evaluate BiLSTM models for tagging Icelandic, a morphologically rich language, using a relatively large tagset. Our baseline BiLSTM model achieves higher accuracy than any previously published tagger not taking advantage of a morphological lexicon. When we extend the model by incorporating such data, we outperform previous state-of-the-art results by a significant margin. We also report on work in progress that attem…

Cited by 8 publications (5 citation statements). References 8 publications.
“…incorrect spelling) and e (foreign) labels. In comparison, (Steingrímsson et al, 2019) achieve 94.04% accuracy.…”
Section: Part Of Speech (mentioning)
confidence: 94%
“…A good deal of work has been done on NLP for Icelandic that concerns these benchmarks. PoS tagging is implemented using a rule-based approach in the IceNLP toolkit (Loftsson and Rögnvaldsson, 2007a), and using a Bi-LSTM model in the ABLTagger (Steingrímsson et al, 2019). Constituency parsing has been implemented using a hand-crafted context-free grammar in the Greynir package (Þorsteinsson et al, 2019), using finite-state transducers in IceParser (Loftsson and Rögnvaldsson, 2007), and using an mBERT model in (Arnardóttir and Ingason, 2020).…”
Section: NLP For Icelandic (mentioning)
confidence: 99%
“…• PoS tagger: Before the LTPI started, the best performing PoS tagger for Icelandic was ABLTagger 0.9, a BiLSTM model implemented in DyNet, achieving an accuracy of 94.47% when evaluated on the MIM-GOLD corpus with the original tagset (Steingrímsson et al, 2019). During the LTPI, this tagger has been gradually improved.…”
Section: Support Tools (mentioning)
confidence: 99%
“…Nefnir is a lemmatiser which uses suffix substitution rules, derived from the Database of Icelandic Morphology (Bjarnadóttir et al, 2019), giving results that outperform IceNLP. ABLTagger (Steingrímsson et al, 2019) is a PoS tagger that outperforms other taggers that have been trained for tagging Icelandic texts. Some of these tools give good results, but can be improved upon.…”
Section: NLP Tools (mentioning)
confidence: 99%
“…A number of PoS-taggers have been developed for Icelandic, with the best results achieved by a recent bidirectional LSTM tagging model (Steingrímsson et al, 2019). While developing PoS taggers for Icelandic further using state-of-the-art methods, we will also study and try to estimate how much accuracy can theoretically be reached in tagging a variety of Icelandic text styles, using the tag set chosen for the LT programme (see Section 5.1).…”
Section: NLP Tools (mentioning)
confidence: 99%

Language Technology Programme for Icelandic 2019-2023

Nikulásdóttir, Guðnason, Ingason et al. 2020
Preprint (self-citation)