Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2014
DOI: 10.3115/v1/p14-2043

Part-of-Speech Tagging using Conditional Random Fields: Exploiting Sub-Label Dependencies for Improved Accuracy

Abstract: We discuss part-of-speech (POS) tagging in presence of large, fine-grained label sets using conditional random fields (CRFs). We propose improving tagging accuracy by utilizing dependencies within sub-components of the fine-grained labels. These sub-label dependencies are incorporated into the CRF model via a (relatively) straightforward feature extraction scheme. Experiments on five languages show that the approach can yield significant improvement in tagging accuracy in case the labels have sufficiently rich…

Cited by 20 publications (11 citation statements)
References 9 publications
“…CRF has been used for many different tasks, especially dealing with sequence labeling such as POS tagging (Lafferty et al, 2001a;Silfverberg et al, 2014) and named entity recognition (McCallum and Li, 2003;Settles, 2004). Similar to us, three out of seven participating teams also used CRF for codeswitching detection for the EMNLP 2014 language identification shared task (Solorio et al, 2014).…”
Section: Related Work (mentioning)
confidence: 89%
“…CRF has been used for many different tasks, especially dealing with sequence labeling such as POS tagging (Lafferty et al, 2001a;Silfverberg et al, 2014) and named entity recognition (McCallum and Li, 2003;Settles, 2004). Similar to us, three out of seven participating teams also used CRF for codeswitching detection for the EMNLP 2014 language identification shared task .…”
Section: Introduction (mentioning)
confidence: 90%
“…Maharjan et al (2015) collected codeswitched tweets for Spanish-English and Nepali-English language pairs. They first figured out some seed users who codeswitched frequently and then followed him/her to collect more codeswitched tweets. They obtained an accuracy of 86% and 87% for Spanish-English and Nepali-English dataset using CRF GE algorithm. CRF has been used for many different tasks, especially dealing with sequence labeling such as POS tagging (Lafferty et al, 2001a;Silfverberg et al, 2014) and named entity recognition (McCallum and Li, 2003;Settles, 2004). Similar to us, three out of seven participating teams also used CRF for codeswitching detection for the EMNLP 2014 language identification shared task .…”
mentioning
confidence: 90%
“…A typical example of such a structured morphological label is the label Noun|Sg|Nom, which consists of three sub units: the main word class Noun, the singular number Sg and the nominative case Nom. FinnPos utilizes the internal structure of complex labels by extracting features for sub-units as well as for the entire labels [19]. This alleviates the data sparsity problem because features relating to sub-units of entire tags are used as fall-back.…”
Section: Finnpos For Morphologically Rich Languages (mentioning)
confidence: 99%
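
The sub-unit fall-back described in the statement above can be illustrated with a minimal sketch. This is not FinnPos's or the paper's actual implementation: the function names `sub_labels` and `label_features`, the `|` separator argument, and the observation feature `word=talo` are all hypothetical, chosen only to show how one observation can be paired with both the full label and each of its sub-units.

```python
def sub_labels(label, sep="|"):
    """Split a structured label such as 'Noun|Sg|Nom' into its sub-units."""
    return label.split(sep)


def label_features(observation, label, sep="|"):
    """Pair an observation feature with the full label and with each sub-unit.

    Hypothetical helper: the full-label feature comes first, followed by one
    fall-back feature per sub-unit when the label is actually structured.
    """
    feats = [(observation, label)]
    parts = sub_labels(label, sep)
    if len(parts) > 1:
        feats.extend((observation, part) for part in parts)
    return feats


# Two distinct fine-grained labels for the same (hypothetical) observation.
singular = label_features("word=talo", "Noun|Sg|Nom")
plural = label_features("word=talo", "Noun|Pl|Nom")

# The full-label features differ, but the sub-unit features for 'Noun' and
# 'Nom' are shared, which is the fall-back effect the statement describes.
shared = set(singular) & set(plural)
print(sorted(shared))
```

In this sketch, evidence seen with one label contributes to the shared `Noun` and `Nom` features, so a rare full label can still be scored through its more frequent sub-units.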