2018
DOI: 10.1007/978-3-319-75477-2_29

Gut, Besser, Chunker – Selecting the Best Models for Text Chunking with Voting

Abstract: The CoNLL-2000 dataset is the de facto standard dataset for measuring chunkers on the task of chunking base noun phrases (NP) and arbitrary phrases. The state-of-the-art tagging method utilises TnT, an HMM-based Part-of-Speech (POS) tagger, with simple majority voting on different representations and fine-grained classes created by lexicalising tags. In this paper, the state-of-the-art English phrase chunking method was thoroughly investigated, re-implemented and evaluated with several modifications. We also in…
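The voting step described in the abstract amounts to a per-token majority vote over the label sequences that the individual taggers emit, once all outputs are mapped to a common representation. A minimal Python sketch of that idea (the function name and example data are illustrative; the paper's actual pipeline votes over TnT models trained on different chunk representations):

    from collections import Counter

    def majority_vote(predictions):
        # predictions: one label sequence per model, all already converted
        # to a common representation (e.g. IOB2) so the votes are comparable.
        voted = []
        for token_labels in zip(*predictions):
            winner, _count = Counter(token_labels).most_common(1)[0]
            voted.append(winner)
        return voted

    # Three hypothetical models voting on a four-token sentence:
    preds = [
        ["B-NP", "I-NP", "O", "B-VP"],
        ["B-NP", "I-NP", "O", "B-NP"],
        ["B-NP", "B-NP", "O", "B-VP"],
    ]
    print(majority_vote(preds))  # ['B-NP', 'I-NP', 'O', 'B-VP']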

Cited by 2 publications (6 citation statements)
References 8 publications

“…The lexicalization invented by Molina and Pla [4] is thoroughly investigated by Indig and Endrédy [2] and they present a mildly lexicalized variant (see Table 2) of the method that has superior performance.…”
Section: Lexicalization
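The mildly lexicalized variant mentioned in this statement appends the word form to the chunk tag only for sufficiently frequent words, which keeps the tagset small. A minimal sketch of such frequency-thresholded lexicalization (the tag+word format, helper name and threshold value are illustrative assumptions, not the cited papers' exact scheme):

    from collections import Counter

    def lexicalize(tokens, labels, word_counts, threshold=100):
        # Append the word form to the chunk label only for words seen at
        # least `threshold` times in training; rare words keep the plain
        # label, so the number of distinct tags stays manageable.
        out = []
        for word, label in zip(tokens, labels):
            if word_counts[word.lower()] >= threshold:
                out.append(f"{label}+{word.lower()}")
            else:
                out.append(label)
        return out

    # Toy example: only "the" is frequent enough to be lexicalized.
    counts = Counter({"the": 500, "cat": 3, "sat": 2})
    print(lexicalize(["The", "cat", "sat"], ["B-NP", "I-NP", "B-VP"], counts))
    # ['B-NP+the', 'I-NP', 'B-VP']

Lowering the threshold lexicalizes more words and thus inflates the tagset, which is exactly the issue the remark below about low thresholds points to.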
“…The latter problem can be overcome by using a proper IOB converter which can also fix well-formedness issues (see Section 2 for details). The main motivation of the approach of Indig and Endrédy [2] was to reduce the number of labels because for agglutinative languages the original method is not feasible due to the high number of tags even without any lexicalization. But we also remark that this problem also exists with low thresholds used during lexicalization.…”
Section: Lexicalization
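The well-formedness repair referred to in this statement is, for the IOB2 representation, the standard fix of rewriting any I-X tag that does not continue a chunk of type X as B-X. A minimal sketch of such a converter (a common repair strategy; the cited converter's exact behaviour may differ):

    def fix_iob2(labels):
        # An I-X tag must continue a preceding B-X or I-X of the same chunk
        # type; otherwise it is rewritten as B-X, opening a new chunk.
        fixed = []
        prev = "O"
        for label in labels:
            if label.startswith("I-") and prev not in ("B-" + label[2:], "I-" + label[2:]):
                label = "B-" + label[2:]
            fixed.append(label)
            prev = label
        return fixed

    print(fix_iob2(["I-NP", "I-NP", "O", "I-VP"]))
    # ['B-NP', 'I-NP', 'O', 'B-VP']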