Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics - 1999
DOI: 10.3115/1034678.1034754

A statistical parser for Czech

Abstract: This paper considers statistical parsing of Czech, which differs radically from English in at least two respects: (1) it is a highly inflected language, and (2) it has relatively free word order. These differences are likely to pose new problems for techniques that have been developed on English. We describe our experience in building on the parsing model of (Collins 97). Our final results (80% dependency accuracy) represent good progress towards the 91% accuracy of the parser on English (Wall Street Journal) te…
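
The abstract reports its results as dependency accuracy. As a minimal sketch, assuming this means the fraction of words whose predicted head matches the gold-standard head (the function name, the head-index encoding, and the toy sentence below are illustrative assumptions, not taken from the paper), the metric could be computed like this:

```python
from typing import Sequence


def dependency_accuracy(
    gold_heads: Sequence[Sequence[int]],
    pred_heads: Sequence[Sequence[int]],
) -> float:
    """Fraction of words whose predicted head index equals the gold head index.

    Each sentence is a list of head indices, one per word; 0 marks the root.
    This encoding is an assumption made for illustration.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted corpora differ in sentence count")

    correct = 0
    total = 0
    for gold, pred in zip(gold_heads, pred_heads):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        # Count the words that were attached to the correct head.
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)

    if total == 0:
        raise ValueError("no words to evaluate")
    return correct / total


# Hypothetical five-word sentence: the parser attaches one word to the wrong head.
gold = [[2, 0, 2, 5, 3]]
pred = [[2, 0, 2, 3, 3]]
score = dependency_accuracy(gold, pred)  # 4 of 5 heads correct
```

Under this definition every word counts once, so long sentences weigh more than short ones in the corpus-level score.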

Cited by 101 publications
(87 citation statements)
References 8 publications
“…These filtering criteria are discussed in more detail in the experimental sections. The remaining set of projected trees becomes the treebank that will be used to train a new dependency parser - we conduct our experiments using a version of the Collins parser that has been adapted for dependency treebanks (Collins et al 1999). Once trained, the new parser is ready to generate dependency analyses for unseen new sentences in that language.…”
Section: Our Projection Framework For Bootstrapping Parsers (mentioning)
confidence: 99%
“…First, Czech is a "highly inflected" language: the role of function words in the Germanic and Romance languages is typically filled by suffixes in Czech. Second, Czech exhibits a "relatively free word order" [7]. Since a great deal of the POS information exploited by an HMM tagger is contained in sequences of function words, these features of Czech hinder the performance of an HMM POS tagger.…”
Section: Single-source Taggers (mentioning)
confidence: 99%
“…A Type-III tree will be built by using the part-of-speech of the visited node x as the root, connecting the produced sub-tree tmp to the root. If the child of the visited node y does not have any children, a Type-II tree will be built instead (lines 20-21). Figure 7 shows an example of DIG elementary trees extracted from the annotated-tree text "I ate boiled rice with my friend".…”
Section: Extracting Elementary Trees From the Treebank (mentioning)
confidence: 99%
“…Of course, a rich set of training data and accurate knowledge are crucial for this method. Various methods have been proposed for the learning part of this approach: learning actions of a deterministic parser [18], [19], learning similarity of tree structures [20], [21], and learning the scores of dependencies [22]-[24].…”
Section: Lines (mentioning)
confidence: 99%