2001
DOI: 10.1017/s1351324901002728

Applied morphological processing of English

Abstract: We describe two newly developed computational tools for morphological processing: a program for analysis of English inflectional morphology, and a morphological generator, automatically derived from the analyser. The tools are fast, being based on finite-state techniques, have wide coverage, incorporating data from various corpora and machine readable dictionaries, and are robust, in that they are able to deal effectively with unknown words. The tools are freely available. We evaluate the accuracy and spe…

Cited by 214 publications
(135 citation statements)
References 23 publications
“…It is the last component in the C&C tools pipeline (Curran et al, 2007), comprising a tokenizer (Evang et al, 2013), POStagger, lemmatizer (Minnen et al, 2001), and a robust parser for CCG, Combinatory Categorial Grammar (Steedman, 2001). Overall, this parsing framework shows many points of contact with the recent work by Artzi et al (2015), who also use CCG coupled with a formal compositional semantics.…”
Section: Boxer (mentioning)
confidence: 99%
“…There are a total of 5633 written, non-fictional texts with an average of 355 words per text and an average of approximately 20 words per sentence. For the purposes of this analysis, the financial corpus was transformed into a frequency list identical in format to the WFWSE BNC list, including lemmas identified using the morpha tool from the University of Sussex (Minnen et al 2001) and part-of-speech tags derived using the LT-POS tagger from the University of Edinburgh. This lemmatised and tagged corpus is compared with the BNC and its sub-corpora of imaginative and informative texts.…”
Section: Language Varieties: Fiction Vs Non-fiction (mentioning)
confidence: 99%
“…-The Charniak parser [9] and the morpha lemmatiser [10] to carry out the syntactic and morphological analysis. -WordNet 2.0 [11] to extract both the verbs in entailment, Ent set, and the derivationally related words, Der set.…”
Section: Experimental Settings (mentioning)
confidence: 99%