Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL '09), 2009
DOI: 10.3115/1609067.1609158

Learning efficient parsing

Abstract: A corpus-based technique is described to improve the efficiency of wide-coverage, high-accuracy parsers. By keeping track of the derivation steps that lead to the best parse for a very large collection of sentences, the parser learns which parse steps can be filtered without significant loss in parsing accuracy, but with a considerable increase in parsing efficiency. An interesting characteristic of our approach is that it is self-learning, in the sense that it uses unannotated corpora.
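As a rough illustration of the idea in the abstract, the sketch below counts how often each derivation step occurs in the best parses of a large unannotated corpus and then keeps only steps seen often enough. This is a minimal sketch under assumed interfaces: the parser methods (parse_best, derivation_steps) and the frequency threshold are hypothetical, not the paper's actual API or settings.

```python
# Minimal sketch of the self-learning filtering idea, assuming a hypothetical
# parser object with parse_best() and derivation_steps(); the threshold is
# illustrative, not taken from the paper.
from collections import Counter

def collect_step_counts(parser, sentences):
    """Parse a large unannotated corpus and count how often each
    derivation step occurs in the best parse chosen by the parser."""
    counts = Counter()
    for sentence in sentences:
        best_parse = parser.parse_best(sentence)    # best parse per the parser's heuristics
        for step in best_parse.derivation_steps():  # steps used in that derivation
            counts[step] += 1
    return counts

def build_step_filter(counts, min_count=5):
    """Allow only derivation steps seen at least min_count times in best
    parses; the parser can skip the rest, trading a small amount of
    accuracy for a large gain in efficiency."""
    allowed = {step for step, count in counts.items() if count >= min_count}
    return lambda step: step in allowed
```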

Cited by 3 publications (2 citation statements) | References: 6 publications
“…With these options, the parser delivers a single parse, which it believes is the best parse according to a variety of heuristics. These include the disambiguation model and various optimizations of the parser presented in [9,16,17]. Furthermore, a timeout is enforced in order that the parser cannot spend more than 190 s on a single sentence.…”
Section: Estimation of the Quality of Lassy Large
confidence: 99%
“…Its grammar is designed following ideas of Head-driven Phrase Structure Grammar (Pollard and Sag, 1994), it uses a maximum-entropy model for statistical disambiguation, and coverage has been increased over the years by means of semi-automatic extension of the lexicon based on error-mining (van Noord, 2004). Efficiency is improved by using a part-of-speech tagger to filter out unlikely POS tags before parsing (Prins and van Noord, 2001), and by means of a technique which filters unlikely derivations based on statistics collected from automatically parsed corpora (van Noord, 2009). Alpino is a crucial component of Joost, an open-domain question-answering system for Dutch.…”
Section: Dependency Information for Question Answering and Relation Extraction
confidence: 99%
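The statement above also mentions filtering unlikely POS tags with a tagger before parsing (attributed to Prins and van Noord, 2001). The sketch below illustrates that general idea only; the tagger's tag_distribution method, the lexicon's entries method, and the probability threshold are assumed names for illustration, not the interfaces of Alpino or of the cited work.

```python
# Minimal sketch of POS-tag filtering before parsing; the tagger and lexicon
# interfaces and the probability threshold are assumptions for illustration.
def filter_lexical_entries(tagger, lexicon, sentence, threshold=0.01):
    """For each token, keep only lexical entries whose POS tag the tagger
    considers sufficiently likely, shrinking the parser's search space."""
    filtered = []
    for token in sentence.split():
        tag_probs = tagger.tag_distribution(token, sentence)   # {tag: probability}
        likely_tags = {tag for tag, p in tag_probs.items() if p >= threshold}
        entries = [e for e in lexicon.entries(token) if e.pos in likely_tags]
        filtered.append(entries if entries else lexicon.entries(token))  # fall back if all filtered
    return filtered
```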