Proceedings of the Ninth International Workshop on Parsing Technology - Parsing '05 2005
DOI: 10.3115/1654494.1654513

Exploring features for identifying edited regions in disfluent sentences

Abstract: This paper describes our effort on the task of edited region identification for parsing disfluent sentences in the Switchboard corpus. We focus our attention on exploring feature spaces and selecting good features, and start by analyzing the distributions of the edited regions and their components in the targeted corpus. We explore new feature spaces of a part-of-speech (POS) hierarchy and relaxed for rough copy in the experiments. These steps result in an improvement of 43.98% relative error reduction…

Cited by 6 publications (14 citation statements)
References: 12 publications
“…The strengths of these simple tree-based techniques should be combinable with sophisticated string-based (Liu, 2004; Zhang and Weng, 2005) approaches by applying the methods of Wieling et al. (2005) for constraining parses by externally-suggested brackets.…”
Section: Results (mentioning)
confidence: 99%
“…The features used here are grouped according to variables, which define feature sub-spaces as in Charniak and Johnson (2001) and Zhang and Weng (2005). In this work, we use a total of 62 variables, which include 16 variables from Charniak and Johnson (2001) and Johnson and Charniak (2004), an additional 29 variables from Zhang and Weng (2005), 11 hierarchical POS tag variables, and 8 prosody variables (labels and their confidence scores). Furthermore, we explore 377 combinations of these 62 variables, which include 40 combinations from Zhang and Weng (2005).…”
Section: Edit Region Identification Task (mentioning)
confidence: 99%
“…In this work, we use a total of 62 variables, which include 16 variables from Charniak and Johnson (2001) and Johnson and Charniak (2004), an additional 29 variables from Zhang and Weng (2005), 11 hierarchical POS tag variables, and 8 prosody variables (labels and their confidence scores). Furthermore, we explore 377 combinations of these 62 variables, which include 40 combinations from Zhang and Weng (2005). The complete list of the variables is given in Table 2, and the combinations used in the experiments are given in Table 3.…”
Section: Edit Region Identification Task (mentioning)
confidence: 99%
“…Most state-of-the-art methods for edit region detection, such as (Zhang and Weng, 2005; Liu et al., 2004; Honal and Schultz, 2005), model speech disfluencies as a noisy channel model. In a noisy channel model we assume that an unknown but fluent string F has passed through a disfluency-adding channel to produce the observed disfluent string D, and we then aim to recover the most likely input string F̂, defined as…”
Section: Related Work (mentioning)
confidence: 99%
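
The quoted statement breaks off before giving its definition. As a rough illustration only, the textbook noisy channel decision rule picks the fluent string F that maximizes P(F) · P(D | F). The sketch below applies that rule to a handful of candidate strings. The function name, the candidate sentences, and all probabilities are made up for illustration and are not taken from any of the cited papers.

```python
import math

def most_likely_fluent(candidates, log_prior, log_channel):
    """Return the candidate F that maximizes log P(F) + log P(D | F).

    log_prior(f)   -> log P(F): how plausible f is as fluent speech
    log_channel(f) -> log P(D | F): how likely the channel turns f into D
    Both scoring functions are hypothetical stand-ins for real models.
    """
    return max(candidates, key=lambda f: log_prior(f) + log_channel(f))

# Toy scores for the observed disfluent string
# D = "I want a flight to Boston uh to Denver" (all numbers invented).
scores = {
    "I want a flight to Denver": (math.log(0.6), math.log(0.5)),
    "I want a flight to Boston": (math.log(0.3), math.log(0.1)),
    "I want a flight to Boston to Denver": (math.log(0.1), math.log(0.4)),
}

best = most_likely_fluent(
    scores,
    log_prior=lambda f: scores[f][0],
    log_channel=lambda f: scores[f][1],
)
print(best)
```

Working in log space keeps the product of small probabilities numerically stable; a real system would search over a much larger candidate space rather than a fixed list.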