Proceedings of the Workshop on Speech-Centric Natural Language Processing 2017
DOI: 10.18653/v1/w17-4604

Parsing transcripts of speech

Abstract: We present an analysis of parser performance on speech data, comparing word type and token frequency distributions with written data, and evaluating parse accuracy by length of input string. We find that parser performance tends to deteriorate with increasing length of string, more so for spoken than for written texts. We train an alternative parsing model with added speech data and demonstrate improvements in accuracy on speech-units, with no deterioration in performance on written text.
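
The length-based evaluation mentioned in the abstract can be pictured with a short sketch. The snippet below is not the paper's code: it simply bins unlabeled attachment score (UAS) by input length, with illustrative data structures and a hypothetical bin size.

```python
# Minimal sketch of the abstract's evaluation idea: binning attachment
# accuracy by input length. Function and field names are illustrative,
# not taken from the paper's implementation.

from collections import defaultdict

def attachment_scores_by_length(gold_sents, pred_sents, bin_size=10):
    """Compute unlabeled attachment score (UAS) per length bin.

    Each sentence is a list of (head, deprel) tuples, with gold and
    predicted sentences in parallel order.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold, pred in zip(gold_sents, pred_sents):
        bin_id = len(gold) // bin_size  # bin 0 covers lengths 0-9, bin 1 covers 10-19, ...
        for (g_head, _), (p_head, _) in zip(gold, pred):
            total[bin_id] += 1
            correct[bin_id] += int(g_head == p_head)
    return {b: correct[b] / total[b] for b in sorted(total)}

# Toy example: two short "sentences" with gold vs. predicted heads.
gold = [[(2, "nsubj"), (0, "root")], [(3, "det"), (0, "root"), (2, "obj")]]
pred = [[(2, "nsubj"), (0, "root")], [(2, "det"), (0, "root"), (2, "obj")]]
print(attachment_scores_by_length(gold, pred))  # {0: 0.8}
```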

Cited by 5 publications (5 citation statements) · References 23 publications (20 reference statements)

“…In addition, no significant improvement is gained if the written data is modified so as to exclude punctuation (ssj no-punct) or perform lowercasing (ssj lc), which even worsens the results. Somewhat surprisingly, no definite conclusion can be drawn on the joint training model based on both spoken and written data (sst+ssj), as the parsers give significantly different results: while the Stanford parser substantially outperforms the baseline result when adding written data to the model (similar to the findings by Caines et al. (2017)), this addition has a negative effect on UDPipe. This could be explained by the fact that global, exhaustive, graph-based parsing systems are more capable of leveraging the richer contextual information gained with a larger training set than local, greedy, transition-based systems (McDonald and Nivre, 2007).…”
Section: Modifications of UD Annotation
confidence: 97%
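
The joint spoken+written training setup (sst+ssj) discussed in this statement amounts to concatenating two CoNLL-U treebanks into one training file before running a parser's standard training step. Below is a minimal, hypothetical sketch of that preparation step; the file names are placeholders, and the cited experiments may have combined the data differently.

```python
# Hedged sketch of building a combined spoken+written CoNLL-U training
# file, to be handed to a parser's usual training command. File names
# are placeholders, not paths from the cited work.

def concat_conllu(paths, out_path):
    """Concatenate CoNLL-U treebanks, preserving the blank line that
    separates sentence blocks."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as f:
                text = f.read().strip()
            if text:
                out.write(text + "\n\n")  # CoNLL-U sentences end with a blank line

concat_conllu(["sst-spoken.conllu", "ssj-written.conllu"], "sst+ssj.conllu")
# The combined file can then be passed to, e.g., UDPipe's or the
# Stanford parser's standard training entry point.
```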
“…Nevertheless, apart from research on speech-specific parsing systems, very little research has been dedicated to other, data-related aspects of spoken language parsing. To our knowledge, with the exception of Caines et al. (2017) and Nasr et al. (2014), who investigate the role of different types of training data used for parsing transcripts of speech, there have been no other systematic studies on the role of spoken data representations, such as transcription or annotation conventions, in spoken language parsing.…”
Section: Related Work
confidence: 99%
“…This poses a particular challenge, as most models used in data pre-processing and representation learning have been trained on written, not spoken, texts (Caines et al., 2017). Furthermore, most existing approaches to speech grading do have access to audio features, and indeed extract a large number of prosodic or duration-based features (Zechner et al., 2009; Higgins et al., 2011; Loukina et al., 2017; Wang et al., 2018a).…”
Section: Related Work
confidence: 99%
“…Unlike written discourse, speech is full of disfluencies, which make discovering the underlying syntactic structure challenging, as they interrupt the syntactic structure of the utterance (Caines et al., 2017). For example, according to Meteer and Taylor (1995), 17% of the tokens in the Switchboard telephone conversations are disfluencies of various kinds.…”
Section: Related Work
confidence: 99%
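
As a rough illustration of what disfluencies look like at the token level, the sketch below flags two easy cases, filled pauses and immediate word repetitions, in a toy utterance. This is a deliberately crude heuristic, not the Switchboard annotation scheme of Meteer and Taylor (1995) or a method from any of the cited papers.

```python
# Illustrative sketch only: a crude heuristic for flagging two common
# disfluency types (filled pauses and immediate word repetitions).
# Real disfluency annotation is far richer than this.

FILLED_PAUSES = {"uh", "um", "uh-huh", "hm"}

def flag_disfluencies(tokens):
    """Return a parallel list of booleans: True = likely disfluent token."""
    flags = []
    prev = None
    for tok in tokens:
        low = tok.lower()
        flags.append(low in FILLED_PAUSES or low == prev)
        prev = low
    return flags

utterance = "i i uh think the the parser um struggles here".split()
flags = flag_disfluencies(utterance)
print([t for t, f in zip(utterance, flags) if f])  # ['i', 'uh', 'the', 'um']
print(f"disfluency rate: {sum(flags) / len(flags):.0%}")  # 40%
```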