Proceedings of the ACL 2016 Student Research Workshop
DOI: 10.18653/v1/p16-3014
Graph- and surface-level sentence chunking

Abstract: The computing cost of many NLP tasks increases faster than linearly with the length of the representation of a sentence. For parsing, the representation is a string of tokens, while for operations on syntax and semantics it will be more complex. In this paper we propose a new task of sentence chunking: splitting sentence representations into coherent substructures. Its aim is to make further processing of long sentences more tractable. We investigate this idea experimentally using the Dependency Minimal Recursion Semantics…
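The abstract describes splitting a sentence's graph representation into coherent substructures. As a hedged illustration only (a toy stand-in, not the paper's DMRS-based algorithm), the sketch below cuts a small dependency graph at coordination edges and returns the resulting connected components as chunks:

```python
from collections import defaultdict

# Toy dependency graph for "She sang and he danced.":
# (head, relation, dependent) triples, nodes are word positions.
edges = [
    (2, "nsubj", 1),   # sang <- She
    (2, "conj",  5),   # sang -> danced (coordination)
    (5, "nsubj", 4),   # danced <- he
]

def chunk_graph(edges, cut_relations=("conj",)):
    """Split a dependency graph into substructures by cutting the
    given relations, then collecting connected components."""
    adj = defaultdict(set)
    nodes = set()
    for head, rel, dep in edges:
        nodes.update((head, dep))
        if rel not in cut_relations:   # edge stays inside a chunk
            adj[head].add(dep)
            adj[dep].add(head)
    chunks, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                   # depth-first component search
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        chunks.append(sorted(comp))
    return chunks

print(chunk_graph(edges))  # -> [[1, 2], [4, 5]]
```

Each component can then be processed independently, which is the point of the proposed task.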

Cited by 5 publications (7 citation statements) · References 8 publications
“…We might also introduce a heuristic to deal with long speech units, which are particularly troublesome for existing parsers. One technique we can adopt is that of 'clause splitting', or 'chunking', which subdivides long strings for the purpose of higher-quality analysis over small units (Tjong et al., 2001; Muszyńska, 2016). We hypothesise that such a step would play to the strength of existing parsers, namely their robustness over short inputs.…”
Section: Discussion
confidence: 99%
“…Named Entity Recognition was initially performed using extensive knowledge-base systems, orthographic features, and ontological and lexicon rules [4], [2], [14], [15]. However, the trend has since shifted towards neural network-based structures to define entity relations [5]. Chunking has been done using machine learning-based models such as HMMs (Hidden Markov Models) [7], [17] and the Maximum Entropy model, and has gradually seen a shift to statistical models such as Support Vector Machines and Boosting [8], [3], [7]. In more recent times, neural models have been on the rise as a tool for chunking.…”
Section: Prior Work
confidence: 99%
“…Chunking is the process of splitting the words of a sentence into tokens and then grouping the tokens in a meaningful way. These chunks are our point of interest and are used to solve the relevant NLP tasks [3]. Chunking labels every word of the sentence suitably and thus lays out a basic framework for bigger tasks such as question answering, information extraction, and topic modeling [16].…”
Section: Introduction
confidence: 99%
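The citation above describes surface-level chunking as grouping tokens meaningfully. A minimal sketch of that idea, under assumed toy rules (a greedy noun-phrase grouper, not any cited system's grammar):

```python
# POS-tagged input: (word, Penn-Treebank-style tag) pairs.
tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"),
          ("the", "DT"), ("dog", "NN")]

def np_chunk(tagged):
    """Greedy NP chunker: a determiner/adjective run ending in a
    noun forms one chunk; every other token stands alone."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag in ("DT", "JJ", "NN", "NNS"):
            current.append(word)
            if tag in ("NN", "NNS"):   # a noun closes the chunk
                chunks.append(("NP", current))
                current = []
        else:
            if current:                # flush an unfinished run
                chunks.append(("NP", current))
                current = []
            chunks.append((tag, [word]))
    if current:
        chunks.append(("NP", current))
    return chunks

print(np_chunk(tagged))
# -> [('NP', ['the', 'quick', 'fox']), ('VBZ', ['jumps']),
#     ('IN', ['over']), ('NP', ['the', 'dog'])]
```

Real chunkers learn these groupings (HMMs, SVMs, neural models, as the citation notes) rather than hard-coding them.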
“…This is because of the presence of grammatical structures not covered by the chunking algorithm, which lead to incorrect subgraphs. Limitations of the chunking algorithm are discussed in detail elsewhere (Muszyńska, 2016).…”
Section: Coverage and Performance
confidence: 99%
“…In this paper we propose chunking (Muszyńska, 2016) as a way to reduce memory and time cost of realization. The general idea of chunking is that strings and semantic representations can be divided into smaller parts which can be processed independently and then recombined without a loss of information.…”
Section: Introduction
confidence: 99%
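The last citation motivates chunking as a way to cut memory and time cost: pieces are processed independently and recombined. A back-of-envelope sketch of why this helps, assuming a hypothetical cubic-cost operation (e.g. chart parsing scales roughly with the cube of input length):

```python
# If an operation costs n**3 on a length-n input, splitting it into
# k equal chunks costs k * (n/k)**3 = n**3 / k**2 for the chunks,
# plus whatever recombination costs.
def cubic_cost(n):
    return n ** 3

n, k = 60, 3
whole = cubic_cost(n)              # 216000
chunked = k * cubic_cost(n // k)   # 3 * 20**3 = 24000
print(whole // chunked)            # 9, i.e. a k**2 speedup
```

The super-linear growth mentioned in the abstract is exactly what makes the divide-process-recombine pattern pay off.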