“…Although we employ an encoder-decoder architecture in the predictor component of our summarization framework, the framework can be applied to any sentence-extraction model that takes distributed representations as inputs, including other recently proposed attention-based encoder-decoder networks (Wang et al., 2016; Yang et al., 2016). Cheng and Lapata (2016) and Nallapati et al. (2017) argue that a stumbling block to applying neural network models to extractive summarization is the lack of training data, i.e., documents with sentences labeled as summary-worthy. To overcome this, several studies have used artificial reference summaries (Sun et al., 2005; Svore et al., 2007; Woodsend and Lapata, 2010; Cheng and Lapata, 2016) compiled by collecting documents and corresponding highlights from other sources. However, preparing such a parallel corpus often requires domain-specific or expert knowledge (Filippova et al., 2009; Parveen et al., 2016).…”
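To make concrete what "a sentence-extraction model taking distributed representations as inputs" means, here is a minimal sketch, not the authors' implementation: a PyTorch module (the class name `SentenceExtractor`, the embedding size of 128, and the bidirectional GRU encoder are all illustrative assumptions) that contextualizes pre-computed sentence embeddings and emits one summary-worthiness score per sentence. Any scorer with this interface could serve as the predictor component.

```python
# Hypothetical sketch of a sentence extractor over distributed
# representations; the architecture and dimensions are assumptions,
# not the paper's model.
import torch
import torch.nn as nn

class SentenceExtractor(nn.Module):
    def __init__(self, emb_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        # Encoder: contextualizes each sentence embedding within its document.
        self.encoder = nn.GRU(emb_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Scorer: one "summary-worthy" logit per sentence.
        self.scorer = nn.Linear(2 * hidden_dim, 1)

    def forward(self, sent_embs: torch.Tensor) -> torch.Tensor:
        # sent_embs: (batch, num_sentences, emb_dim)
        hidden, _ = self.encoder(sent_embs)
        return self.scorer(hidden).squeeze(-1)  # (batch, num_sentences)

# Usage: score 10 sentences of one document, keep the top 3 as the summary.
doc = torch.randn(1, 10, 128)  # stand-in sentence embeddings
logits = SentenceExtractor()(doc)
summary_idx = logits.topk(3, dim=-1).indices
```

Training such a scorer with a binary cross-entropy loss is exactly where the labeled-data bottleneck discussed above arises: each sentence needs a summary-worthy/not label, which the cited studies approximate with artificial reference summaries.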