Paraphrase generation is an important yet challenging task in natural language processing. Neural network-based approaches have achieved remarkable success in sequence-to-sequence learning. Previous work on paraphrase generation generally ignores syntactic information even when it is available, on the assumption that neural networks can learn such linguistic knowledge implicitly. In this work, we investigate the efficacy of explicit syntactic information for paraphrase generation. Syntactic information can take the form of dependency trees, which are easily obtained from off-the-shelf syntactic parsers. Such tree structures can be encoded with graph convolutional networks to obtain more meaningful sentence representations, which in turn can improve the generated paraphrases. Through extensive experiments on four paraphrase datasets of different sizes and genres, we demonstrate the utility of syntactic information in neural paraphrase generation under the sequence-to-sequence framework. Specifically, our graph convolutional network-enhanced models consistently outperform their syntax-agnostic counterparts across multiple evaluation metrics.
Paraphrase generation is an important yet challenging task in NLP. Neural network-based approaches have achieved remarkable success in sequence-to-sequence (seq2seq) learning. Previous work on paraphrase generation generally ignores syntactic information even when it is available, on the assumption that neural networks can learn such linguistic knowledge implicitly. In this work, we investigate the efficacy of explicit syntactic information for paraphrase generation. Syntactic information can take the form of dependency trees, which are easily obtained from off-the-shelf syntactic parsers. Such tree structures can be encoded with graph convolutional networks (GCNs) to obtain more meaningful sentence representations, which in turn can improve the generated paraphrases. Through extensive experiments on four paraphrase datasets of different sizes and genres, we demonstrate the utility of syntactic information in neural paraphrase generation under the seq2seq framework. Specifically, our GCN-enhanced models consistently outperform their syntax-agnostic counterparts on multiple evaluation metrics.
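To make the idea of encoding a dependency tree with a graph convolutional network concrete, the following is a minimal sketch of a single graph-convolution layer in plain Python. It is not the authors' implementation: the function names `dependency_adjacency` and `gcn_layer`, the toy sentence, the head indices, the mean-style normalization, and the identity weight matrix are all illustrative assumptions. In a full seq2seq model, the input features would more plausibly be contextual encoder states than the toy vectors used here.

```python
# Minimal sketch: one graph-convolution layer over a dependency tree.
# The sentence, head indices, features, and weights are illustrative assumptions.

def dependency_adjacency(heads):
    """Build a symmetric adjacency matrix with self-loops.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return adj


def gcn_layer(adj, features, weight):
    """Compute ReLU(D^-1 A H W), where D^-1 A is the row-normalised adjacency."""
    n = len(adj)
    d_in = len(weight)
    d_out = len(weight[0])
    out = []
    for i in range(n):
        degree = sum(adj[i])
        # Average the feature vectors of token i and its tree neighbours.
        agg = [sum(adj[i][j] * features[j][k] for j in range(n)) / degree
               for k in range(d_in)]
        # Linear projection followed by ReLU.
        out.append([max(0.0, sum(agg[k] * weight[k][m] for k in range(d_in)))
                    for m in range(d_out)])
    return out


# Hypothetical example: "she reads books", with "reads" as the root.
heads = [1, -1, 1]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token embeddings
weight = [[1.0, 0.0], [0.0, 1.0]]                # identity projection

adj = dependency_adjacency(heads)
hidden = gcn_layer(adj, features, weight)
```

Each row of `hidden` mixes a token's own vector with those of its head and dependents, which is the sense in which the representation becomes syntax-aware. Stacking several such layers would let information travel along longer paths in the tree.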
Paraphrase generation is an essential yet challenging task in natural language processing. Neural network-based approaches to paraphrase generation have achieved remarkable success in recent years. Previous neural paraphrase generation approaches ignore linguistic knowledge, such as part-of-speech information, even when it is available. The underlying assumption is that neural networks can learn such information implicitly when given sufficient data. However, it is difficult for neural networks to learn such information properly when data are scarce. In this work, we investigate the efficacy of explicit part-of-speech information for paraphrase generation in low-resource scenarios. To this end, we devise three mechanisms to fuse part-of-speech information under the sequence-to-sequence learning framework. We demonstrate the utility of part-of-speech information in low-resource paraphrase generation through extensive experiments on multiple datasets of varying sizes and genres.
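The abstract above does not name its three fusion mechanisms, so the sketch below shows only one common way part-of-speech information could be fed to a sequence-to-sequence encoder: concatenating each word embedding with an embedding of its tag. This may or may not correspond to any of the authors' mechanisms. The lookup tables `WORD_EMB` and `POS_EMB`, the tag set, the function name `fuse_by_concatenation`, and all vectors are hypothetical.

```python
# Sketch: fuse part-of-speech information by concatenating embeddings.
# Vocabulary, tag set, and vectors are toy assumptions, not from the paper.

WORD_EMB = {"she": [0.1, 0.2], "reads": [0.3, 0.4], "books": [0.5, 0.6]}
POS_EMB = {"PRON": [1.0, 0.0], "VERB": [0.0, 1.0], "NOUN": [1.0, 1.0]}


def fuse_by_concatenation(tokens, tags):
    """Return one encoder input vector per token: [word embedding ; tag embedding]."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    # List concatenation joins the two vectors end to end.
    return [WORD_EMB[tok] + POS_EMB[tag] for tok, tag in zip(tokens, tags)]


encoder_inputs = fuse_by_concatenation(
    ["she", "reads", "books"],
    ["PRON", "VERB", "NOUN"],
)
```

Concatenation keeps the word and tag signals in separate dimensions, so the encoder can weight them independently. Alternatives such as summing the two embeddings would need both to share the same dimensionality.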