Keyphrases are important phrases that represent the theme of a document. With the help of keyphrases, people can quickly find useful information in massive amounts of data. Traditional statistics-based methods for keyphrase extraction use only the statistical features of words and ignore the semantic relationships between them. Recently emerging methods based on deep neural networks extract keyphrases by capturing semantic contextual information, but they do not consider statistical features. In this paper, we propose a new keyphrase extraction method based on a neural network architecture composed of deep and wide learning parts. In the deep learning part, BERT (Bidirectional Encoder Representations from Transformers) and Bi-LSTM (Bidirectional Long Short-Term Memory) models capture contextual semantic information from the word sequence, while in the wide learning part several important statistical features are used so that both parts jointly train the keyphrase extraction model. Experimental results on two public datasets show that our proposed model outperforms eight commonly used baseline keyphrase extraction methods.
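The combination of a wide part and a deep part described above can be illustrated with a minimal sketch. It is not the implementation from the paper: the two statistical features (normalized term frequency and relative first position), the function names `wide_features` and `joint_score`, and the single sigmoid output are all assumptions made for illustration, and the deep vector is a placeholder for what a BERT and Bi-LSTM encoder would produce per token.

```python
import math
from collections import Counter


def wide_features(tokens):
    """Per-token statistical features for the wide part.

    Assumption: the feature set here (term frequency, relative first
    position) is illustrative; the abstract does not list the features used.
    """
    lowered = [t.lower() for t in tokens]
    counts = Counter(lowered)
    first_seen = {}
    for i, tok in enumerate(lowered):
        first_seen.setdefault(tok, i)  # keep the earliest index only
    n = len(lowered)
    return [[counts[tok] / n, first_seen[tok] / n] for tok in lowered]


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def joint_score(deep_vec, wide_vec, w_deep, w_wide, bias=0.0):
    """Probability that a token belongs to a keyphrase.

    The deep and wide contributions are summed into one logit, so a loss on
    this output would update both sets of weights together (joint training).
    deep_vec is a hypothetical stand-in for a contextual encoder output.
    """
    logit = dot(w_deep, deep_vec) + dot(w_wide, wide_vec) + bias
    return 1.0 / (1.0 + math.exp(-logit))


tokens = ["keyphrase", "extraction", "with", "keyphrase", "features"]
feats = wide_features(tokens)
# Placeholder deep vector; a real model would supply one per token.
score = joint_score([0.3, -0.1], feats[0], w_deep=[1.0, 1.0], w_wide=[2.0, -1.0])
```

Summing the two contributions before the nonlinearity is one common way to combine wide and deep parts; whether the proposed model combines them this way is not stated in the abstract.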