Given the large number of online text documents now available, such as news articles, weblogs, and scientific papers, effective methods for extracting keyphrases, which provide a high-level topic description of a document, are greatly needed. In this paper, we propose a supervised model for keyphrase extraction from research papers that are embedded in citation networks. To this end, we design novel features based on citation network information and use them in conjunction with traditional keyphrase extraction features, obtaining substantial improvements in performance over strong baselines.
With the exponential growth of scholarly data during the past few years, effective methods for topic classification are greatly needed. Current approaches usually require large amounts of expensive labeled data in order to make accurate predictions. In this paper, we posit that, in addition to a research article's textual content, its citation network also contains valuable information. We describe a co-training approach that uses the text and citation information of a research article as two different views to predict the topic of an article. We show that this method improves significantly over the individual classifiers, while also bringing a substantial reduction in the amount of labeled data required for training accurate classifiers.
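The abstract above does not specify the base classifiers or how newly labeled examples are selected, so the following is only a minimal sketch of generic co-training over two views, not the authors' implementation. It assumes a small multinomial Naive Bayes learner per view, represents the text view as word tokens and the citation view as cited-paper identifiers, and lets each view label its single most confident unlabeled article per round. All function names, parameters, and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Fit a multinomial Naive Bayes model on (tokens, label) pairs."""
    class_counts, token_counts, vocab = Counter(), defaultdict(Counter), set()
    for tokens, label in examples:
        class_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict_nb(model, tokens):
    """Return (best_label, posterior probability of that label)."""
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        denom = sum(token_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """Co-train one classifier per view (assumed scheme, see lead-in).

    labeled:   list of (text_tokens, cite_tokens, label)
    unlabeled: list of (text_tokens, cite_tokens)
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        models = (
            train_nb([(t, y) for t, _, y in labeled]),  # text view
            train_nb([(c, y) for _, c, y in labeled]),  # citation view
        )
        picks = {}
        for view, model in enumerate(models):
            scored = [(predict_nb(model, ex[view]), i) for i, ex in enumerate(pool)]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            for (label, _conf), i in scored[:per_round]:
                picks.setdefault(i, label)  # first view to pick an item wins
        # Move the picked items from the pool into the labeled set.
        for i in sorted(picks, reverse=True):
            t, c = pool.pop(i)
            labeled.append((t, c, picks[i]))
    text_model = train_nb([(t, y) for t, _, y in labeled])
    cite_model = train_nb([(c, y) for _, c, y in labeled])
    return text_model, cite_model


def predict_topic(text_model, cite_model, text_tokens, cite_tokens):
    """Return the label from whichever view is more confident."""
    by_text = predict_nb(text_model, text_tokens)
    by_cite = predict_nb(cite_model, cite_tokens)
    return max(by_text, by_cite, key=lambda p: p[1])[0]


# Toy data: two labeled and two unlabeled articles (purely illustrative).
labeled = [
    (["neural", "network"], ["p1"], "ML"),
    (["protein", "gene"], ["p2"], "BIO"),
]
unlabeled = [
    (["neural", "learning"], ["p1", "p3"]),
    (["gene", "sequence"], ["p2", "p4"]),
]
text_model, cite_model = co_train(labeled, unlabeled)
```

In this toy setup, the word "learning" and the cited paper "p3" never occur in the labeled articles, so the classifiers can only learn from them through the unlabeled articles that co-training adds to the training set.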
Keyphrases provide a high-level topic description of a document. Given the exponential growth in the number of documents on the Web in recent years, accurate methods for extracting keyphrases from such documents are greatly needed. In this study, we compare existing supervised approaches to this task to determine the current best-performing model. We use research articles on the Web as a case study.
To have a more meaningful impact, educational applications need to significantly improve the way feedback is offered to teachers and students. We propose two methods for determining propositional-level entailment relations between a reference answer and a student's response. Both methods, one using hand-crafted features and an SVM and the other using word embeddings and deep neural networks, achieve significant improvements over a state-of-the-art system and two alternative approaches.