Citation count prediction (CCP) is an important research task that aims to automatically estimate the future impact of a scholarly paper. Previous studies mainly focus on extracting or mining useful features from the paper itself or its authors. An important data signal, peer review text, has not yet been used for the CCP task. In this paper, we take the initiative to use peer review data for the CCP task with a neural prediction model. Our focus is to learn a comprehensive semantic representation of peer review text to improve prediction performance. To achieve this goal, we incorporate an abstract-review match mechanism and a cross-review match mechanism to learn deep features from peer review text. We also integrate hand-crafted features via a wide component. The deep and wide components jointly make the prediction. Extensive experiments demonstrate the usefulness of the peer review data and the effectiveness of the proposed model. Our dataset has been released online.
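As a rough illustration of how a deep component and a wide component can jointly produce one prediction, the following minimal sketch sums a score computed from a learned representation with a linear score over hand-crafted features. It is not the model from the paper: the function names, the single ReLU hidden layer, and all weights are assumptions made for illustration, and the match mechanisms are not modeled at all.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def wide_and_deep_predict(review_vec, hidden_matrix, deep_weights,
                          wide_features, wide_weights, bias=0.0):
    """Hypothetical joint prediction: deep score + wide score + bias.

    review_vec    : stand-in for a learned review representation (assumed)
    hidden_matrix : one hidden layer applied to review_vec (assumed)
    wide_features : hand-crafted features fed directly to a linear model
    """
    # Deep part: one hidden layer with ReLU, then a linear read-out.
    hidden = relu([dot(row, review_vec) for row in hidden_matrix])
    deep_score = dot(deep_weights, hidden)
    # Wide part: a plain linear model over hand-crafted features.
    wide_score = dot(wide_weights, wide_features)
    return deep_score + wide_score + bias


prediction = wide_and_deep_predict(
    review_vec=[1.0, -2.0],
    hidden_matrix=[[1.0, 0.0], [0.0, 1.0]],
    deep_weights=[2.0, 3.0],
    wide_features=[0.5],
    wide_weights=[4.0],
    bias=1.0,
)
print(prediction)
```

In a trained model both sets of weights would be learned together, so the hand-crafted features and the learned review representation share a single loss.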
This article presents a network embedding approach to automatically generate tags for microblog users. Instead of using text data, we annotate microblog users with meaningful tags by leveraging rich social link data. To utilize directed social links, we use two kinds of node representations that model user interest in terms of followers and followees, respectively. To alleviate the sparsity problem, we propose a novel method based on two transformation functions for capturing implicit interest similarity. Unlike previous work on capturing high-order proximity, our model directly characterizes the effect of the context user on the proximity of node pairs. Another novelty of our model is that user importance scores learned with the classic PageRank algorithm are used to set the link weights. With these weights, our model is better able to disentangle the interest similarity evidence of a link. We jointly consider the above factors when designing the final objective function. We construct a very large evaluation set consisting of 2.6M users, 0.5M tags, and 0.8B following links. To our knowledge, it is the largest reported dataset for microblog user tagging in the literature. Extensive experiments on this dataset demonstrate the effectiveness of the proposed approach. We implement this approach with several optimization techniques, which make our model easy to scale to very large social networks. Ubiquitous social links provide important data resources for understanding user interests. Our work provides an effective and efficient solution for annotating user interests using link data alone, which has important practical value in industry. To illustrate the use of our models, we implement a demonstration system for visualizing, navigating, and searching microblog users.
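To make the PageRank-based link weighting more concrete, here is a minimal sketch of power-iteration PageRank over directed following links, followed by a placeholder mapping from scores to link weights. The abstract does not say how scores are turned into weights, so `link_weights` and its formula are purely hypothetical; the damping factor, iteration count, and toy graph are likewise assumptions.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over directed (follower, followee) edges."""
    nodes = sorted({user for edge in edges for user in edge})
    out_links = {user: [] for user in nodes}
    for follower, followee in edges:
        out_links[follower].append(followee)

    n = len(nodes)
    rank = {user: 1.0 / n for user in nodes}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n
                    for u in nodes}
        for follower, followees in out_links.items():
            if not followees:
                continue
            share = damping * rank[follower] / len(followees)
            for followee in followees:
                new_rank[followee] += share
        rank = new_rank
    return rank


def link_weights(edges, rank):
    """Hypothetical mapping from PageRank scores to link weights.

    Placeholder assumption: a link counts for less when the followee
    is very prominent. The actual weighting scheme is not given here.
    """
    n = len(rank)
    return {(u, v): 1.0 / (1.0 + n * rank[v]) for u, v in edges}


# Toy graph: a and b follow c, and c follows a.
edges = [("a", "c"), ("b", "c"), ("c", "a")]
scores = pagerank(edges)
weights = link_weights(edges, scores)
for edge in edges:
    print(edge, round(weights[edge], 3))
```

On a network with hundreds of millions of links, a sparse-matrix or distributed implementation would be needed; the dictionary-based loop above is only meant to show the computation.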