This paper measures the social media activity of 15 broad scientific disciplines indexed in the Scopus database using Altmetric.com data. First, the presence of Altmetric.com data in the Scopus database is investigated, overall and across disciplines. Second, the correlation between bibliometric and altmetric indices is examined using Spearman correlation. Third, a zero-truncated negative binomial model is used to measure the extent to which different bibliometric and altmetric factors are associated with increases or decreases in citation counts. Lastly, the effectiveness of altmetric indices in identifying publications with high citation impact is evaluated using the Area Under the Curve (AUC), an application of receiver operating characteristic analysis. Results indicate a rapid increase in the presence of Altmetric.com data in the Scopus database, from 10.19% in 2011 to 20.46% in 2015. Blog count appears to be the most important factor, increasing the number of citations by 38.6% in the field of Health Professions and Nursing, followed by Twitter count, which increases the number of citations by 8% in the field of Physics and Astronomy. Interestingly, both Blog count and Twitter count are associated with an increase in the number of citations across all fields. Although the correlation between bibliometric and altmetric indices is positive but weak, the results show that altmetric indices can be a good indicator for discriminating highly cited publications, with an encouraging AUC of 0.725 between highly cited publications and total altmetric count. Overall, the findings suggest that altmetrics could help distinguish highly cited publications.
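To make the two evaluation measures named in this abstract concrete, the following is a minimal, standard-library sketch of Spearman correlation (Pearson correlation of average ranks) and of AUC computed as the probability that a randomly chosen highly cited publication has a higher altmetric count than a randomly chosen other publication. The toy data, the threshold for "highly cited", and all function names are assumptions made for illustration; none of them come from the paper.

```python
from itertools import product


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def auc(scores, labels):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for s, flag in zip(scores, labels) if flag]
    neg = [s for s, flag in zip(scores, labels) if not flag]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p, q in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data (not from the paper).
altmetric_counts = [0, 1, 3, 8, 2, 15, 0, 5]
citations = [2, 1, 4, 30, 12, 25, 3, 6]
# Hypothetical cutoff for "highly cited"; the paper's definition may differ.
highly_cited = [c >= 12 for c in citations]

rho = spearman(altmetric_counts, citations)
area = auc(altmetric_counts, highly_cited)
print(f"Spearman rho = {rho:.3f}, AUC = {area:.3f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.725 should be read. The pairwise formulation is quadratic in the number of publications, so a rank-based implementation would be preferable for data on the scale of Scopus.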
Information retrieval systems for scholarly literature rely heavily not only on text matching but also on semantic- and context-based features. Readers increasingly want to know how important an article is, what its purpose is, and how influential it is in follow-up research. Numerous techniques that tap the power of machine learning and artificial intelligence have been developed to enhance retrieval of the most influential scientific literature. In this paper, we compare and improve on four existing state-of-the-art techniques designed to identify influential citations. We consider 450 citations from the Association for Computational Linguistics corpus, classified by experts as either important or unimportant, and extract 64 features based on the methodology of the four state-of-the-art techniques. We apply the Extra-Trees classifier to select the 29 best features and apply the Random Forest and Support Vector Machine classifiers to all selected techniques. Using the Random Forest classifier, our supervised model improves on the state-of-the-art method by 11.25%, with 89% Precision-Recall area under the curve. Finally, we present our deep-learning model, a Long Short-Term Memory network, which uses all 64 features to distinguish important from unimportant citations with 92.57% accuracy.
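The Precision-Recall area under the curve reported above can be illustrated with average precision, one common step-wise estimate of that area. The abstract does not say which estimator the authors used, so this is a sketch under that assumption; the classifier scores, the labels, and the function name are all hypothetical, and tied scores are not given special treatment.

```python
def average_precision(scores, labels):
    """Average precision: mean of precision@k over the ranks k of the positives.

    Assumes at least one positive label and no tied scores.
    """
    # Rank citations from most to least confident "important" prediction.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for _, label in ranked if label)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_positives


# Hypothetical classifier scores for six citations (1 = important).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]

ap = average_precision(scores, labels)
print(f"average precision = {ap:.4f}")
```

Unlike ROC-based AUC, this measure ignores true negatives, which is why Precision-Recall area is often preferred when one class (here, important citations) is much rarer than the other.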