This paper measures the social media activity of 15 broad scientific disciplines indexed in the Scopus database using Altmetric.com data. First, the presence of Altmetric.com data in the Scopus database is investigated, overall and across disciplines. Second, the correlation between bibliometric and altmetric indices is examined using Spearman correlation. Third, a zero-truncated negative binomial model is used to measure the extent to which different bibliometric and altmetric factors are associated with higher or lower citation counts. Lastly, the effectiveness of altmetric indices in identifying publications with high citation impact is evaluated using the Area Under the Curve (AUC) of the receiver operating characteristic. Results indicate a rapid increase in the presence of Altmetric.com data in the Scopus database, from 10.19% in 2011 to 20.46% in 2015. Blog count appears to be the most important factor, increasing the number of citations by 38.6% in the field of Health Professions and Nursing, followed by Twitter count, which increases the number of citations by 8% in the field of Physics and Astronomy. Interestingly, both Blog count and Twitter count are associated with an increase in the number of citations across all fields. Although the correlation between bibliometric and altmetric indices is positive but weak, the results show that altmetric indices can be a good indicator for discriminating highly cited publications, with an encouraging AUC of 0.725 between highly cited publications and total altmetric count. Overall, the findings suggest that altmetrics can help distinguish highly cited publications.
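As a rough illustration of the AUC evaluation described in this abstract, the sketch below computes AUC as the probability that a randomly chosen highly cited publication has a higher altmetric count than a randomly chosen publication that is not highly cited, with ties counted as one half. The function name `auc_from_scores` and the toy data are hypothetical. The abstract does not say how the authors computed their AUC, so this is only one standard formulation.

```python
def auc_from_scores(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5.

    scores: numeric indicator values (e.g. total altmetric count).
    labels: 1 for the positive class (e.g. highly cited), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
altmetric_counts = [12, 3, 7, 0, 25, 3]
highly_cited = [1, 0, 0, 0, 1, 1]
print(round(auc_from_scores(altmetric_counts, highly_cited), 3))
```

An AUC of 0.5 corresponds to a random ranking and 1.0 to perfect separation, so the reported value of 0.725 lies between the two.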
Information retrieval systems for scholarly literature rely heavily not only on text matching but also on semantic and context-based features. Readers are increasingly interested in how important an article is, what its purpose is, and how influential it is in follow-up research. Numerous machine learning and artificial intelligence techniques have been developed to enhance retrieval of the most influential scientific literature. In this paper, we compare and improve on four existing state-of-the-art techniques designed to identify influential citations. We consider 450 citations from the Association for Computational Linguistics corpus, classified by experts as either important or unimportant, and extract 64 features based on the methodology of the four state-of-the-art techniques. We apply the Extra-Trees classifier to select the 29 best features and apply the Random Forest and Support Vector Machine classifiers to all selected techniques. Using the Random Forest classifier, our supervised model improves on the state-of-the-art method by 11.25%, with 89% Precision-Recall area under the curve. Finally, we present our deep-learning model, a Long Short-Term Memory network, that uses all 64 features to distinguish important and unimportant citations with 92.57% accuracy.
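The precision-recall area reported in this abstract can be approximated in several ways. The sketch below uses average precision, a common step-wise estimate of the area under the precision-recall curve. It is an assumption that this matches the authors' computation, which the abstract does not specify. The function name `average_precision` and the toy scores are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision: mean of precision@rank over the ranks of positives.

    scores: classifier confidence that a citation is important.
    labels: 1 for important, 0 for unimportant.
    Tied scores keep their input order (no special tie handling).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank items from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy predictions, not taken from the paper.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(round(average_precision(scores, labels), 3))
```

Precision-recall measures are often preferred over accuracy when one class is rare, because they ignore true negatives.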
The pandemic has taken the world by storm. Almost the entire world went into lockdown to protect people from the deadly COVID-19. Scientists around the world have developed several vaccines for the virus. Among them, Pfizer, Moderna, and AstraZeneca have become widely known. Members of the general public, however, have been expressing their feelings about the safety and effectiveness of the vaccines on social media platforms such as Twitter. In this study, such tweets are extracted from Twitter using a Twitter API authentication token. The raw tweets are stored and processed using NLP. The processed data is then classified using a supervised KNN classification algorithm. The algorithm classifies the data into three classes: positive, negative, and neutral. These classes refer to the sentiment of the people whose tweets are extracted for analysis. The analysis shows that Pfizer has 47.29% positive, 37.5% negative, and 15.21% neutral sentiment; Moderna has 46.16% positive, 40.71% negative, and 13.13% neutral sentiment; and AstraZeneca has 40.08% positive, 40.06% negative, and 13.86% neutral sentiment.
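The KNN classification step described in this abstract can be sketched as follows. This is a minimal illustration that assumes bag-of-words features and cosine similarity. The abstract does not state which features, distance measure, or value of k the authors used. The function names and the labeled example tweets are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts from lowercase whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Majority label among the k training texts most similar to `text`."""
    query = vectorize(text)
    ranked = sorted(train, key=lambda pair: cosine(vectorize(pair[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled tweets, not taken from the study's data.
train = [
    ("the vaccine is safe and effective", "positive"),
    ("feeling great after my second dose", "positive"),
    ("so grateful the vaccine works", "positive"),
    ("terrible side effects from the vaccine", "negative"),
    ("i do not trust this vaccine", "negative"),
    ("the vaccine made me feel awful", "negative"),
    ("vaccine appointments open on monday", "neutral"),
    ("the clinic gives doses on monday", "neutral"),
    ("second dose scheduled for next week", "neutral"),
]
print(knn_predict(train, "the vaccine is safe"))
print(knn_predict(train, "awful side effects"))
```

In practice, the preprocessing (tokenization, stop-word removal, weighting such as TF-IDF) and the choice of k can strongly affect the resulting sentiment percentages.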