Stylometry is the statistical analysis of variations in an author's literary style. The technique has been used in many linguistic analysis applications, such as author profiling, authorship identification, and authorship verification. Over the past two decades, authorship identification has been extensively studied by researchers in the area of natural language processing. However, these studies are generally limited to (i) a small number of candidate authors, and (ii) documents of similar lengths. In this paper, we propose a novel solution that models authorship attribution as a set similarity problem to overcome these two limitations. We conducted extensive experimental studies on a real dataset collected from an online book archive, Project Gutenberg. Experimental results show that, in comparison to existing stylometry studies, our proposed solution can handle a larger number of documents of different lengths written by a larger pool of candidate authors with high accuracy.
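To make the idea of authorship attribution as a set similarity problem concrete, the sketch below represents each text as a set of character trigrams and attributes an unknown text to the candidate author whose known text has the highest Jaccard similarity. This is only an illustration under assumptions: the abstract does not say which features or which similarity measure the paper uses, and the names `char_ngrams`, `jaccard`, and `attribute` are hypothetical.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a text (assumed feature set)."""
    # Lowercase and collapse whitespace so layout differences do not matter.
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def attribute(unknown_text, known_texts, n=3):
    """Pick the author whose known text is most similar to the unknown text.

    known_texts maps a (hypothetical) author name to a sample of that
    author's writing. Returns the best author and all similarity scores.
    """
    query = char_ngrams(unknown_text, n)
    scores = {
        author: jaccard(query, char_ngrams(text, n))
        for author, text in known_texts.items()
    }
    best_author = max(scores, key=scores.get)
    return best_author, scores


if __name__ == "__main__":
    # Toy inputs for illustration only; not data from the paper.
    samples = {
        "author_a": "the cat sat on the mat",
        "author_b": "zzzz qqqq xxxx",
    }
    print(attribute("the cat sat", samples))
```

Because a set discards how often each feature occurs, a comparison of this kind depends less directly on document length than a raw frequency vector does, which may be one reason a set-based formulation suits documents of different lengths.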
Stylometry is a statistical technique used to analyze variations in authors' writing styles and is typically applied to authorship attribution problems. In this investigation, we apply stylometry to the task of authorship identification of multi-author documents (AIMD). We propose an AIMD technique called Co-Authorship Graph (CAG), which can be used to collaboratively attribute different portions of documents to different authors belonging to the same community. Based on CAG, we propose a novel AIMD solution which (i) significantly outperforms the existing state-of-the-art solution; (ii) can effectively handle a larger number of co-authors; and (iii) is capable of handling the case when some of the listed co-authors have not contributed to the document as writers. We conducted an extensive experimental study to compare the proposed solution with the best existing AIMD method using real and synthetic datasets. We show that the proposed solution significantly outperforms the existing state-of-the-art method.
Purpose: The purpose of this paper is to analyze the scientific collaboration of institutions and its impact on institutional research performance in terms of productivity and quality. The researchers examined the local and international collaborations that have a great impact on institutional performance. Design/methodology/approach: A collaboration dependence measure was used to investigate the impact of an institution on external information. Based on this information, the authors used the "index of gain in impact through collaboration" to find the impact of collaborated publications on institutional research performance. Bibliographic data between 1996 and 2010 retrieved from Scopus were used to conduct the study. The authors carried out a case study of the top institutes of Pakistan in terms of publication count to elaborate the difference between high-performing institutions and those that gain disproportionately in terms of the perceived quality of their output because of local or international collaboration. Findings: The results showed that international collaboration by institutes in developing countries had a great impact on institutional performance, and that these institutes gained more benefit from it than from local collaboration. Altogether, scientific collaboration has a positive impact on institutional performance as measured by the cumulative source normalized impact per paper of their publications. The findings could also help researchers to find appropriate collaboration partners. Originality/value: This study has revealed some salient characteristics of collaboration in academic research. It becomes apparent that collaboration intensity is not uniform, but in general, the average quality of scientific production is the variable that most often correlates positively with the collaboration intensity of universities.
Traditional bibliometric techniques gauge research impact through citation-based quantitative indices. However, due to citation lag time, it may take years to assess the impact of an article. This paper seeks to measure the early impact of research articles using the tweet sentiments associated with them. We claim that papers cited in positive and neutral tweets have a higher impact than those not cited or cited in negative tweets. Accordingly, we use SentiStrength, and we improve it by incorporating new opinion-bearing words from the scientific domain into its sentiment lexicons. Then, we classify the sentiment of 6,482,260 tweets linked to 1,083,535 publications covered by Altmetric.com. Using positive and negative tweets as the independent variable and the citation count as the dependent variable, the linear regression analysis shows a weak positive prediction of high citation counts across 16 broad disciplines in Scopus. Introducing an additional indicator, i.e., "number of unique Twitter users," improves the adjusted R-squared value of the regression model in several disciplines. Overall, the encouraging positive correlation between tweet sentiments and citations shows that Twitter-based opinion may be exploited as a complementary indicator for predicting the early impact of literature.
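The regression step described above can be sketched as follows: fit an ordinary least squares line with one predictor, compute R-squared, and then compute the adjusted R-squared, 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The adjustment matters here because plain R-squared never decreases when an indicator is added, whereas adjusted R-squared rises only if the new indicator explains enough extra variance. This is a minimal sketch, not the paper's pipeline: the function names are hypothetical and the numbers are toy values, not data from the study.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


if __name__ == "__main__":
    # Toy values for illustration only: tweet counts (x) and citation counts (y).
    tweets = [1, 2, 3, 4, 5, 6]
    citations = [2, 3, 7, 6, 10, 11]
    slope, intercept = fit_simple_ols(tweets, citations)
    r2 = r_squared(tweets, citations, slope, intercept)
    print(slope, intercept, r2, adjusted_r_squared(r2, len(tweets), p=1))
```

Comparing the adjusted value for a one-predictor model against a model that also includes a second indicator, such as the number of unique Twitter users, would require a multiple regression fit, which this single-predictor sketch does not implement.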