In this study, we examined empirical results on the h index and its most important variants in order to determine whether the variants provide an incremental contribution for evaluation purposes. The results of a factor analysis using bibliographic data on postdoctoral researchers in biomedicine indicate that the h index and its variants represent two types of indices, each loading on a separate factor. One type describes the most productive core of a scientist's output and gives the number of papers in that core. The other type describes the impact of the papers in that core. Because an index is a useful yardstick for comparing scientists only if it corresponds strongly with peer assessments, we performed a logistic regression analysis with the two factors resulting from the factor analysis as independent variables and peer assessments of the postdoctoral researchers as the dependent variable. The results of the regression show that peer assessments are predicted better by the factor 'impact of the productive core' than by the factor 'quantity of the productive core.'
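The two-step design described here, factor-analyzing the index variants and then regressing peer assessment on the factor scores, can be illustrated with a minimal Python sketch. All data below are synthetic, and the sample size, number of variants, and factor labels are illustrative assumptions, not the study's actual variables.

```python
# Minimal, illustrative sketch (synthetic data): factor-analyze h-index variants
# into two latent factors, then predict a binary peer assessment from the factors.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400  # hypothetical number of researchers

# Synthetic "variants": six indicators built from two latent traits
# (quantity of the productive core, impact of the productive core)
quantity = rng.normal(size=(n, 1))
impact = rng.normal(size=(n, 1))
load_q = np.array([1.0, 0.9, 0.8, 0.2, 0.1, 0.0])
load_i = np.array([0.0, 0.1, 0.2, 0.8, 0.9, 1.0])
variants = quantity * load_q + impact * load_i + 0.3 * rng.normal(size=(n, 6))

# Step 1: factor analysis recovers two factors from the variant scores
factors = FactorAnalysis(n_components=2, random_state=0).fit_transform(variants)

# Step 2: logistic regression of a (synthetic) binary peer assessment on the factors
peer = (1.2 * impact.ravel() + 0.2 * quantity.ravel() + rng.normal(size=n) > 0).astype(int)
coefs = LogisticRegression().fit(factors, peer).coef_[0]
print(dict(zip(["factor_1", "factor_2"], coefs.round(2))))
```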
With the ready accessibility of bibliometric data and of ready-to-use tools for generating bibliometric indicators for evaluation purposes, there is a danger of inappropriate use. Here we present standards of good practice for analyzing bibliometric data and for presenting and interpreting the results. Comparisons of research performance between research groups are valid only if (1) the scientific impact of the research groups or their publications is examined using box plots, Lorenz curves, and Gini coefficients to represent the distributional characteristics of the data (in other words, going beyond the usual arithmetic mean), (2) different reference standards are used to assess the impact of the research groups, and the appropriateness of these reference standards is critically examined, and (3) statistical analyses comparing citation counts take into consideration that citations are a function of many influencing factors besides scientific quality.
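For point (1), the distributional summaries mentioned (Lorenz curves and Gini coefficients) are straightforward to compute. The sketch below uses made-up citation counts and standard textbook formulas; it is not drawn from any tool endorsed by the authors.

```python
# Illustrative sketch (made-up citation counts): distribution-aware summaries
# of a group's citation impact, beyond the arithmetic mean.
import numpy as np

def gini(citations):
    """Gini coefficient of a non-negative citation distribution."""
    x = np.sort(np.asarray(citations, dtype=float))
    n = x.size
    total = x.sum()
    if total == 0:
        return 0.0
    cum = np.cumsum(x)
    return (n + 1 - 2 * (cum / total).sum()) / n

def lorenz_points(citations):
    """Cumulative share of papers (x) vs. cumulative share of citations (y)."""
    x = np.sort(np.asarray(citations, dtype=float))
    cum = np.insert(np.cumsum(x), 0, 0.0)
    return np.linspace(0, 1, x.size + 1), cum / cum[-1]

group = [0, 0, 1, 2, 3, 5, 8, 13, 40, 120]  # hypothetical citation counts of one group
paper_shares, citation_shares = lorenz_points(group)
print(f"mean = {np.mean(group):.1f}, median = {np.median(group):.1f}, Gini = {gini(group):.2f}")
```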
We propose newly developed citation impact indicators based not on arithmetic averages of citations but on percentile ranks. Citation distributions are, as a rule, highly skewed and should not be arithmetically averaged. With percentile ranks, each paper is rated in terms of its percentile in the citation distribution. The percentile-rank approach allows for the formulation of a more abstract indicator scheme that can be used to organize and/or schematize different impact indicators according to three degrees of freedom: the selection of the reference sets, the evaluation criteria, and the choice of whether or not to define the publication sets as independent. Bibliometric data of seven principal investigators (PIs) of the Academic Medical Center of the University of Amsterdam are used as an exemplary data set. We demonstrate that the proposed indicators [R(6), R(100), R(6,k), R(100,k)] are an improvement over average-based indicators because they account for the shape of the distribution of citations over papers.
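The percentile-rank idea can be sketched as follows: each paper's citation count is converted into its percentile within a reference set, and an indicator aggregates those percentiles, for example over six rank classes. The reference set, class boundaries, and scoring below are illustrative assumptions (an NSF-style six-class scheme); the exact definitions of R(6), R(100), R(6,k), and R(100,k) are those given in the paper, not this sketch.

```python
# Illustrative sketch of percentile-rank-based impact scoring (not the exact
# definition of R(6)/R(100); reference set and class boundaries are made up).
def percentile_rank(c, reference):
    """Percentage of papers in the reference set with fewer citations than c."""
    return 100.0 * sum(x < c for x in reference) / len(reference)

def six_class_score(prank):
    """Map a percentile rank to one of six classes (bottom 50% .. top 1%), scored 1-6."""
    bounds = [50, 75, 90, 95, 99]
    return 1 + sum(prank >= b for b in bounds)

reference = [0, 1, 1, 2, 3, 4, 6, 9, 15, 33]   # hypothetical reference set
pi_papers = [2, 6, 33]                          # hypothetical PI's citation counts
scores = [six_class_score(percentile_rank(c, reference)) for c in pi_papers]
print(scores, sum(scores) / len(scores))        # per-paper classes and their mean
```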
Background: This paper presents the first meta-analysis of the inter-rater reliability (IRR) of journal peer review. IRR is defined as the extent to which two or more independent reviews of the same scientific document agree. Methodology/Principal Findings: Altogether, 70 reliability coefficients (Cohen's kappa, intra-class correlation [ICC], and Pearson product-moment correlation [r]) from 48 studies were included in the meta-analysis. The studies were based on a total of 19,443 manuscripts; on average, each study had a sample size of 311 manuscripts (minimum: 28, maximum: 1,983). The results of the meta-analysis confirmed the findings of the narrative literature reviews published to date: the level of IRR (mean ICC/r² = .34, mean Cohen's kappa = .17) was low. To explain the study-to-study variation in the IRR coefficients, meta-regression analyses were calculated using seven covariates. Two covariates emerged as statistically significant in the meta-regression analyses conducted to obtain approximate homogeneity of the intra-class correlations: first, the more manuscripts a study is based on, the smaller the reported IRR coefficients; second, if a study reported information on the rating system used by reviewers, this was associated with a smaller IRR coefficient than if this information was not given. Conclusions/Significance: Studies that report a high level of IRR are to be considered less credible than those with a low level of IRR. According to our meta-analysis, the IRR of peer assessments is quite limited and needs improvement (e.g., a reader system).
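As a concrete illustration of what one of the pooled IRR coefficients measures, the sketch below computes Cohen's kappa for two reviewers' accept/reject decisions on the same hypothetical set of manuscripts; the data are invented and the resulting value is not taken from the meta-analysis.

```python
# Illustrative only: Cohen's kappa for two reviewers judging the same manuscripts.
from sklearn.metrics import cohen_kappa_score

reviewer_a = ["accept", "reject", "accept", "accept", "reject", "reject", "accept", "reject"]
reviewer_b = ["accept", "accept", "accept", "reject", "reject", "accept", "accept", "reject"]

# Chance-corrected agreement between the two independent reviews
print(f"Cohen's kappa = {cohen_kappa_score(reviewer_a, reviewer_b):.2f}")
```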
Peer review is valued in higher education but is also widely criticized for potential biases, particularly with respect to gender. We evaluate gender differences in peer reviews of grant applications, extending Bornmann, Mutz, and Daniel's meta-analyses, which reported small gender differences in favor of men (d = .04) but substantial heterogeneity in effect sizes that compromised the robustness of their results. We contrast these findings with the most comprehensive single primary study (Marsh, Jayasinghe, and Bond), which found no gender differences for grant proposals. We juxtapose traditional (fixed- and random-effects) and multilevel models, demonstrating important advantages of the multilevel approach. Consistent with Marsh et al.'s primary study, there were no gender differences for the 40 (of 66) effect sizes from Bornmann et al. that were based on grant proposals. This lack of a gender effect for grant proposals was very robust, generalizing over country, discipline, and publication year.
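For readers unfamiliar with the traditional random-effects pooling that the multilevel models are contrasted with, the sketch below combines hypothetical gender-difference effect sizes (d) with a DerSimonian-Laird estimator; all numbers are invented, and the multilevel extension itself is not shown.

```python
# Illustrative only: DerSimonian-Laird random-effects pooling of effect sizes (d).
import numpy as np

d = np.array([0.05, -0.02, 0.10, 0.00, 0.03])      # hypothetical effect sizes
v = np.array([0.004, 0.010, 0.006, 0.003, 0.008])   # hypothetical sampling variances

w = 1 / v                                            # fixed-effect weights
d_fixed = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_fixed) ** 2)                   # heterogeneity statistic
tau2 = max(0.0, (Q - (len(d) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

w_star = 1 / (v + tau2)                              # random-effects weights
d_random = np.sum(w_star * d) / np.sum(w_star)
se = np.sqrt(1 / np.sum(w_star))
print(f"pooled d = {d_random:.3f} (SE = {se:.3f}), tau^2 = {tau2:.4f}")
```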