The quantity R sym (also called R merge) is almost universally used for describing X-ray diffraction data quality. Here, we prove that R sym is seriously flawed, because it has an implicit dependence on the redundancy of the data. A corrected R-factor, R meas, is introduced as the equivalent robust indicator of data consistency. In addition, we introduce R mrgd, an R-factor that reflects the gain in accuracy upon averaging of equivalent reflections, as a useful indicator of the quality of reduced data. These new data quality indicators better reveal the benefits of highly redundant data and should stimulate improvements in data quality through increased merging of data from multiple crystals.

R sym (sometimes called R merge) is the most widespread statistic used to indicate data quality for macromolecular crystallography1,2, and with the advent of area detectors for small molecule crystallography it is a standard quality indicator for that field as well3. It is defined as:

$$R_{\mathrm{sym}} = \frac{\sum_{hkl}\sum_{i}\left|I_i(hkl) - \overline{I(hkl)}\right|}{\sum_{hkl}\sum_{i} I_i(hkl)}$$

where I_i(hkl) is the i-th observation of reflection hkl and \overline{I(hkl)} is the mean of all observations of that reflection.

Arndt4 introduced R sym as a reliability indicator for data collected by precession photography, where R sym was specifically summed over symmetry-related intensities on the same film, and R sca, calculated in an analogous fashion, reported the agreement of identical reflections measured on different films. As oscillation photography was introduced, so that symmetry-related reflections were not commonly on the same film, it appears that the original R sym and R sca were combined into the present-day R sym, which is summed over all observed equivalent reflections. R sym is commonly used to guide decisions during data reduction, such as determining to what resolution data are reliable, and whether two crystals are isomorphous, so that their data should be merged together. A single R sym value is generally reported in publications to summarize the data quality.
Overall R sym values of <5%, 5-10%, 10-20% and >20% are taken to indicate good, usable, marginal, and questionable quality data respectively2.

Here, we present empirical and mathematical analyses proving that R sym is an inherently unreliable indicator of data quality. We also present alternate indicators that provide more robust measures of the quality of the individual measurements as well as of the final reduced data set. We expect that the application of the ideas described here will result in improved primary data quality, and ultimately in more accurate macromolecular structures.
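The redundancy dependence of R sym can be illustrated numerically. The following is a minimal sketch, not the authors' procedure. It assumes the standard definition of R sym (sum of absolute deviations from the mean of equivalent observations, divided by the sum of the observations), assumes that R meas differs only by a factor sqrt(n/(n-1)) applied to each reflection with n observations, and uses simulated intensities with Gaussian errors. The function names `r_factors` and `simulate` and all numerical parameters are hypothetical choices made for illustration.

```python
import math
import random


def r_factors(groups):
    """Return (R_sym, R_meas) for groups of equivalent intensity observations.

    Assumed definitions (illustrative):
      R_sym  = sum_hkl sum_i |I_i - mean| / sum_hkl sum_i I_i
      R_meas = sum_hkl sqrt(n/(n-1)) sum_i |I_i - mean| / sum_hkl sum_i I_i
    """
    num_sym = 0.0
    num_meas = 0.0
    denom = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            # Assumption: singly measured reflections are skipped entirely,
            # since they carry no information about consistency.
            continue
        mean = sum(obs) / n
        abs_dev = sum(abs(i - mean) for i in obs)
        num_sym += abs_dev
        num_meas += math.sqrt(n / (n - 1)) * abs_dev
        denom += sum(obs)
    if denom == 0:
        raise ValueError("no reflection with at least two observations")
    return num_sym / denom, num_meas / denom


def simulate(n, n_refl=5000, true_i=100.0, sigma=10.0, seed=0):
    """Simulate n_refl reflections, each measured n times with Gaussian error.

    Every reflection has the same true intensity and the same error, so the
    underlying measurement quality is identical for every redundancy n.
    """
    rng = random.Random(seed)
    groups = [[rng.gauss(true_i, sigma) for _ in range(n)]
              for _ in range(n_refl)]
    return r_factors(groups)


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        r_sym, r_meas = simulate(n)
        print(f"redundancy {n:2d}: R_sym = {r_sym:.4f}  R_meas = {r_meas:.4f}")
```

Under this noise model the expected absolute deviation of an observation from the mean of n observations scales as sqrt((n-1)/n), so the simulated R sym rises with redundancy even though the quality of the individual measurements is unchanged, while the rescaled statistic stays approximately constant.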
Experimental data

The analyses presented here hold true for diffraction data measured with various detector/software combinations, but for simplicity, we present analyses based on three sets of data collected from crystals of the enzyme urease with Cys 319 from the α-chain mutated to Ala5. These crystals are isomorphous with wild-type urease and grow in space group I2₁3 with a = 170.8 Å6,7. Independent 2 Å resolution data sets were collected from each of three crystals which had been soaked at pH values of 6.5, 7.5 and 8.5; these are designated Ure_1, Ure_2, and Ure_3 respectively. Difference Four...