The geometries are reported for interacting arginine-carboxyl pairs obtained from 37 high-resolution protein structures solved to a resolution of 2.0 Å or better. The closest interatomic distance between the guanidinium and carboxyl groups is less than 4.2 Å for 74 arginine and carboxyl groups, with the majority of these lying within hydrogen-bonding distance (2.6-3.0 Å). Interacting pairs have been transformed into a common orientation, and arginine-carboxyl and carboxyl-arginine geometries have been calculated. These geometries are defined in terms of the spherical polar angles θ and φ, and the angle P between the guanidinium and carboxyl planes. Results show a clear preference for the guanidinium and carboxyl groups to be approximately coplanar, and for the carboxyl oxygens to hydrogen bond with the guanidinium nitrogens. Single nitrogen-single oxygen is the most common type of interaction; however, twin nitrogen-twin oxygen interactions also occur frequently. The majority of these twin interactions occur between the carboxyl oxygens and the NH1 and NE atoms of the arginine, and are only rarely observed for NH1 and NH2. The information presented may be of use in the modelling of arginine-carboxyl interactions within proteins.
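The quantities named in this abstract (closest interatomic distance, the interplanar angle P, and a pair of spherical polar angles) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the abstract does not define the reference frame for the polar angles, so the sketch assumes θ is measured from the local z axis and φ from the x axis in the xy plane, and it assumes each plane is defined by three atoms. The function names and the synthetic coordinates are hypothetical.

```python
import math

CUTOFF = 4.2  # closest-contact cutoff in angstroms, as quoted in the abstract


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def closest_distance(group_a, group_b):
    """Smallest atom-atom distance between two groups of coordinates."""
    return min(norm(sub(a, b)) for a in group_a for b in group_b)


def plane_normal(p1, p2, p3):
    """Unit normal of the plane through three non-collinear points."""
    n = cross(sub(p2, p1), sub(p3, p1))
    length = norm(n)
    return tuple(x / length for x in n)


def interplanar_angle(plane_a, plane_b):
    """Angle P in degrees (0-90) between two planes, each given by three atoms."""
    c = abs(dot(plane_normal(*plane_a), plane_normal(*plane_b)))
    return math.degrees(math.acos(min(1.0, c)))


def spherical_angles(v):
    """(theta, phi) in degrees for a vector in the assumed local frame."""
    theta = math.degrees(math.acos(v[2] / norm(v)))
    phi = math.degrees(math.atan2(v[1], v[0]))
    return theta, phi


if __name__ == "__main__":
    # Synthetic coordinates, not taken from any protein structure.
    guanidinium = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    carboxyl = [(3.0, 0.0, 0.0), (4.0, 1.0, 0.0), (4.0, -1.0, 0.0)]
    if closest_distance(guanidinium, carboxyl) < CUTOFF:
        print("P =", interplanar_angle(guanidinium, carboxyl))
```

Folding P into the 0-90 degree range treats the two faces of each plane as equivalent, which is a design choice of this sketch; a signed or 0-180 degree convention would need a consistent atom ordering for the normals.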
Compound subsets, which may be screened where it is not feasible or desirable to screen all available compounds, may be designed using rational or random selection. Literature on the relative performance of random versus rational selection reports conflicting observations, possibly because some random subsets might be more representative than others and perform better than subsets designed by rational means, or vice versa. To address this possibility, we simulated a large number of rationally designed subsets for evaluation against an equally large number of randomly generated subsets. We found that our rationally designed subsets give higher mean hit rates than the random ones. We also compared subsets comprising random plates with subsets of random compounds and found that, while the mean hit rate of both is the same, the former show more variation in the hit rate. The choice of compound file, rational subset method, and ratio of the subset size to the compound file size are key factors in the relative performance of random and rational selection, and statistical simulation is a viable way to identify the selection approach appropriate for a subset.
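The comparison of random-plate and random-compound subsets can be illustrated with a small Monte Carlo sketch. It covers only that comparison; the rational selection methods are not described in this abstract and are not modelled. The compound file is synthetic, and the sketch assumes that actives are concentrated on a few plates (an assumption made here so that the effect is visible, not a statement from the source). All names and parameter values are hypothetical.

```python
import random
import statistics


def make_compound_file(n_plates=50, plate_size=40, enriched_plates=5,
                       p_enriched=0.5, p_background=0.02, rng=None):
    """Synthetic compound file: a list of plates, each a list of booleans.

    True marks an active compound. The first `enriched_plates` plates are
    given a higher active rate (an assumption of this sketch).
    """
    rng = rng or random.Random(0)
    plates = []
    for plate_index in range(n_plates):
        p = p_enriched if plate_index < enriched_plates else p_background
        plates.append([rng.random() < p for _ in range(plate_size)])
    return plates


def simulate_subsets(plates, n_subset_plates, n_trials, rng):
    """Hit rates of random-plate and random-compound subsets of equal size."""
    plate_size = len(plates[0])
    subset_size = n_subset_plates * plate_size
    all_compounds = [c for plate in plates for c in plate]
    plate_rates, compound_rates = [], []
    for _ in range(n_trials):
        # Subset built from whole plates chosen at random.
        chosen_plates = rng.sample(plates, n_subset_plates)
        plate_rates.append(sum(sum(p) for p in chosen_plates) / subset_size)
        # Subset of the same size built from individually chosen compounds.
        chosen = rng.sample(all_compounds, subset_size)
        compound_rates.append(sum(chosen) / subset_size)
    return plate_rates, compound_rates


if __name__ == "__main__":
    rng = random.Random(0)
    plates = make_compound_file(rng=rng)
    by_plate, by_compound = simulate_subsets(plates, 5, 1000, rng)
    for label, rates in (("plates", by_plate), ("compounds", by_compound)):
        print(label, statistics.mean(rates), statistics.stdev(rates))
```

Under the clustering assumption, both sampling schemes estimate the same overall hit rate, but whole-plate sampling inherits the plate-to-plate variation, so its hit rate spreads more widely across trials.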
SATIS (Simple Atom Type Information System) is a protocol for the definition and automatic assignment of atom types and the classification of atoms according to their covalent connectivity. Its distinctive feature is that no bond type information is involved. Rather, the classification of each atom is based on a connectivity code describing the atom and its covalent partners. It is particularly useful when handling coordinate-based molecular representations with no bond order information, such as the PDB format. We survey the occurrence of the various connectivity codes in the 20 common amino acid residues, in a sample of 304 different moieties from PDB protein-ligand complexes, and in a pseudo-random sample of 309 organic molecules from the CSD. We illustrate how connectivity codes can be grouped together to define atom types. We expect SATIS to be applicable to the derivation of atom types for statistical potentials, to the analysis of atomic interactions in structural databases, to studies of molecular similarity, and to the screening of virtual libraries in drug design.
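The idea of classifying an atom by itself and its covalent partners, without bond orders, can be sketched as follows. The string encoding used here (element symbol followed by the sorted partner elements) is a hypothetical stand-in chosen for readability; the abstract does not give the actual SATIS code format, and the type table is likewise invented for illustration.

```python
from collections import defaultdict


def connectivity_codes(elements, bonds):
    """Return one code per atom built from its element and its partners.

    `elements` is a list of element symbols; `bonds` is a list of index
    pairs with no bond-order information. The code format is illustrative,
    not the SATIS encoding itself.
    """
    partners = defaultdict(list)
    for i, j in bonds:
        partners[i].append(elements[j])
        partners[j].append(elements[i])
    return [f"{element}({','.join(sorted(partners[index]))})"
            for index, element in enumerate(elements)]


def assign_types(codes, type_table):
    """Group codes into atom types using a lookup table (hypothetical)."""
    return [type_table.get(code, "unclassified") for code in codes]


if __name__ == "__main__":
    # Heavy atoms of a carboxyl fragment: CA, C, O1, O2.
    elements = ["C", "C", "O", "O"]
    bonds = [(0, 1), (1, 2), (1, 3)]
    codes = connectivity_codes(elements, bonds)
    type_table = {"O(C)": "oxygen bonded to one carbon",
                  "C(C,O,O)": "carboxyl-like carbon"}
    for code, atom_type in zip(codes, assign_types(codes, type_table)):
        print(code, "->", atom_type)
```

Because no bond order enters the code, both carboxyl oxygens receive the same code in this sketch, which is the behaviour one would want when working from coordinate files that do not record bond types.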