Effective interresidue contact energies for proteins in solution are estimated from the numbers of residue-residue contacts observed in crystal structures of globular proteins by means of the quasi-chemical approximation with an approximate treatment of the effects of chain connectivity. In the lattice model employed, each residue of a protein is assumed to occupy a site in a lattice, and vacant sites are regarded as occupied by effective solvent molecules whose size is equal to the average size of a residue. A basic assumption is that the average characteristics of residue-residue contacts formed in a large number of protein crystal structures reflect actual differences of interactions among residues, as if there were no significant contribution from the specific amino acid sequence in each protein or from intraresidue and short-range interactions. Then, taking account of the effects of chain connectivity only as imposing a limit on the size of the system, i.e., the number of lattice sites or the number of effective solvent molecules in the system, the system is regarded as a mixture of unconnected residues and effective solvent molecules. The quasi-chemical approximation, that contact pair formation resembles a chemical reaction, is applied to this system to obtain formulas that relate the statistical averages of the numbers of contacts to the contact energies. The number of effective solvent molecules for each protein is chosen to yield the total number of residue-residue contacts equal to its expected value for the hypothetical case of hard sphere interactions among residues and effective solvent molecules; the expected number of residue-residue contacts at this condition has been crudely estimated by means of a freely jointed chain distribution and an expansion originating in hard sphere interactions.
Each residue is represented by the center of its side chain atom positions, and contacts among residues and effective solvent molecules are defined to be those pairs within 6.5 Å, a distance that has been chosen on the basis of the observed radial distribution of residues; nearest-neighbor pairs along a chain are explicitly excluded in counting contacts. Coordination numbers, for each type of residue as well as for solvent molecules, are estimated from the mean volume of each type of residue and used to evaluate the numbers of residue-solvent and solvent-solvent contacts from the numbers of residue-residue contacts. The estimated values of contact energies have reasonable residue-type dependences, reflecting residue distributions in protein crystals; nonpolar-residue-in and polar-residue-out are seen as well as the segregation of those residue groups. In addition, there is a linear relationship between the average contact energies for nonpolar residues and their hydrophobicities reported by Nozaki and Tanford; however, the magnitudes on average are about twice as large. The relevance of the results to protein folding and other applications is discussed.
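The contact definition in the abstract above (side-chain centers within 6.5 Å, nearest neighbors along the chain excluded) can be illustrated with a minimal sketch. It assumes a single chain whose residues are listed in sequence order, so that "nearest neighbor along the chain" means adjacent indices; the function name `count_contacts`, the input layout, and the toy coordinates are hypothetical and are not taken from the paper. It covers only the residue-residue counting step, not the solvent contacts or the energy estimation.

```python
import numpy as np

CONTACT_CUTOFF = 6.5  # angstroms, the contact distance stated in the abstract


def count_contacts(centers, residue_types, cutoff=CONTACT_CUTOFF):
    """Count residue-residue contacts per unordered pair of residue types.

    centers       : (N, 3) array of side-chain center coordinates, in chain order
                    (assumption: one chain, no gaps in the sequence)
    residue_types : list of N residue type labels, e.g. "LEU"
    """
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    # Pairwise Euclidean distances between all side-chain centers.
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    counts = {}
    for i in range(n):
        # Starting at i + 2 skips the nearest neighbor along the chain (i, i + 1).
        for j in range(i + 2, n):
            if dist[i, j] <= cutoff:
                key = tuple(sorted((residue_types[i], residue_types[j])))
                counts[key] = counts.get(key, 0) + 1
    return counts


# Toy example: four residues on the corners of a 3.8 Å square plus one far away.
toy_centers = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
    (30.0, 0.0, 0.0),
]
toy_types = ["LEU", "ALA", "LEU", "LYS", "GLU"]
toy_counts = count_contacts(toy_centers, toy_types)
```

In the toy example, pairs (0, 1), (1, 2), (2, 3), and (3, 4) are skipped as chain neighbors, so only the non-adjacent pairs that fall within the cutoff are counted. Summing such counts over many structures would give the observed contact numbers that the quasi-chemical treatment takes as input.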
The frequency of amino acid substitutions, relative to the frequency expected by chance, decreases linearly with the increase in physico-chemical differences between amino acid pairs involved in a substitution. This correlation does not apply to abnormal human hemoglobins. Since abnormal hemoglobins mostly reflect the process of mutation rather than selection, the correlation manifest during protein evolution between substitution frequency and physico-chemical difference in amino acids can be attributed to natural selection. Outside of 'abnormal' proteins, the correlation also does not apply to certain regions of proteins characterized by rapid rates of substitution. In these cases again, except for the largest physico-chemical differences between amino acid pairs, the substitution frequencies seem to be independent of the physico-chemical parameters. The elimination of the substitutions involving the largest physico-chemical differences can once more be attributed to natural selection. For smaller physico-chemical differences, natural selection, if it is operating in the polypeptide regions, must be based on parameters other than those examined.