The structure of a protein triple helix has been determined at 1.9 angstrom resolution by x-ray crystallographic studies of a collagen-like peptide containing a single substitution of the consensus sequence. This peptide adopts a triple-helical structure that confirms the basic features determined from fiber diffraction studies on collagen: supercoiling of polyproline II helices and interchain hydrogen bonding that follows the model II of Rich and Crick. In addition, the structure provides new information concerning the nature of this protein fold. Each triple helix is surrounded by a cylinder of hydration, with an extensive hydrogen bonding network between water molecules and peptide acceptor groups. Hydroxyproline residues have a critical role in this water network. The interaxial spacing of triple helices in the crystal is similar to that in collagen fibrils, and the water networks linking adjacent triple helices in the crystal structure are likely to be present in connective tissues. The breaking of the repeating (X-Y-Gly)n pattern by a Gly→Ala substitution results in a subtle alteration of the conformation, with a local untwisting of the triple helix. At the substitution site, direct interchain hydrogen bonds are replaced with interstitial water bridges between the peptide groups. Similar conformational changes may occur in Gly→X mutated collagens responsible for the diseases osteogenesis imperfecta, chondrodysplasias, and Ehlers-Danlos syndrome IV.
The roles of hydroxyproline and hydration are strongly interrelated in the structure of the collagen triple helix. The specific, repetitive water bridges observed in this structure buttress the triple-helical conformation. The extensively ordered hydration structure offers a good model for the interpretation of the experimental results on collagen stability and assembly.
Determination of the tendencies of amino acids to form alpha-helical and beta-sheet structures has been important in clarifying stabilizing interactions, protein design, and the protein folding problem. In this study, we have determined for the first time a complete scale of amino acid propensities for another important protein motif: the collagen triple-helix conformation with its Gly-X-Y repeating sequence. Guest triplets of the form Gly-X-Hyp and Gly-Pro-Y are used to quantitate the conformational propensities of all 20 amino acids for the X and Y positions in the context of a (Gly-Pro-Hyp)8 host peptide. The rankings for both the X and Y positions show the highly stabilizing nature of imino acids and the destabilizing effects of Gly and aromatic residues. Many residues show differing propensities in the X versus Y position, related to the nonequivalence of these positions in terms of interchain interactions and solvent exposure. The propensity of amino acids to adopt a polyproline II-like conformation plays a role in their triple-helix rankings, as shown by a moderate correlation of triple-helix propensity with frequency of occurrence in polyproline II-like regions. The high propensity of ionizable residues in the X position suggests the importance of interchain hydrogen bonding directly or through water to backbone carbonyls or hydroxyprolines. The low propensity of side chains with branching at the Cδ in the Y position supports models suggesting these groups block solvent access to backbone C=O groups. These data provide a first step in defining sequence-dependent variations in local triple-helix stability and binding, and are important for a general understanding of side chain interactions in all proteins.
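The host-guest design described above can be sketched in a few lines of Python. This is only an illustration: the abstract does not say where the guest triplet sits within the host, so `guest_index` is a hypothetical parameter, and the sketch assumes the guest replaces one of the eight host triplets. The helper names are likewise invented for this example.

```python
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (Gly, X, Y)

HOST_TRIPLET: Triplet = ("Gly", "Pro", "Hyp")


def x_guest(residue: str) -> Triplet:
    """Guest triplet Gly-X-Hyp, probing the X position."""
    return ("Gly", residue, "Hyp")


def y_guest(residue: str) -> Triplet:
    """Guest triplet Gly-Pro-Y, probing the Y position."""
    return ("Gly", "Pro", residue)


def host_guest_peptide(guest: Triplet,
                       n_triplets: int = 8,
                       guest_index: int = 3) -> List[Triplet]:
    """Host of Gly-Pro-Hyp triplets with one guest triplet swapped in.

    guest_index is an assumption of this sketch, not taken from the source.
    """
    if guest[0] != "Gly":
        raise ValueError("guest triplet must start with Gly")
    if not 0 <= guest_index < n_triplets:
        raise ValueError("guest_index out of range")
    peptide = [HOST_TRIPLET] * n_triplets
    peptide[guest_index] = guest
    return peptide


def flatten(peptide: List[Triplet]) -> List[str]:
    """Turn a list of triplets into a flat residue list."""
    return [residue for triplet in peptide for residue in triplet]


def has_gly_xy_repeat(residues: List[str]) -> bool:
    """True if the sequence is whole triplets with Gly as every third residue."""
    return len(residues) % 3 == 0 and all(r == "Gly" for r in residues[0::3])


ala_in_x = host_guest_peptide(x_guest("Ala"))
arg_in_y = host_guest_peptide(y_guest("Arg"))
```

Both example peptides keep Gly at every third position, so `has_gly_xy_repeat(flatten(ala_in_x))` holds; replacing any of those Gly residues would make the check fail.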
An algorithm was derived to relate the amino acid sequence of a collagen triple helix to its thermal stability. This calculation is based on the triple helical stabilization propensities of individual residues and their intermolecular and intramolecular interactions, as quantitated by melting temperature values of host-guest peptides. Experimental melting temperature values of a number of triple helical peptides of varying length and sequence were successfully predicted by this algorithm. However, predicted Tm values are significantly higher than experimental values when there are strings of oppositely charged residues or concentrations of like charges near the terminus. Application of the algorithm to collagen sequences highlights regions of unusually high or low stability, and these regions often correlate with biologically significant features. The prediction of stability from sequence indicates an understanding of the major forces maintaining this protein motif. The use of highly favorable KGE and KGD sequences is seen to complement the stabilizing effects of imino acids in modulating stability and may become dominant in the collagenous domains of bacterial proteins that lack hydroxyproline. The effect of single amino acid mutations in the X and Y positions can be evaluated with this algorithm. An interactive collagen stability calculator based on this algorithm is available online.
The ability to predict structure and stability from amino acid sequence is an important step in the understanding of basic protein principles and the structural consequences of pathological mutations. The vast number of amino acid sequences available from DNA data contrasts with the smaller number of high resolution protein structures and the limited experimental data on protein stability. The ability to make predictions that are in good agreement with experimental data provides insight into the stabilizing interactions within proteins.
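As a rough illustration of how a sequence-to-stability calculation of this kind could be organized, the sketch below uses a simple additive model: start from a host melting temperature, subtract a per-residue cost for each X and Y residue, and add a bonus for favorable Y-Gly-X pairs such as the Lys-Gly-Glu (KGE) sequence mentioned above. This is an assumed simplification, not the published algorithm, and every number in it is a made-up placeholder rather than an experimental value. The function and table names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triplet = Tuple[str, str, str]  # (Gly, X, Y)


def predict_tm(triplets: List[Triplet],
               host_tm: float,
               x_cost: Dict[str, float],
               y_cost: Dict[str, float],
               yx_bonus: Optional[Dict[Tuple[str, str], float]] = None) -> float:
    """Additive toy model of triple-helix melting temperature.

    x_cost / y_cost: destabilization (deg C) of each residue in X or Y.
    yx_bonus: stabilization for a Y residue followed, after Gly, by a given
    X residue in the next triplet (a Y-Gly-X pair).
    Raises KeyError for residues missing from the cost tables.
    """
    tm = host_tm
    for i, (gly, x, y) in enumerate(triplets):
        if gly != "Gly":
            raise ValueError(f"triplet {i} does not start with Gly")
        tm -= x_cost[x] + y_cost[y]
        if yx_bonus is not None and i + 1 < len(triplets):
            next_x = triplets[i + 1][1]
            tm += yx_bonus.get((y, next_x), 0.0)
    return tm


# Placeholder tables: illustrative numbers only, NOT measured values.
HOST_TM = 50.0
X_COST = {"Pro": 0.0, "Ala": 5.0, "Glu": 3.0}
Y_COST = {"Hyp": 0.0, "Lys": 4.0}
YX_BONUS = {("Lys", "Glu"): 6.0}

wild_type = [("Gly", "Pro", "Hyp"), ("Gly", "Pro", "Lys"), ("Gly", "Glu", "Hyp")]
mutant = [("Gly", "Pro", "Hyp"), ("Gly", "Pro", "Lys"), ("Gly", "Ala", "Hyp")]

tm_wild = predict_tm(wild_type, HOST_TM, X_COST, Y_COST, YX_BONUS)
tm_mutant = predict_tm(mutant, HOST_TM, X_COST, Y_COST, YX_BONUS)
delta_tm = tm_mutant - tm_wild  # effect of the single X-position substitution
```

Comparing `tm_mutant` with `tm_wild` mirrors the use described above for evaluating single substitutions in the X and Y positions. Here the Glu to Ala change both raises the residue cost and removes the Lys-Gly-Glu pair bonus.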
In addition, there is much interest in computing the effect of single amino acid replacements on protein stability because destabilizing effects are associated with deleterious mutations that result in clinically detectable phenotypes (1-3). In contrast to globular proteins, the relation among sequence, structure, and stability is simpler and better defined for the linear collagen triple helix.
The collagen triple helix motif is found widely in structural proteins of the extracellular matrix and in an increasing set of non-collagenous proteins, many of which are involved in host-defense functions (4, 5). The close packing of three supercoiled polyproline II-like polypeptide chains in the collagen triple helix generates a requirement for Gly as every third residue (6-8). The observation of such a repeating (Gly-X-Y)n sequence pattern over a stretch of residues signifies a triple helix conformation. However, the collagen triple helix is not uniform in structure or stability. Crystal structures of collagen peptides show that variation in amino acid content leads to small but significant variations i...