Selection of a representative set of structures from Brookhaven Protein Data Bank

Boberg, Jorma; Salakoski, Tapio; Vihinen, Mauno

doi:10.1002/prot.340140212

Cited by 50 publications

(33 citation statements)

References 21 publications


“…The criteria used for selection are: (1) a resolution of 2.5 Å or better, (2) an R-factor of less than 25%, (3) a monomeric or homooligomeric structure, (4) the exclusion of prosthetic groups, and (5) good geometry defined by ω, the dihedral angle made by the peptide bond, less than ±15° from ideality. Effort was made to include examples from many classes of protein structures (Boberg et al, 1992).…”

Section: Database Criteria

mentioning

confidence: 99%
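The numeric criteria in the statement quoted above (resolution, R-factor, and ω planarity) can be pictured as a simple filter. The following is a minimal sketch, not the citing authors' code: the `Entry` record and its field names are hypothetical, criteria (3) and (4) are not modelled, and treating both cis (0°) and trans (±180°) as "ideal" ω values is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    # Hypothetical record; field names are illustrative, not from the paper.
    pdb_id: str
    resolution: float          # in angstroms
    r_factor: float            # as a fraction, e.g. 0.19 for 19%
    omega_angles: List[float]  # peptide-bond dihedrals in degrees


def omega_deviation(omega: float) -> float:
    """Deviation in degrees from the nearest planar value (0 or +/-180).

    Assumption: both cis and trans planar peptide bonds count as ideal.
    """
    folded = abs(((omega + 180.0) % 360.0) - 180.0)  # fold into [0, 180]
    return min(folded, 180.0 - folded)


def passes_criteria(entry: Entry,
                    max_resolution: float = 2.5,
                    max_r_factor: float = 0.25,
                    max_omega_dev: float = 15.0) -> bool:
    """Apply criteria (1), (2) and (5) from the quoted statement."""
    return (
        entry.resolution <= max_resolution
        and entry.r_factor < max_r_factor
        and all(omega_deviation(w) < max_omega_dev for w in entry.omega_angles)
    )
```

Criteria (3) and (4) depend on annotation of the biological unit and of bound groups, so they would need additional fields that are not sketched here.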

Verification of protein structures: Patterns of nonbonded atomic interactions

1993


A novel method for differentiating between correctly and incorrectly determined regions of protein structures based on characteristic atomic interactions is described. Different types of atoms are distributed nonrandomly with respect to each other in proteins. Errors in model building lead to more randomized distributions of the different atom types, which can be distinguished from correct distributions by statistical methods. Atoms are classified in one of three categories: carbon (C), nitrogen (N), and oxygen (O). This leads to six different combinations of pairwise noncovalently bonded interactions (CC, CN, CO, NN, NO, and OO). A quadratic error function is used to characterize the set of pairwise interactions from nine-residue sliding windows in a database of 96 reliable protein structures. Regions of candidate protein structures that are mistraced or misregistered can then be identified by analysis of the pattern of nonbonded interactions from each window.
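As a rough illustration of the six pair categories described in this abstract, the sketch below tallies C/N/O contacts per nine-residue window. It is not the published method: the 3.5 Å cutoff, the exclusion of pairs within the same residue, counting only pairs inside the window, and the input tuple layout are all assumptions, and the quadratic error function itself is not reproduced because the abstract does not specify it.

```python
import math
from collections import Counter
from itertools import combinations

PAIR_TYPES = ("CC", "CN", "CO", "NN", "NO", "OO")
ELEMENTS = {"C", "N", "O"}


def pair_fractions(atoms, cutoff=3.5):
    """Fraction of each pair type among close contacts.

    atoms: list of (element, residue_index, (x, y, z)) tuples (hypothetical layout).
    Assumption: a "nonbonded interaction" is any C/N/O pair from different
    residues within `cutoff` angstroms.
    """
    counts = Counter()
    for (e1, r1, p1), (e2, r2, p2) in combinations(atoms, 2):
        if r1 == r2 or e1 not in ELEMENTS or e2 not in ELEMENTS:
            continue
        if math.dist(p1, p2) <= cutoff:
            counts["".join(sorted((e1, e2)))] += 1  # e.g. ("O", "C") -> "CO"
    total = sum(counts.values())
    return {k: (counts[k] / total if total else 0.0) for k in PAIR_TYPES}


def sliding_windows(atoms, size=9):
    """Yield (first_residue, fractions) for each window of `size` residues."""
    residues = sorted({r for _, r, _ in atoms})
    for i in range(len(residues) - size + 1):
        window = set(residues[i:i + size])
        yield residues[i], pair_fractions([a for a in atoms if a[1] in window])
```

A verification score in the spirit of the abstract would then compare each window's six fractions against statistics from a reference set of reliable structures.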


“…The entries were assigned to 3 types of dominant secondary structure according to the rules given by Equation 10: α-helix (α), β-strand (β), and αβ. The corresponding assignments for the entries of the unbiased set RS2 (Boberg et al, 1992) obtained with the DSSP package and the rules of Equation 9 are shown also in Table 3. The application of the rules of Boberg et al (1992) gives almost the same assignment as is proposed for the unbiased set.…”

Section: Secondary Structure Classes Enzymes and Proteins Without Enz

mentioning

confidence: 99%

“…The corresponding assignments for the entries of the unbiased set RS2 (Boberg et al, 1992) obtained with the DSSP package and the rules of Equation 9 are shown also in Table 3. The application of the rules of Boberg et al (1992) gives almost the same assignment as is proposed for the unbiased set. This result shows that our algorithm for the calculation of the percentages of secondary structure is sufficiently accurate for the purpose of structural classification.…”

Section: Secondary Structure Classes Enzymes and Proteins Without Enz

mentioning

confidence: 99%

“…Thus, it is a serious problem to construct a representative set of proteins with minimal sequence homology. Boberg et al (1992) have proposed an unbiased representative set of 103 proteins obtained by sequence alignment of the PDB structures with the GCG program GAP (Devereux et al, 1984), statistical estimation of the significance of sequence similarity (Lipman et al, 1985), and an original clustering algorithm. Eighty-four proteins of 103 from the unbiased set proposed by Boberg et al (1992) coincide with those selected in RS1; the other 19 entries, characterized by a resolution of ≥3 Å or by incomplete atomic coordinates, were not appropriate for this study.…”

mentioning

confidence: 99%


Optimization of the electrostatic interactions in proteins of different functional and folding type

1994


The 3-dimensional optimization of the electrostatic interactions between the charged amino acid residues was studied by Monte Carlo simulations on an extended representative set of 141 protein structures with known atomic coordinates. The proteins were classified by different functional and structural criteria, and the optimization of the electrostatic interactions was analyzed. The optimization parameters were obtained by comparison of the contribution of charge-charge interactions to the free energy of the native protein structures and for a large number of randomly distributed charge constellations obtained by the Monte Carlo technique. On the basis of the results obtained, one can conclude that the charge-charge interactions are better optimized in the enzymes than in the proteins without enzymatic functions. Proteins that belong to the mixed α/β folding type are electrostatically better optimized than pure α-helical or β-strand structures. Proteins that are stabilized by disulfide bonds show a lower degree of electrostatic optimization. The electrostatic interactions in a native protein are effectively optimized by rejection of the conformers that lead to repulsive charge-charge interactions. Particularly, the rejection of the repulsive contacts seems to be a major goal in the protein folding process. The dependence of the optimization parameters on the choice of the potential function was tested. The majority of the potential functions gave practically identical results.

Keywords: energy calculations; ion pairs; Monte Carlo simulations; potential functions; protein electrostatics; protein folding

The nature and spatial distribution of charged residues in a folded protein can be considered as an evolutionary solution of 2 different tasks: first, the stabilization of the native structure by the contribution of the electrostatic interactions to the free energy and, second, the display of a functional role by creating a specific electrostatic field, necessary for the enhancement of the enzymatic reactions, intermolecular recognition, and assembly. In principle, the solution of the second task could be opposite to the stabilization effect. The magnitude of the stabilization effect of the other forces governing protein folding could counteract the necessity of significant electrostatic interactions.


“…22] by using a multidimensional scaling (MDS) (23,24) procedure. Now, we have extended the method to a nonredundant protein structure data set from PDB SELECT (25,26) and constructed a "protein structure space" map. We noticed that proteins sharing similar molecular functions are located in the vicinity of each other in the structure space map (SSM).…”

mentioning

confidence: 99%

Global mapping of the protein structure space and application in structure-based inference of protein function

Hou, Jun, Zhang, et al. 2005

Proc. Natl. Acad. Sci. U.S.A.

101


We have constructed a map of the "protein structure space" by using the pairwise structural similarity scores calculated for all nonredundant protein structures determined experimentally. As expected, proteins with similar structures clustered together in the map and the overall distribution of structural classes of this map followed closely that of the map of the "protein fold space" we have reported previously. Consequently, proteins sharing similar molecular functions also were found to colocalize in the protein structure space map, pointing toward a previously undescribed scheme for structure-based functional inference for remote homologues based on the proximity in the map of the protein structure space. We found that this scheme consistently outperformed other predictions made by using either the raw scores or normalized Z-scores of pairwise DALI structure alignment.

Keywords: global map of protein universe; multivariate analysis; protein function prediction; protein structure universe

The molecular functions of a protein can be inferred from either its sequence or structure information. Sequence-based function inference methods annotate molecular function of a protein from its sequence homologues. Most genome-wide functional annotations are carried out with this scheme, by using sequence alignment tools such as BLAST (1), or motif/profile-based search tools such as PROSITE (2, 3) and PFAM (4, 5). However, when two functionally similar proteins do not share detectable sequence homology, molecular function cannot be inferred based solely on sequence information. Low sequence homology results either from an early branching point at the protein evolution (also known as remote homologues) or a convergent evolution. Many studies were focused on the detection of remote homologues (6-8). In general, methods using statistical models extracted from multiply aligned sequences perform better than pairwise sequence comparison methods (9). However, even these improved methods fail to recognize remote homologues with sequence identity <25-30%, which is estimated to be >25% of all sequenced proteins.

Structure-based function inference, however, depends less on sequence information. During protein evolution, homology on sequence level is far less preserved compared with homology on structure level. Because proteins fold into specific structures to perform their molecular functions, structure-based functional inference is able to characterize remote homologous relationships of proteins that are impossible to detect by using sequences. By using different random sampling methods and similarity measuring functions, a large number of structural alignment algorithms have been developed to measure similarity of a pair of protein structures. Among these algorithms, DALI (10), SSAP (11), CE (12), and VAST (13) have been widely used, and their performances have been assessed [see Koehl (14) for a review].

The issue of predicting the function of remote homologues has become more prominent recently: the Structural Genomics initiative (15...

