Parameterization and Classification of the Protein Universe via Geometric Techniques

Tendulkar, Ashish V.; Wangikar, Pramod P.; Sohoni, Milind; Samant, Vivekanand V.; Mone, Chetan Y.

doi:10.1016/j.jmb.2003.09.021

Cited by 19 publications

(19 citation statements)

References 53 publications


“…We have previously demonstrated the utility of geometric invariants in clustering geometrically similar patterns of three to six amino acid residues.23 This eliminates the computationally explosive process of finding out the superposing transformation for all the possible pairs of fragments. The results provide a finer classification of the known secondary structural categories of α-helices and β-strands in addition to providing thousands of new categories of nonregular secondary structures.…”

Section: Prints

mentioning

confidence: 99%

“…Previously, we have described the utility of geometric invariants in detecting recurring structural patterns in the protein universe.23 It is important to note that it is possible to determine the validity of $X \sim Y$ via geometric invariants without actually computing the transformation that superposes X with Y.24-26 The crucial point now is to be able to select a suite of invariants $F(X) = (f_1(X), f_2(X), \ldots, f_k(X))$ that we can use to co-ordinatize the configuration space C. In some norm on $\mathbb{R}^k$, if $\|F(X) - F(Y)\| < \epsilon$, then we declare X and Y to be superimposable. …”

mentioning

confidence: 99%
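
The invariant-based test in the statement above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the invariants are the pairwise distances between the points of a fragment (for example Cα positions) taken in a fixed order, and that the norm is the max norm. The names `invariants` and `superimposable` and the threshold `eps` are hypothetical and are not taken from the cited paper, whose actual suite of invariants is not given here.

```python
import math
from itertools import combinations

def invariants(points):
    """F(X): all pairwise distances in a fixed index order (an assumed choice of invariants)."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def superimposable(x, y, eps=0.5):
    """Declare X ~ Y when the max-norm distance between F(X) and F(Y) is below eps."""
    fx, fy = invariants(x), invariants(y)
    if len(fx) != len(fy):
        return False  # fragments of different length are never compared
    return max((abs(a - b) for a, b in zip(fx, fy)), default=0.0) < eps

# A three-point fragment, a rotated and translated copy of it, and a straight fragment.
x = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0)]
y = [(1.0, 2.0, 3.0), (1.0, 5.8, 3.0), (-2.6, 7.0, 3.0)]  # x rotated 90 degrees about z, then shifted
z = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]

print(superimposable(x, y))  # no superposing transformation is ever computed
print(superimposable(x, z))
```

Pairwise distances are unchanged by rotation and translation, so `x` and `y` map to the same invariant vector. They are also unchanged by reflection, so this particular choice cannot tell mirror images apart; a real invariant suite would likely need a chirality-sensitive term.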

“…23 We select a collection of peptide segment structures, which are known to be superimposable with each other. For each invariant, a tolerance value, Window ($W_i$), is chosen that exceeds the variance in the peptides in the training data.…”

mentioning

confidence: 99%
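
The per-invariant tolerance described in the statement above might be calibrated along the following lines. This is a sketch under stated assumptions: the observed spread (maximum minus minimum) of each invariant over the training peptides stands in for "variance", and a hypothetical `margin` factor makes each window exceed that spread. The function names and the training values are invented for illustration.

```python
def calibrate_windows(training_vectors, margin=1.1):
    """Return one window W_i per invariant: the observed spread scaled by an assumed margin."""
    columns = zip(*training_vectors)  # one column of values per invariant
    return [margin * (max(col) - min(col)) for col in columns]

def within_windows(fx, fy, windows):
    """Two invariant vectors match when every component differs by at most its window."""
    return all(abs(a - b) <= w for a, b, w in zip(fx, fy, windows))

# Invariant vectors of peptides assumed to be mutually superimposable (made-up numbers).
training = [
    [3.80, 6.10, 3.79],
    [3.82, 6.20, 3.81],
    [3.79, 6.05, 3.80],
]
windows = calibrate_windows(training)

print(within_windows(training[0], training[1], windows))
print(within_windows(training[0], [3.80, 7.60, 3.80], windows))
```

Using a separate window per invariant, rather than one global threshold, lets tightly conserved invariants stay strict while naturally variable ones are given more slack.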


Clustering of Protein Structural Fragments Reveals Modular Building Block Approach of Nature

Tendulkar, Joshi, Sohoni, et al. 2004

Journal of Molecular Biology

Self Cite


Structures of peptide fragments drawn from a protein can potentially occupy a vast conformational continuum. We co-ordinatize this conformational space with the help of geometric invariants and demonstrate that the peptide conformations of the currently available protein structures are heavily biased in favor of a finite number of conformational types or structural building blocks. This is achieved by representing a peptide's backbone structure with geometric invariants and then clustering peptides based on closeness of the geometric invariants. This results in 12,903 clusters, of which 2,207 are made up of peptides drawn from functionally and/or structurally related proteins. These are termed "functional" clusters and provide clues about potential functional sites. The rest of the clusters, including the largest few, are made up of peptides drawn from unrelated proteins and are termed "structural" clusters. The largest clusters are of regular secondary structures such as helices and beta strands as well as of beta hairpins. Several categories of helices and strands are discovered based on geometric differences. In addition to the known classes of loops, we discover several new classes, which will be useful in protein structure modeling. Our algorithm does not require assignment of secondary structure and, therefore, overcomes the limitations in loop classification due to ambiguity in secondary structure assignment at loop boundaries.

Keywords: geometric invariants; protein structure comparison; secondary structure; loop

Introduction

Globular proteins are made up of regular secondary structures such as α-helices and β-strands and non-regular secondary structures, which are referred to as loops. The regular secondary structures are defined based on a regular pattern of the backbone torsion angles for consecutive amino acid residues or periodic patterns of hydrogen bonding between the backbone NH and C=O groups.1 Loops are the regions that join the regular secondary structures and lack the regularity of torsion angles for consecutive residues. In fact, loops (polypeptide fragments from proteins that trace a "loop-shaped" path in three-dimensional space) were considered to be random coils until recently.2 Loops are highly compact substructures and are typically situated at the protein surface. Certain loop geometries have been shown to recur across non-homologous proteins, which forms the basis for classification of loops into structural families. For example, the β-hairpin and α-turn families of loops have been described in detail.1,3-7 A much wider classification of loops of various categories has been provided by Kwasigroch et al.8 and by Oliva et al.9 Loop classification has important implications in protein structure modeling and in fitting the NMR distance constraints or an X-ray crystallography electron density map to a loop geometry. In contrast, no such fine-grained classification has been attempted for α-helices and β-strands, although detectable variations are reported in these regular secondar...


“…The template matching is performed via superposition transformations or geometric hashing [24,8]. DReSPat proposed an efficient scheme for template matching using a set of geometric invariant descriptors [33].…”

Section: Related Work

mentioning

confidence: 99%

Functional site prediction by exploiting correlations between labels of interacting residues

Kar, Vijayakeerthi, Tendulkar, et al. 2012

Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine

Self Cite


Functional site prediction is an important problem in the structural genomics era, where we have a large number of experimentally determined protein structures with unknown function. The functional sites provide useful insights into protein function. In this paper, we propose a method for prediction of functional residues in a given protein from its three-dimensional (3D) structure. Our method exploits correlation between labels of interacting residues to obtain significant performance improvements over the existing methods on the benchmark dataset. We represent each protein as a weighted undirected residue interaction network, where spatially proximal residues in terms of their van der Waals radii are connected by an edge. The edge weight captures correlation between the labels of interacting residues. The correlation is estimated based on the features of interacting residues. We then obtain a label assignment by minimizing the combined cost of residue-wise label misclassification and violation of label correlation constraints. We solve this problem in two stages: the first stage minimizes the residue-wise label misclassification cost, and it is followed by an iterative collective inference scheme that adjusts the labels predicted in the first stage so as to minimize the correlation constraint violations. Our approach significantly outperforms state-of-the-art methods on a standard benchmark dataset. It achieves 23.06% precision at 69% recall and 87.78% recall at 18% precision, which translates to an improvement of 5.06 percentage points in precision at 69% recall and of 18.78 percentage points in recall at 18% precision.
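
The two-stage idea in the abstract above can be sketched in a heavily simplified form. Everything here is an assumption made for illustration: edges come from a plain distance cutoff rather than van der Waals radii, all edges carry the same weight instead of a learned correlation, stage-one scores are supplied by hand, and a simple neighbour-averaging update stands in for the paper's collective inference scheme, which the abstract does not specify. The names `build_network` and `collective_inference` are hypothetical.

```python
import math

def build_network(coords, cutoff=6.0):
    """Connect residues whose representative points lie within an assumed distance cutoff."""
    edges = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges[i].append(j)
                edges[j].append(i)
    return edges

def collective_inference(scores, edges, weight=0.5, rounds=10):
    """Blend each residue's stage-one score with the current mean score of its neighbours."""
    current = list(scores)
    for _ in range(rounds):
        updated = []
        for i, own in enumerate(scores):
            nbrs = edges[i]
            if nbrs:
                mean = sum(current[j] for j in nbrs) / len(nbrs)
                updated.append((1 - weight) * own + weight * mean)
            else:
                updated.append(own)  # isolated residues keep their stage-one score
        current = updated
    return [s >= 0.5 for s in current], current

# Four residues: three in a chain of contacts, one far away (made-up coordinates and scores).
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
stage_one = [0.9, 0.4, 0.9, 0.1]

edges = build_network(coords)
labels, adjusted = collective_inference(stage_one, edges)
print(labels)
```

In this toy input the middle residue scores below 0.5 on its own but sits between two confident neighbours, so the second stage relabels it, while the isolated residue is left unchanged.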


“…Another example is given by the active site of the enolase superfamily, which can be accurately characterized by the spatial arrangement of five residues.4 A number of methods [5-16] were developed to identify this type of structural motif, taking advantage of the distance constraints of the conserved residues.…”

Section: Introduction

mentioning

confidence: 99%

The fragment transformation method to detect the protein structural motifs

Lin, Chen, et al. 2006

Proteins


Identifying functional structural motifs in protein structures of unknown function has become increasingly important in recent years due to the progress of the structural genomics initiatives. Although certain structural patterns such as the Asp-His-Ser catalytic triad are easy to detect because of their conserved residues and stringently constrained geometry, it is usually more challenging to detect general structural motifs such as the ββα-metal binding motif, which has a much more variable conformation and sequence. At present, the identification of these motifs usually relies on manual procedures based on different structure and sequence analysis tools. In this study, we develop a structural alignment algorithm combining both structural and sequence information to identify local structural motifs. We applied our method to the following examples: the ββα-metal binding motif and the treble clef motif. The ββα-metal binding motif plays an important role in nonspecific DNA interactions and cleavage in host defense and apoptosis. The treble clef motif is a zinc-binding motif adaptable to diverse functions such as the binding of nucleic acid and hydrolysis of phosphodiester bonds. Our results are encouraging, indicating that we can effectively identify these structural motifs in an automatic fashion. Our method may provide a useful means for automatic functional annotation through detecting structural motifs associated with particular functions. Proteins 2006;63:636-643.
