ProMOL, a plugin for the PyMOL molecular graphics system, is a structure-based protein function prediction tool. ProMOL includes a set of routines for building motif templates that are used to screen query structures for enzyme active sites. Previously, each motif template was generated manually, and optimizing its parameters for sensitivity and selectivity required supervision. We developed an algorithm and workflow that automate the motif building and testing routines in ProMOL. The algorithm uses a set of empirically derived parameters for optimization and requires little user intervention. The automated motif generation algorithm was first tested in a performance comparison against a set of manually generated motifs based on identical active sites from the same 112 PDB entries. The two sets of motifs were equally effective in identifying alignments with homologs and in rejecting alignments with unrelated structures. A second set of 296 active site motifs was then generated automatically, based on Catalytic Site Atlas entries with literature citations, to expand the library of existing manually generated motif templates. The new motif templates performed comparably to the existing ones in terms of hit rates against native structures, homologs with the same EC and Pfam designations, and randomly selected unrelated structures whose EC designation differs at the first EC digit, as well as in terms of RMSD values obtained from local structural alignments of motifs and query structures. This research is supported by NIH grant GM078077.
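The RMSD values mentioned above summarize how closely the atoms of a motif match the corresponding atoms of a query structure. The following is a minimal sketch of that quantity, not ProMOL's own code: it assumes the two atom sets have already been paired one-to-one and superposed by the alignment step, and the function name `rmsd` and the coordinate layout are illustrative choices.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumption: the points are already matched index-by-index and superposed;
    this function performs no fitting or rotation of its own.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")

    # Sum of squared distances between each matched atom pair.
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2

    return math.sqrt(total / len(coords_a))


# Toy coordinates (in angstroms), invented for illustration only.
motif_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
query_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(motif_atoms, query_atoms))
```

A lower value indicates a tighter geometric match between motif and query; any cutoff used to call a hit would be a separate parameter that this sketch does not model.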
ProMOL is a plugin for the molecular visualization program PyMOL, designed to align proposed catalytic residues from proteins of unknown function with those of proteins whose functions are known and catalogued. Roughly 4% of the structures in the Protein Data Bank (PDB) are proteins of unknown function. This study tests the reliability of function prediction by ProMOL, based on an analysis of approximately 10,000 structures of known function. Initial structural alignments using standard settings within ProMOL, with the 296 motif templates in the ProMOL library, properly identified approximately 44% of these structures when homologs were selected solely by EC class. However, when homologs were defined as structures that share both EC class and Pfam family, the identification rate improved significantly, from 44% to 66%. We are refining the algorithm in ProMOL to include sequence alignment in order to further improve the true positive rate.
Support or Funding Information: This work was supported in part by NIH GM078077.
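The effect of the homolog definition on the identification rate can be illustrated with a small sketch. It is not taken from ProMOL: the `Annotation` type, the function names, and the toy records are all hypothetical, and "same EC class" is assumed here to mean an identical EC number string, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    ec: str    # EC number, e.g. "3.4.21.4"
    pfam: str  # Pfam family identifier (placeholder strings below)


def is_homolog(motif, query, require_pfam):
    """Return True if the query counts as a homolog of the motif's source.

    Assumption: EC agreement means identical EC strings. With
    require_pfam=True the Pfam family must match as well.
    """
    same_ec = motif.ec == query.ec
    if not require_pfam:
        return same_ec
    return same_ec and motif.pfam == query.pfam


def identification_rate(results, require_pfam):
    """Fraction of homolog pairs for which an alignment was found.

    results: iterable of (motif_annotation, query_annotation, aligned) tuples,
    where aligned is a bool reporting whether the motif matched the query.
    """
    outcomes = [
        aligned
        for motif, query, aligned in results
        if is_homolog(motif, query, require_pfam)
    ]
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


# Toy data, invented for illustration; not results from the study.
results = [
    (Annotation("3.4.21.4", "PF_A"), Annotation("3.4.21.4", "PF_A"), True),
    (Annotation("3.4.21.4", "PF_A"), Annotation("3.4.21.4", "PF_B"), False),
    (Annotation("3.4.21.4", "PF_A"), Annotation("3.4.21.4", "PF_A"), True),
    (Annotation("1.1.1.1", "PF_C"), Annotation("1.1.1.1", "PF_D"), False),
]

print(identification_rate(results, require_pfam=False))
print(identification_rate(results, require_pfam=True))
```

In this toy set, requiring a shared Pfam family removes the pairs that agree on EC number but belong to different families, so the rate is computed over a smaller, more closely related set of homologs.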