Prediction of protein function is of significance in studying biological processes. One approach for function prediction is to classify a protein into functional family. Support vector machine (SVM) is a useful method for such classification, which may involve proteins with diverse sequence distribution. We have developed a web-based software, SVMProt, for SVM classification of a protein into functional family from its primary sequence. SVMProt classification system is trained from representative proteins of a number of functional families and seed proteins of Pfam curated protein families. It currently covers 54 functional families and additional families will be added in the near future. The computed accuracy for protein family classification is found to be in the range of 69.1-99.6%. SVMProt shows a certain degree of capability for the classification of distantly related proteins and homologous proteins of different function and thus may be used as a protein function prediction tool that complements sequence alignment methods. SVMProt can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
A number of proteins and nucleic acids have been explored as therapeutic targets. These targets are subjects of interest in different areas of biomedical and pharmaceutical research and in the development and evaluation of bioinformatics, molecular modeling, computer-aided drug design and analytical tools. A publicly accessible database that provides comprehensive information about these targets is therefore helpful to the relevant communities. The Therapeutic Target Database (TTD) is designed to provide information about the known therapeutic protein and nucleic acid targets described in the literature, the targeted disease conditions, the pathway information and the corresponding drugs/ligands directed at each of these targets. Cross-links to other databases are also introduced to facilitate the access of information about the sequence, 3D structure, function, nomenclature, drug/ligand binding properties, drug usage and effects, and related literature for each target. This database can be accessed at http://xin.cz3.nus.edu.sg/group/ttd/ttd.asp and it currently contains entries for 433 targets covering 125 disease conditions along with 809 drugs/ligands directed at each of these targets. Each entry can be retrieved through multiple methods including target name, disease name, drug/ligand name, drug/ligand function and drug therapeutic classification.
One approach for facilitating protein function prediction is to classify proteins into functional families. Recent studies on the classification of G-protein coupled receptors and other proteins suggest that a statistical learning method, Support vector machines (SVM), may be potentially useful for protein classification into functional families. In this work, SVM is applied and tested on the classification of enzymes into functional families defined by the Enzyme Nomenclature Committee of IUBMB. SVM classification system for each family is trained from representative enzymes of that family and seed proteins of Pfam curated protein families. The classification accuracy for enzymes from 46 families and for non-enzymes is in the range of 50.0% to 95.7% and 79.0% to 100% respectively. The corresponding Matthews correlation coefficient is in the range of 54.1% to 96.1%. Moreover, 80.3% of the 8,291 correctly classified enzymes are uniquely classified into a specific enzyme family by using a scoring function, indicating that SVM may have certain level of unique prediction capability. Testing results also suggest that SVM in some cases is capable of classification of distantly related enzymes and homologous enzymes of different functions. Effort is being made to use a more comprehensive set of enzymes as training sets and to incorporate multi-class SVM classification systems to further enhance the unique prediction accuracy. Our results suggest the potential of SVM for enzyme family classification and for facilitating protein function prediction. Our software is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
A new application of the genetic algorithm approach is introduced to solve printed circuit board assembly planning problems. The developed genetic algorithm finds the sequence of component placement/insertion and the arrangement of feeders simultaneously, for achieving the shortest assembly time, for three main types of assembly machines. The algorithm uses links (parents) to represent possible solutions and it applies genetic operators to generate new links (offspring) in an iterative procedure to obtain nearly optimal solutions. Examples are provided to illustrate solutions generated by the algorithm.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.