Background: Prediction of protein solvent accessibility, also called accessible surface area (ASA) prediction, is an important step for tertiary structure prediction directly from one-dimensional sequences. Traditionally, predicting solvent accessibility is regarded as either a two-state (exposed or buried) or three-state (exposed, intermediate or buried) classification problem. However, the states of solvent accessibility are not well defined in real protein structures. Thus, a number of methods have been developed to directly predict the real-valued ASA based on evolutionary information such as the position-specific scoring matrix (PSSM).
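To illustrate why fixed states are hard to define, here is a minimal sketch that maps a relative solvent accessibility value (0 to 1) onto two or three states. The cutoffs used here (0.25 for two states, 0.09 and 0.36 for three states) and the function names are assumptions chosen for illustration; the abstract does not specify any thresholds.

```python
from typing import Sequence

TWO_STATE = ("buried", "exposed")
THREE_STATE = ("buried", "intermediate", "exposed")


def accessibility_state(rsa: float, thresholds: Sequence[float]) -> str:
    """Map a relative solvent accessibility value to a discrete state.

    `thresholds` holds one cutoff (two-state) or two cutoffs (three-state).
    The cutoff values themselves are illustrative assumptions.
    """
    if len(thresholds) not in (1, 2):
        raise ValueError("expected one or two thresholds")
    names = TWO_STATE if len(thresholds) == 1 else THREE_STATE
    # Count how many cutoffs the value reaches; that count is the state index.
    index = sum(1 for t in sorted(thresholds) if rsa >= t)
    return names[index]


if __name__ == "__main__":
    # Two residues with nearly identical accessibility land in different classes.
    for value in (0.24, 0.26):
        print(value, accessibility_state(value, [0.25]))
    print(0.20, accessibility_state(0.20, [0.09, 0.36]))
```

Two residues whose accessibility differs by only 0.02 fall on opposite sides of the assumed two-state cutoff, which is the kind of arbitrariness that real-value prediction avoids.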
This study presents the Yeast Promoter Atlas (YPA, http://ypa.ee.ncku.edu.tw/ or http://ypa.csbb.ntu.edu.tw/) database, which aims to collect comprehensive promoter features in Saccharomyces cerevisiae. YPA integrates nine kinds of promoter features: promoter sequences, genes' transcription boundaries [transcription start sites (TSSs), five prime untranslated regions (5′-UTRs) and three prime untranslated regions (3′-UTRs)], TATA boxes, transcription factor binding sites (TFBSs), nucleosome occupancy, DNA bendability, transcription factor (TF) binding, TF knockout expression and TF–TF physical interaction. YPA is designed to present data in a unified manner because many important observations are revealed only when these promoter features are considered together. For example, DNA rigidity can prevent nucleosome packaging, thereby making TFBSs in rigid DNA regions more accessible to TFs. Integrating nucleosome occupancy, DNA bendability, TF binding, TF knockout expression and TFBS data helps to identify which TFBSs are actually functional. In YPA, the various promoter features can be accessed on a centralized and organized platform. Researchers can easily see whether the TFBSs in a promoter of interest are occupied by nucleosomes or located in a rigid DNA segment, and whether the expression of the downstream gene responds to the knockout of the corresponding TFs. Compared with other established yeast promoter databases, YPA collects not only TFBSs but also many other promoter features to help biologists study transcriptional regulation.
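The kind of combined query described above, checking whether a TFBS is occupied by a nucleosome, can be sketched as an interval-overlap test. This is a minimal illustration only: the `Interval` type, the half-open coordinate convention and the example coordinates are assumptions, not YPA's data model or interface.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A half-open genomic interval [start, end) on one promoter (hypothetical type)."""
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two half-open intervals share at least one position."""
    return a.start < b.end and b.start < a.end


def nucleosome_free_sites(tfbs: List[Interval],
                          nucleosomes: List[Interval]) -> List[Interval]:
    """Keep only the binding sites that overlap no nucleosome."""
    return [site for site in tfbs
            if not any(overlaps(site, nuc) for nuc in nucleosomes)]


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    sites = [Interval(100, 110), Interval(300, 312)]
    nucs = [Interval(50, 197)]
    print(nucleosome_free_sites(sites, nucs))
```

A real analysis would combine this with the other features listed above, such as DNA bendability and TF knockout expression, before calling a site functional.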
Background: MicroRNAs (miRNAs) are short non-coding RNA molecules that participate in posttranscriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they can discover species-specific pre-miRNAs. Most ab initio approaches propose novel features to characterize RNA molecules. However, there has been less discussion of the associated classification mechanism in a miRNA predictor.
Geometrical analysis of protein tertiary substructures has been an effective approach to predicting protein binding sites. This article presents the Protemot web server, which predicts protein binding sites based on structural templates automatically extracted from the crystal structures of protein–ligand complexes in the PDB (Protein Data Bank). The automatic extraction mechanism is essential for creating and maintaining a comprehensive template library that keeps pace with new PDB releases as the number of entries continues to grow rapidly. The design of Protemot is also distinctive in the mechanism it employs to expedite the analysis that matches the tertiary substructures on the contour of the query protein against the templates in the library. This expediting mechanism is essential for providing a reasonable response time to the user, because the template library grows along with the PDB. This article also reports the experiments conducted to evaluate the prediction power delivered by the Protemot web server. Experimental results show that Protemot delivers greater prediction power than a web server based on a manually curated template library with an insufficient number of entries. Availability: .
Annotating protein functions and linking proteins with similar functions are important in systems biology. The rapid growth in the number of newly sequenced genomes calls for the development of computational methods to complement experimental techniques. Phylogenetic profiling (PP) is a method that exploits evolutionary co-occurrence patterns to identify functionally related proteins. However, PP-based methods have delivered satisfactory performance only on prokaryotes, not on eukaryotes. This study proposes a two-stage framework to predict protein functional linkages, which successfully enhances a PP-based method with machine learning. The experimental results show that the proposed two-stage framework achieves the best overall performance in comparison with three PP-based methods.
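As a rough illustration of the co-occurrence idea behind phylogenetic profiling, the sketch below represents each protein as a binary presence/absence vector across genomes and scores a pair by Jaccard similarity. This shows only the basic profile comparison, not the two-stage framework proposed in the study; the similarity measure, function name and example profiles are assumptions.

```python
from typing import Sequence


def profile_similarity(p: Sequence[int], q: Sequence[int]) -> float:
    """Jaccard similarity between two binary phylogenetic profiles.

    Each profile has one entry per reference genome: 1 if a homolog of the
    protein is present in that genome, 0 otherwise.
    """
    if len(p) != len(q):
        raise ValueError("profiles must cover the same set of genomes")
    both = sum(1 for a, b in zip(p, q) if a and b)      # present in both
    either = sum(1 for a, b in zip(p, q) if a or b)     # present in at least one
    return both / either if either else 0.0


if __name__ == "__main__":
    # Made-up profiles over five genomes, purely for illustration.
    protein_a = [1, 1, 0, 1, 0]
    protein_b = [1, 1, 0, 0, 0]
    print(round(profile_similarity(protein_a, protein_b), 3))
```

Pairs with highly similar profiles are candidates for functional linkage; a score like this could serve as one input feature to a downstream learning stage.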