Background: MicroRNAs (miRNAs) are small non-coding RNA molecules, about 22 nt long, that are capable of suppressing protein synthesis. Previous research has suggested that miRNAs regulate 30% or more of human protein-coding genes. The aim of this work is to consider various analysis scenarios for identifying miRNA-target interactions, and to provide an integrated system that facilitates investigation of how alternative splicing influences miRNA targets and of the biological functions of miRNAs in biological pathways. Results: This work presents an integrated system, miRTar, which adopts various analysis scenarios to identify putative miRNA target sites in gene transcripts and elucidates the biological functions of miRNAs toward their targets in biological pathways. The system has three major features. First, the prediction system can consider various analysis scenarios (1 miRNA:1 gene, 1:N, N:1, N:M, all miRNAs:N genes, and N miRNAs:genes involved in a pathway) to identify the regulatory relationships between miRNAs of interest and their targets in 3'UTR, 5'UTR and coding regions. Second, miRTar can analyze and highlight a group of miRNA-regulated genes that participate in particular KEGG pathways to elucidate the biological roles of miRNAs in those pathways. Third, miRTar can provide further information on how miRNA regulation, i.e., miRNA-target interactions, is affected by alternative splicing. Conclusions: In this work, we developed an integrated resource, miRTar, to enable biologists to easily identify the biological functions and regulatory relationships between a group of known/putative miRNAs and protein-coding genes. miRTar is now available at http://miRTar.mbc.nctu.edu.tw/.
Protein post-translational modifications (PTMs) play an important role in different cellular processes. In view of the importance of PTMs in cellular functions and the massive data accumulated by the rapid development of mass spectrometry (MS)-based proteomics, this paper presents an update of dbPTM with over 2 777 000 PTM substrate sites obtained from existing databases and manual curation of the literature, of which more than 2 235 000 entries are experimentally verified. This update has manually curated over 42 new modification types that were not included in the previous version. Owing to the increasing number of studies on the mechanisms of PTMs in the past few years, many upstream regulatory proteins of PTM substrate sites have been identified. The updated dbPTM thus collates regulatory information from databases and the literature, and merges it into a protein-protein interaction network. To enhance the understanding of the association between PTMs and molecular functions/cellular processes, functional annotations of PTMs are curated and integrated into the database. In addition, the existing PTM-related resources, including annotation databases and prediction tools, are also updated. Overall, this update aims to provide users with the most abundant data and comprehensive annotations on protein PTMs. The updated dbPTM is now freely accessible at https://awi.cuhk.edu.cn/dbPTM/.
MicroRNAs (miRNAs) are involved in various biological processes by suppressing gene expression. Recent work has indicated that host miRNAs are also capable of regulating viral gene expression by targeting virus genomes. To investigate regulatory relationships between host miRNAs and related viruses, we present a novel database, ViTa, which curates the known virus miRNA genes and the known/putative target sites of human, mouse, rat and chicken miRNAs. Known miRNAs are obtained from miRBase. Virus data are collected from ICTVdB, VBRC and VirGen. Experimentally validated miRNA targets on viruses are derived from the literature. miRanda and TargetScan are then used to predict miRNA targets within virus genomes. ViTa also provides virus annotations, virus-infected tissues and the tissue specificity of host miRNAs. This work also facilitates comparisons between virus subtypes, such as influenza viruses and human liver viruses, and of the conserved regions between viruses. Both textual and graphical web interfaces are provided to facilitate data retrieval from the ViTa database. The database is now freely available.
Succinylation is a type of protein post-translational modification (PTM) that plays important roles in a variety of cellular processes. Owing to the increasing number of site-specific succinylated peptides obtained from high-throughput mass spectrometry (MS), various tools have been developed for computationally identifying succinylated sites on proteins. However, most of these tools predict succinylation sites using traditional machine learning methods. Hence, this work aimed to carry out succinylation site prediction with a deep learning model. The abundance of MS-verified succinylated peptides enabled the investigation of the substrate site specificity of succinylation sites through sequence-based attributes, such as position-specific amino acid composition, the composition of k-spaced amino acid pairs (CKSAAP), and the position-specific scoring matrix (PSSM). Additionally, maximal dependence decomposition (MDD) was adopted to detect the substrate signatures of lysine succinylation sites by dividing all succinylated sequences into several groups with conserved substrate motifs. According to the results of ten-fold cross-validation, the deep learning model trained using PSSM and informative CKSAAP attributes reached the best predictive performance and also performed better than traditional machine learning methods. Moreover, an independent testing dataset that did not overlap with the training dataset was used to compare the proposed method with six existing prediction tools. The testing dataset comprised 218 positive and 2621 negative instances, and the proposed model yielded a promising performance with 84.40% sensitivity, 86.99% specificity, 86.79% accuracy, and an MCC value of 0.489. Finally, the proposed method has been implemented as a web-based prediction tool (CNN-SuccSite), which is now freely accessible at http://csb.cse.yzu.edu.tw/CNN-SuccSite/.
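As a rough consistency check on the reported test-set figures, the sketch below recomputes sensitivity, specificity, accuracy and MCC from confusion-matrix counts. The counts themselves are not given in the abstract; they are an assumption, inferred here by multiplying the rounded rates by the 218 positive and 2621 negative instances. The function name `confusion_metrics` is hypothetical.

```python
import math

def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Dataset sizes reported in the abstract.
POSITIVES, NEGATIVES = 218, 2621

# Assumption: counts inferred from the rounded sensitivity and specificity.
tp = round(0.8440 * POSITIVES)
tn = round(0.8699 * NEGATIVES)
fn = POSITIVES - tp
fp = NEGATIVES - tn

sens, spec, acc, mcc = confusion_metrics(tp, fn, tn, fp)
print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
print(f"sensitivity={sens:.4f} specificity={spec:.4f} "
      f"accuracy={acc:.4f} MCC={mcc:.4f}")
```

Under these inferred counts the recomputed values agree with the reported sensitivity, specificity and accuracy to two decimal places, and the MCC comes out close to the reported 0.489; the small difference in the last digit is likely down to rounding or truncation. With roughly twelve negatives per positive, MCC is a more informative summary than accuracy alone.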
Background: Glutarylation, the addition of a glutaryl group (five carbons) to a lysine residue of a protein molecule, is an important post-translational modification and plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified glutarylated peptides increases, it becomes imperative to investigate substrate motifs to enhance the study of protein glutarylation. We carried out a bioinformatics investigation of glutarylation sites based on amino acid composition using a public database containing information on 430 non-homologous glutarylation sites. Results: The TwoSampleLogo analysis indicates that positively charged and polar amino acids surrounding glutarylated sites may be associated with the specificity in substrate site of protein glutarylation. Additionally, the chi-squared test was utilized to explore the intrinsic interdependence between two positions around glutarylation sites. Further, maximal dependence decomposition (MDD), which consists of partitioning a large-scale dataset into subgroups with statistically significant amino acid conservation, was used to capture motif signatures of glutarylation sites. We considered single features, such as amino acid composition (AAC), amino acid pair composition (AAPC), and composition of k-spaced amino acid pairs (CKSAAP), as well as the effectiveness of incorporating MDD-identified substrate motifs into an integrated prediction model. Evaluation by five-fold cross-validation showed that AAC was most effective in discriminating between glutarylation and non-glutarylation sites, according to support vector machine (SVM). Conclusions: The SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.677, a specificity of 0.619, an accuracy of 0.638, and a Matthews Correlation Coefficient (MCC) value of 0.28. 
Using an independent testing dataset (46 glutarylated and 92 non-glutarylated sites) obtained from the literature, we demonstrated that the integrated SVM model could improve the predictive performance effectively, yielding a balanced sensitivity and specificity of 0.652 and 0.739, respectively. This integrated SVM model has been implemented as a web-based system (MDDGlutar), which is now freely available at http://csb.cse.yzu.edu.tw/MDDGlutar/.
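The sequence features named above (AAC and CKSAAP) can be illustrated with a minimal sketch. This is not the authors' implementation: the window sequence, the maximum gap `k_max`, and the choice to normalize by the number of valid pairs are all assumptions made for illustration, and the function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def aac(window):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    residues = [r for r in window if r in AMINO_ACIDS]
    n = len(residues) or 1  # avoid division by zero on empty windows
    return [residues.count(a) / n for a in AMINO_ACIDS]

def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    For each k, counts residue pairs separated by k positions and
    normalizes by the number of valid pairs (an assumed convention).
    Non-standard characters such as padding are skipped.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features

# Hypothetical window centred on a lysine (K); not taken from the dataset.
window = "AGLSEKAVRDT"
print(len(aac(window)), len(cksaap(window, k_max=2)))
```

Each value of k contributes 400 features, so `k_max=2` yields a 1200-dimensional vector per window, while AAC contributes 20. Vectors like these would then be passed to a classifier such as an SVM.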