* Background
In the search for therapeutic peptides for disease treatment, considerable effort has gone into identifying functional peptides in large peptide sequence databases. In this paper, we propose an effective computational model that uses deep learning and word2vec to predict therapeutic peptides (PTPD).

* Results
Representation vectors of all k-mers were obtained through word2vec based on k-mer co-existence information. The original peptide sequences were then divided into k-mers using the windowing method. The peptide sequences were mapped to the input layer through the embedding vectors obtained from word2vec. Three types of filters in the convolutional layers, as well as dropout and max-pooling operations, were applied to construct feature maps. These feature maps were concatenated into a fully connected dense layer, and rectified linear units (ReLU) and dropout operations were included to avoid over-fitting of PTPD. The classification probabilities were generated by a sigmoid function. PTPD was then validated using two datasets: an independent anticancer peptide dataset and a virulent protein dataset, on which it achieved accuracies of 96% and 94%, respectively.

* Conclusions
PTPD identified novel therapeutic peptides efficiently, and it is suitable for application as a useful tool in therapeutic peptide design.
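The windowing step described in the Results above can be illustrated with a minimal sketch. The function names, the choice of k = 3, and the stride of 1 are assumptions made for illustration only; the abstract does not state the values PTPD uses, and the integer ids here merely stand in for rows of a word2vec embedding table.

```python
def kmer_windows(seq, k=3, stride=1):
    """Split a peptide sequence into overlapping k-mers with a sliding window.

    k and stride are illustrative defaults, not values taken from the paper.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    # A sequence shorter than k yields no windows.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(kmer_lists):
    """Assign each distinct k-mer an integer id in order of first appearance.

    In a full pipeline these ids would index rows of an embedding matrix.
    """
    vocab = {}
    for kmers in kmer_lists:
        for kmer in kmers:
            if kmer not in vocab:
                vocab[kmer] = len(vocab)
    return vocab


def encode(seq, vocab, k=3, stride=1):
    """Map a sequence to the ids of its k-mers, skipping unseen k-mers."""
    return [vocab[m] for m in kmer_windows(seq, k, stride) if m in vocab]


if __name__ == "__main__":
    peptides = ["ACDEFG", "CDEFGH"]  # hypothetical toy sequences
    vocab = build_vocab(kmer_windows(p) for p in peptides)
    print(kmer_windows("ACDEFG"))
    print(encode("CDEFGH", vocab))
```

A sequence of length n yields n - k + 1 windows at stride 1, so the id list passed to an embedding layer would normally be padded or truncated to a fixed length before the convolutional layers.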
The dominant approaches for named entity recognition (NER) mostly adopt complex recurrent neural networks (RNN), e.g., long short-term memory (LSTM). However, RNNs are limited by their recurrent nature in terms of computational efficiency. In contrast, convolutional neural networks (CNN) can fully exploit GPU parallelism with their feedforward architectures. Nevertheless, little attention has been paid to performing NER with CNNs, mainly owing to their difficulty in capturing long-term context information in a sequence. In this paper, we propose a simple but effective CNN-based network for NER, i.e., gated relation network (GRN), which is more capable than common CNNs of capturing long-term context. Specifically, in GRN we first employ CNNs to explore the local context features of each word. Then we model the relations between words and use them as gates to fuse local context features into global ones for predicting labels. Without using recurrent layers that process a sentence in a sequential manner, our GRN allows computations to be performed in parallel across the entire sentence. Experiments on two benchmark NER datasets (i.e., CoNLL-2003 and Ontonotes 5.0) show that our proposed GRN can achieve state-of-the-art performance with or without external knowledge. It also takes less time to train and test. We have made the code publicly available at https://github.com
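The idea of using word-to-word relations as gates that fuse local features into global ones can be sketched roughly as follows. This is not the GRN formulation: the abstract does not give the gate equations, so the scalar dot-product gate, the mean over positions, and the function names below are all hypothetical simplifications, and the local features are assumed to come from an earlier CNN layer.

```python
import math


def sigmoid(z):
    """Logistic function, used here to squash a relation score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(local_feats):
    """Fuse per-word local features into global features via pairwise gates.

    local_feats: list of equal-length float vectors, one per word.
    For each word i, every word j contributes its local feature scaled by
    a gate computed from the pair (i, j). The dot-product gate is an
    assumption for illustration, not the scoring function of the paper.
    """
    n = len(local_feats)
    fused = []
    for xi in local_feats:
        acc = [0.0] * len(xi)
        for xj in local_feats:
            gate = sigmoid(sum(a * b for a, b in zip(xi, xj)))
            for d, value in enumerate(xj):
                acc[d] += gate * value
        # Average over all positions so the scale does not grow with length.
        fused.append([v / n for v in acc])
    return fused


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]  # toy local features for two words
    print(gated_fusion(feats))
```

Each output position depends only on the fixed set of local features and not on the output of a previous position, which is why this kind of fusion can be computed for all words at once, unlike a recurrent layer.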
mRNA m5C, which has recently been implicated in the regulation of mRNA mobility, metabolism, and translation, plays important regulatory roles in various biological events. Two types of m5C sites are found in mRNAs. Type I m5C sites, which contain a downstream G-rich triplet motif and are computationally predicted to be located at the 5’ end of putative hairpin structures, are methylated by NSUN2. Type II m5C sites contain a downstream UCCA motif and are computationally predicted to be located in the loops of putative hairpin structures. However, their biogenesis remains unknown. Here we identified NSUN6, a methyltransferase that is known to methylate C72 of tRNAThr and tRNACys, as an mRNA methyltransferase that targets Type II m5C sites. Combining RNA secondary structure prediction, miCLIP, and results from a high-throughput mutagenesis analysis, we determined the RNA sequence and structural features governing the specificity of NSUN6-mediated mRNA methylation. Integrating these features into an NSUN6-RNA structural model, we identified an NSUN6 variant that largely loses tRNA methylation but retains mRNA methylation ability. Finally, we revealed a weak negative correlation between m5C methylation and translation efficiency. Our findings reveal that mRNA m5C is tightly controlled by an elaborate two-enzyme system, and the protein-RNA structure analysis strategy established here may be applied to other RNA modification writers to distinguish the functions of different RNA substrates of a writer protein.
Plasmid conjugation is one of the dominant mechanisms of horizontal gene transfer, playing a notable role in the rapid spread of antibiotic resistance genes (ARGs). Broad host range plasmids are known to transfer to diverse bacteria in extracted soil bacterial communities when evaluated by filter mating incubation. However, the persistence and dissemination of broad host range plasmids in natural soil have not been well studied. In this study, Pseudomonas putida carrying the conjugative antibiotic resistance plasmid RP4 was inoculated into a soil microcosm, and the fate and persistence of P. putida and RP4 were monitored by quantitative PCR. The concentrations of P. putida and RP4 both decreased rapidly within 15 days of incubation. P. putida then decayed at a significantly lower rate during subsequent incubation. However, no further decay of RP4 was observed, resulting in an elevated RP4/P. putida ratio (up to 10) after 75 days of incubation, which implied potential transfer of RP4 to soil microbiota. We further sorted RP4 recipient bacteria from the soil microcosms by fluorescence-activated cell sorting. Spread of RP4 increased during the 75-day microcosm operation and was estimated at around 10⁻⁴ transconjugants per recipient at the end of incubation. Analysis of 16S rRNA gene sequences of transconjugants showed that host bacteria of RP4 were affiliated with more than 15 phyla, with increased diversity and a shift in the composition of host bacteria. Proteobacteria was the most dominant phylum in the transconjugant pools. Transient transfer of RP4 to some host bacteria was observed. These results emphasize the prolonged persistence of P. putida and RP4 in natural soil microcosms, and highlight the potential risks that increased plasmid spread and a broader range of host bacteria pose for the dissemination of ARGs in soil.