Sugarcane is an important tropical crop cultivated mainly to produce ethanol and sugar. Crop productivity is negatively affected by Acidovorax avenae subsp. avenae (Aaa), which causes red stripe disease. Little is known about the molecular mechanisms triggered in response to the infection. We investigated the molecular mechanisms activated in sugarcane using an RNA-seq approach. We produced a de novo transcriptome assembly (TR7) from sugarcane RNA-seq libraries subjected to drought and to infection with Aaa. Together, these libraries comprise 247 million raw reads and resulted in 168,767 reference transcripts. Mapping the reads from the infected libraries to TR7 revealed 798 differentially expressed transcripts, of which 723 were annotated, corresponding to 467 genes. GO and KEGG enrichment analyses showed that several pathways and processes, such as those coding for stress-response proteins, carbohydrate metabolism, transcription and translation, amino acid metabolism, and biosynthesis of secondary metabolites, were significantly regulated in sugarcane. Differential analysis revealed that genes in the biosynthetic pathways of ethylene (ET) and jasmonic acid (JA), PRRs, oxidative burst genes, NBS-LRR genes, cell wall fortification genes, SAR-induced genes, and pathogenesis-related (PR) genes were upregulated. In addition, 20 genes were validated by RT-qPCR. Together, these data contribute to a better understanding of the molecular mechanisms triggered by Aaa in sugarcane and open the opportunity for the development of molecular markers associated with disease tolerance in breeding programs.
Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large number of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, biological knowledge about them is still lacking, and the few computational methods currently available are so specific that they cannot be applied successfully to species other than those for which they were originally designed. Prediction of lncRNAs has been performed with machine learning techniques. In particular, supervised learning methods for lincRNA prediction have been explored in recent literature. As far as we know, there are no methods or workflows specifically designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in plants that combines known bioinformatics tools with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed us to identify novel lincRNAs in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to pathogenic and beneficial microorganisms.
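The abstract does not say which features feed the SVM, so the following is only a minimal sketch of a feature-extraction step that could precede such a classifier. It assumes three features often used to separate coding from non-coding transcripts (transcript length, longest forward-strand ORF length, and GC content). The function names and the feature choice are illustrative assumptions, not the authors' workflow.

```python
# Hypothetical feature extraction for a coding/non-coding classifier.
# The feature set below is an assumption; the abstract does not list the
# features used by the SVM.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length (nt) of the longest ATG..stop ORF in the three forward frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
    return best


def gc_content(seq: str) -> float:
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def transcript_features(seq: str) -> list:
    """Feature vector: [length, longest ORF length, GC content]."""
    return [len(seq), longest_orf_length(seq), gc_content(seq)]
```

Vectors of this kind, computed for labeled coding and non-coding transcripts, would form the training matrix for an SVM; a short longest ORF relative to transcript length is one common hint that a transcript is non-coding.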
DNA sequencers output a large set of very long biological data strings that we should persist in databases rather than in basic text file systems. Many different data models and database management systems (DBMS) may deal with both storage and efficiency issues regarding genomic datasets. Specifically, there is a need for handling strings of variable size while keeping their biological meaning. Relational database management systems (RDBMS) provide several data types that could be further explored in the genomics context. In addition, they enforce integrity and consistency and enable good abstractions for more conventional data. We propose using the relational text data type to represent and manipulate biological sequences and their derivatives. We present a logical schema for representing the core biological information, which may be inferred from a given biological conceptual data schema, together with the corresponding manipulation functions. We implement these stored functions in an actual RDBMS and evaluate them for both efficacy and efficiency. We show that it is possible to enforce basic and complex requirements for the genomic domain. We claim that the well-established relational text data type in RDBMS may appropriately handle the representation and persistence of biological sequences. We base our approach on the idea of domain-specific abstract data types that can store data with semantically defined functions while hiding those details from non-technical end-users.
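As a rough illustration of the idea, the sketch below stores a DNA sequence in a text column, enforces a domain constraint on its alphabet, and exposes a sequence function to SQL. It uses SQLite through Python's `sqlite3` module as a stand-in: the abstract does not name the RDBMS, and SQLite offers application-defined functions rather than true stored functions. The table name, column names, and the `reverse_complement` function are hypothetical.

```python
import sqlite3

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


conn = sqlite3.connect(":memory:")

# Expose the Python function to SQL (stand-in for a stored function).
conn.create_function("reverse_complement", 1, reverse_complement)

# Hypothetical schema: the CHECK constraint rejects empty sequences and
# any character outside the DNA alphabet A, C, G, T.
conn.execute(
    """
    CREATE TABLE dna_sequence (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE,
        residues TEXT NOT NULL
                 CHECK (length(residues) > 0
                        AND residues NOT GLOB '*[^ACGT]*')
    )
    """
)

conn.execute(
    "INSERT INTO dna_sequence (name, residues) VALUES (?, ?)",
    ("demo", "ATGC"),
)

# The sequence function is called from SQL like any built-in.
row = conn.execute(
    "SELECT reverse_complement(residues) FROM dna_sequence WHERE name = ?",
    ("demo",),
).fetchone()
print(row[0])
```

Keeping the alphabet check in the schema means an invalid sequence is rejected at insert time, whichever client writes the data, which is one way the integrity argument in the abstract can be realized.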