Gene expression measurements represent the most important source of biological data used to unveil the interaction and functionality of genes. In this regard, several data mining and machine learning algorithms have been proposed that require, in a number of cases, some kind of data discretization to perform the inference. Selecting an appropriate discretization process has a major impact on the design and outcome of the inference algorithms, as there are a number of relevant issues that need to be considered. This study presents a review of the current state-of-the-art discretization techniques, together with the key issues that need to be considered when designing or selecting a discretization approach for gene expression data.
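To make the idea of discretization concrete, the following is a minimal sketch of three generic schemes (equal-width bins, equal-frequency bins, and a mean threshold). These are common textbook approaches chosen here for illustration; the abstract does not state which techniques the study covers, and the function names and sample values are hypothetical.

```python
from statistics import mean


def equal_width_bins(values, n_bins=3):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything lands in a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum sits on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins=3):
    """Assign bins by rank so each bin holds roughly the same number of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels


def mean_threshold(values):
    """Binarize: 1 if the value is above the profile mean, else 0."""
    m = mean(values)
    return [1 if v > m else 0 for v in values]


# Hypothetical expression profile for one gene across six samples.
profile = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0]
width_labels = equal_width_bins(profile, n_bins=2)
freq_labels = equal_frequency_bins(profile, n_bins=2)
binary_labels = mean_threshold(profile)
```

With a single outlier in the profile, the equal-width scheme places almost every sample in the lowest bin, while the rank-based scheme splits the samples evenly. This is one example of why the choice of discretization can change what an inference algorithm sees.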
Eragrostis curvula presents mainly facultative genotypes that reproduce by diplosporous apomixis, retaining a percentage of sexual pistils that increases under drought and other stressful situations, indicating that some regulators activated by stress could be affecting the apomixis/sexual switch. Water stress experiments were performed in order to associate the increase in sexual embryo sacs with the differential expression of genes in a facultative apomictic cultivar using cytoembryology and RNA sequencing. The percentage of sexual embryo sacs increased from 4 to 24%, and 501 out of the 201,011 transcripts were differentially expressed (DE) between control and stressed plants. DE transcripts were compared with previous transcriptomes where apomictic and sexual genotypes were contrasted. The results point to candidate transcripts related to methylation, ubiquitination, hormone and signal transduction pathways, transcription regulation and cell wall biosynthesis, some acting as a general response to stress and some that are specific to the reproductive mode. We suggest that a DNA glycosylase EcROS1-like could be demethylating, and thus de-repressing, a gene or genes involved in the sexuality pathways. Many of the other DE transcripts could be part of a complex mechanism that regulates apomixis and sexuality in this grass, with those in the intersection between control/stress and apo/sex being the strongest candidates.
The Poaceae constitute a taxon of flowering plants (grasses) that covers almost all of Earth’s inhabitable range and comprises some of the genera most commonly used for human and animal nutrition. Many of these crops have been sequenced, like rice, Brachypodium, maize and, more recently, wheat. Some important members are still considered orphan crops, lacking a sequenced genome, but having important traits that make them attractive for sequencing. Among these traits is apomixis, clonal reproduction by seeds, present in some members of the Poaceae like Eragrostis curvula. A de novo, high-quality genome assembly and annotation for E. curvula have been obtained by sequencing 602 Mb of a diploid genotype using a strategy that combined long-read length sequencing with chromosome conformation capture. The scaffold N50 for this assembly was 43.41 Mb and the annotation yielded 56,469 genes. The availability of this genome assembly has allowed us to identify regions associated with forage quality and to develop strategies to sequence and assemble the complex tetraploid genotypes which harbor the apomixis control region(s). Understanding and subsequently manipulating the genetic drivers underlying apomixis could revolutionize agriculture.
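For readers unfamiliar with the scaffold N50 statistic quoted above, here is a minimal sketch of how it is conventionally computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The scaffold lengths below are made up for illustration and are not taken from the E. curvula assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one sequence")


# Hypothetical scaffold lengths (arbitrary units), not real assembly data.
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)
```

A large N50 relative to the total assembly size means that most of the assembled sequence sits in a small number of long scaffolds.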
Background: Gene regulatory networks have an essential role in every process of life. In this regard, genome-wide time series data are becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results: This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. In this sense, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method, which inferred highly related, statistically significant gene associations. Conclusions: A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation.
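The notion of a time-delayed relationship between two genes can be illustrated with a much simpler technique than the one described above. The sketch below is not GRNCOP2, which relies on combinatorial optimization of gene profile classifiers; it only scans a few candidate delays and reports the one at which the Pearson correlation between a putative regulator and a target is strongest. All names and the toy series are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def best_lag(regulator, target, max_lag):
    """Return (lag, r) with the largest |r| between regulator[t] and target[t + lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]
        y = target[lag:]
        if len(x) < 3:
            # Too few overlapping time points to say anything useful.
            break
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Toy periodic profile and a copy delayed by two time points (illustrative only).
regulator = [0, 1, 0, -1, 0, 1, 0, -1, 0, 1]
target = [0, 0] + regulator[:-2]
lag, r = best_lag(regulator, target, max_lag=3)
```

In this toy case the scan recovers the two-step delay used to build the target series. Real expression data are noisy and sparsely sampled, which is part of the motivation for dedicated inference methods.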