Identifying co-expressed gene clusters can provide evidence for genetic or physical interactions. Thus, co-expression clustering is a routine step in large-scale analyses of gene expression data. We show that commonly used clustering methods produce results that substantially disagree and that do not match the biological expectations of co-expressed gene clusters. We present clust, a method that solves these problems by extracting clusters matching the biological expectations of co-expressed genes and outperforms widely used methods. Additionally, clust can simultaneously cluster multiple datasets, enabling users to leverage the large quantity of public expression data for novel comparative analysis. Clust is available at https://github.com/BaselAbujamous/clust. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1536-8) contains supplementary material, which is available to authorized users.
Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.
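The aggregate-then-binarize idea described above can be illustrated with a minimal sketch. It assumes the individual partitions are already expressed as K x N membership matrices with their clusters aligned (row k means the same cluster in every partition), and it uses two illustrative difference-based binarization rules with a tuning parameter `delta`. The function names, the `delta` parameter, and the exact rules are assumptions made for illustration, not the authors' definitions of the Bi-CoPaM binarization techniques.

```python
import numpy as np

def consensus_partition_matrix(partitions):
    """Average aligned K x N membership matrices into one fuzzy consensus matrix."""
    stacked = np.stack([np.asarray(p, dtype=float) for p in partitions])
    return stacked.mean(axis=0)

def binarize_tight(copam, delta):
    """Hypothetical 'tight' rule: keep a gene in its top cluster only if that
    membership beats the runner-up by at least delta; otherwise leave it unassigned.
    Requires at least two clusters (K >= 2)."""
    k, n = copam.shape
    sorted_vals = np.sort(copam, axis=0)          # ascending along the cluster axis
    top, second = sorted_vals[-1], sorted_vals[-2]
    winners = copam.argmax(axis=0)
    keep = (top - second) >= delta
    binary = np.zeros((k, n), dtype=bool)
    binary[winners[keep], np.nonzero(keep)[0]] = True
    return binary

def binarize_wide(copam, delta):
    """Hypothetical 'wide' rule: assign a gene to every cluster whose non-zero
    membership lies within delta of the gene's top membership."""
    top = copam.max(axis=0, keepdims=True)
    return (top - copam <= delta) & (copam > 0)

# Toy input: three aligned binary partitions of four genes into two clusters.
partitions = [
    [[1, 1, 0, 0], [0, 0, 1, 1]],
    [[1, 0, 0, 0], [0, 1, 1, 1]],
    [[1, 1, 0, 1], [0, 0, 1, 0]],
]
copam = consensus_partition_matrix(partitions)
tight = binarize_tight(copam, delta=0.5)
wide = binarize_wide(copam, delta=0.5)
print(copam)
print(tight.astype(int))
print(wide.astype(int))
```

In this toy case the tight rule leaves the two genes on which the partitions disagree unassigned, while the wide rule places them in both clusters, which mirrors the three eventualities named in the abstract: exclusive assignment, multiple assignment, and no assignment.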
This study proposes a novel approach for the analysis of brain responses in the modality of ongoing EEG elicited by a naturalistic and continuous music stimulus. The 512-second-long EEG data (recorded with 64 electrodes) are first decomposed into 64 components by independent component analysis (ICA) for each participant. Then, the spatial maps showing dipolar brain activity are selected in terms of the residual dipole variance through a single dipole model in brain imaging, and clustered into a pre-defined number (estimated by the minimum description length) of clusters. Subsequently, the temporal courses of the EEG theta and alpha oscillations of each component for each cluster are produced and correlated with the temporal courses of tonal and rhythmic features of the music. Using this approach, we found that the extracted temporal courses of the theta and alpha oscillations over the central and occipital areas of the scalp in two of the selected clusters significantly correlated with the musical features representing progressions in the rhythmic content of the stimulus. We suggest that this demonstrates that, with the proposed approach, we have managed to discover what kinds of brain responses were elicited when a participant was listening continuously to the long piece of naturalistic music.
Background: Human-induced pluripotent stem cells (hiPSCs) are a potentially invaluable resource for regenerative medicine, including the in vitro manufacture of blood products. HiPSC-derived red blood cells are an attractive therapeutic option in hematology, yet exhibit unexplained proliferation and enucleation defects that presently preclude such applications. We hypothesised that substantial differential regulation of gene expression during erythroid development accounts for these important differences between hiPSC-derived cells and those from adult or cord-blood progenitors. We thus cultured erythroblasts from each source for transcriptomic analysis to investigate differential gene expression underlying these functional defects. Results: Our high resolution transcriptional view of definitive erythropoiesis captures the regulation of genes relevant to cell-cycle control and confers statistical power to deploy novel bioinformatics methods. Whilst the dynamics of erythroid program elaboration from adult and cord blood progenitors were very similar, the emerging erythroid transcriptome in hiPSCs revealed radically different program elaboration compared to adult and cord blood cells. We explored the function of differentially expressed genes in hiPSC-specific clusters defined by our novel tunable clustering algorithms (SMART and Bi-CoPaM). HiPSCs show reduced expression of c-KIT and key erythroid transcription factors SOX6, MYB and BCL11A, strong HBZ-induction, and aberrant expression of genes involved in protein degradation, lysosomal clearance and cell-cycle regulation. Conclusions: Together, these data suggest that hiPSC-derived cells may be specified to a primitive erythroid fate, and imply that definitive specification may more accurately reflect adult development.
We have therefore identified, for the first time, distinct gene expression dynamics during erythroblast differentiation from hiPSCs which may cause reduced proliferation and enucleation of hiPSC-derived erythroid cells. The data suggest several mechanistic defects which may partially explain the observed aberrant erythroid differentiation from hiPSCs. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3134-z) contains supplementary material, which is available to authorized users.
Background: Chemically inducible systems that provide both spatial and temporal control of gene expression are essential tools, with many applications in plant biology, yet they have not been extensively tested in monocotyledonous species. Results: Using Golden Gate modular cloning, we have created a monocot-optimized dexamethasone (DEX)-inducible pOp6/LhGR system and tested its efficacy in rice using the reporter enzyme β-glucuronidase (GUS). The system is tightly regulated and highly sensitive to DEX application, with 6 h of induction sufficient to induce high levels of GUS activity in transgenic callus. In seedlings, GUS activity was detectable in the root after in vitro application of just 0.01 μM DEX. However, transgenic plants manifested severe developmental perturbations when grown on higher concentrations of DEX. The direct cause of these growth defects is not known, but the rice genome contains sequences with high similarity to the LhGR target sequence lacO, suggesting non-specific activation of endogenous genes by DEX induction. These off-target effects can be minimized by quenching with isopropyl β-D-1-thiogalactopyranoside (IPTG). Conclusions: Our results demonstrate that the system is suitable for general use in rice, when the method of DEX application and relevant controls are tailored appropriately for each specific application.