Asxl1 loss–induced myeloid malignancies are mediated partially through impaired ASXL1 cohesin interaction and cohesin functions.
Stochastic recurrent neural networks with latent random variables of complex dependency structures have been shown to be more successful in modeling sequential data than deterministic deep models. However, the majority of existing methods have limited expressive power due to the Gaussian assumption on latent variables. In this paper, we advocate learning implicit latent representations using semi-implicit variational inference to further increase model flexibility. The semi-implicit stochastic recurrent neural network (SIS-RNN) is developed to enrich inferred model posteriors that may have no analytic density functions, as long as independent random samples can be generated via reparameterization. Extensive experiments in different tasks on real-world datasets show that SIS-RNN outperforms the existing methods. Introduction: Deep auto-regressive models, such as recurrent neural networks (RNNs), are widely used for modeling sequential data due to their effective representation of long-term dependencies [1, 2, 3, 4, ?, 5, 6]. It has been shown that inducing uncertainty in the hidden states of deep auto-regressive models can drastically improve their performance in many applications such as speech modeling, text generation, sequential image modeling and dynamic graph representation learning [7,8,9,10,11,12,13,14,15]. These methods integrate the variational auto-encoder (VAE) framework with deep auto-regressive models to infer stochastic latent variables, which can capture higher-level semantic abstraction (e.g. objects, speakers, or graph modules/communities) from the observed variables in a sequence (e.g. pixels, sound-waves, or partially observed dynamic graphs). Existing stochastic recurrent models, while having different encoder and decoder structures, have restricted expressive power due to the commonly adopted Gaussian assumption on prior and posterior distributions of latent variables.
The Gaussian assumption has a well-known issue of underestimating the variance of the posterior [16], which can be further amplified by mean-field variational inference (MFVI). This issue is often attributed to two key factors: 1) the mismatch between the restricted representation power of the variational family Q and the complexity of the posterior to be approximated by Q; 2) the use of KL divergence, which is an asymmetric measure of the distance between Q and the posterior [17,18]. In this paper, we break the Gaussian assumption and propose a semi-implicit stochastic recurrent neural network (SIS-RNN) that is capable of inferring implicit posteriors for sequential data while maintaining simple optimization. Inspired by semi-implicit variational inference (SIVI) [17], we impose a semi-implicit hierarchical construction on a backbone RNN to represent the posterior distribution of stochastic recurrent layers. SIVI enables a flexible (implicit) mixing distribution for variational inference of our proposed SIS-RNN. As a result, even if the marginal of the hierarchy is not tractable, its density can be evaluated by Monte Carlo estimation. Our p...
Background: Single-cell gene expression measurements offer opportunities in deriving mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based on such data still manifests significant uncertainty. Results: The goal of this paper is to develop optimal classification of single-cell trajectories accounting for potential model uncertainty. Partially-observed Boolean dynamical systems (POBDS) are used for modeling gene regulatory networks observed through noisy gene-expression data. We derive the exact optimal Bayesian classifier (OBC) for binary classification of single-cell trajectories. The application of the OBC becomes impractical for large GRNs, due to computational and memory requirements. To address this, we introduce a particle-based single-cell classification method that is highly scalable for large GRNs with much lower complexity than the optimal solution. Conclusion: The performance of the proposed particle-based method is demonstrated through numerical experiments using a POBDS model of the well-known T-cell large granular lymphocyte (T-LGL) leukemia network with noisy time-series gene-expression data.
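To make the idea of particle-based trajectory classification concrete, the following is a minimal sketch and not the authors' method. It assumes two fully known class models (so it does not average over model uncertainty as the OBC does), independent bit-flip process noise `p_noise`, and a bit-flip observation model `q_noise` in place of real gene-expression measurements. The toy networks `hold` and `toggle` and all function names are hypothetical. A bootstrap particle filter estimates the log-likelihood of an observed trajectory under each class, and the trajectory is assigned to the class with the higher estimate.

```python
import math
import random


def step(state, update, p_noise, rng):
    """Apply the Boolean update, then flip each gene independently with prob. p_noise."""
    return tuple(b ^ (rng.random() < p_noise) for b in update(state))


def observe(state, q_noise, rng):
    """Assumed observation model: each bit is reported wrongly with prob. q_noise."""
    return tuple(b ^ (rng.random() < q_noise) for b in state)


def simulate(update, n_genes, length, p_noise=0.05, q_noise=0.1, seed=0):
    """Draw one noisy observed trajectory from a Boolean network."""
    rng = random.Random(seed)
    state = tuple(rng.randint(0, 1) for _ in range(n_genes))
    traj = [observe(state, q_noise, rng)]
    for _ in range(length - 1):
        state = step(state, update, p_noise, rng)
        traj.append(observe(state, q_noise, rng))
    return traj


def obs_lik(obs, state, q_noise):
    """Likelihood of an observed bit vector given a hidden state."""
    lik = 1.0
    for o, s in zip(obs, state):
        lik *= q_noise if o != s else 1.0 - q_noise
    return lik


def particle_loglik(traj, update, n_genes, n_particles=500,
                    p_noise=0.05, q_noise=0.1, seed=0):
    """Bootstrap particle filter estimate of log p(trajectory | class model)."""
    rng = random.Random(seed)
    # Uniform prior over initial hidden states (an assumption of this sketch).
    particles = [tuple(rng.randint(0, 1) for _ in range(n_genes))
                 for _ in range(n_particles)]
    total = 0.0
    for t, obs in enumerate(traj):
        if t > 0:
            particles = [step(p, update, p_noise, rng) for p in particles]
        weights = [obs_lik(obs, p, q_noise) for p in particles]
        total += math.log(sum(weights) / n_particles)
        # Multinomial resampling in proportion to the observation weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return total


def classify(traj, updates, n_genes, **kwargs):
    """Return the index of the class model with the highest estimated log-likelihood."""
    scores = [particle_loglik(traj, u, n_genes, **kwargs) for u in updates]
    return max(range(len(scores)), key=scores.__getitem__)


def hold(state):
    """Hypothetical class-0 network: every gene keeps its value."""
    return tuple(state)


def toggle(state):
    """Hypothetical class-1 network: every gene inverts at each step."""
    return tuple(1 - b for b in state)


if __name__ == "__main__":
    trajectory = simulate(toggle, n_genes=3, length=20, seed=1)
    print("predicted class:", classify(trajectory, [hold, toggle], n_genes=3))
```

The cost of each filter pass grows with the number of particles and the trajectory length rather than with the 2^n size of the Boolean state space, which is the kind of saving the abstract refers to for large GRNs. Equal class priors are assumed here; unequal priors would add a log-prior term to each score.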
Next-generation sequencing (NGS) to profile temporal changes in living systems is gaining more attention for deriving better insights into the underlying biological mechanisms compared to traditional static sequencing experiments. Nonetheless, the majority of existing statistical tools for analyzing NGS data lack the capability of exploiting the richer information embedded in temporal data. Several recent tools have been developed to analyze such data but they typically impose strict model assumptions, such as smoothness on gene expression dynamic changes. To capture a broader range of gene expression dynamic patterns, we develop the gamma Markov negative binomial (GMNB) model that integrates a gamma Markov chain into a negative binomial distribution model, allowing flexible temporal variation in NGS count data. Using Bayes factors, GMNB enables more powerful
Background: Single-cell gene expression measurements offer opportunities in deriving mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based on such data still manifests significant uncertainty. Results: The goal of this paper is to develop optimal classification of single-cell trajectories accounting for potential model uncertainty. Partially-observed Boolean dynamical systems (POBDS) are used for modeling gene regulatory networks observed through noisy gene-expression data. We derive the exact optimal Bayesian classifier (OBC) for binary classification of single-cell trajectories. The application of the OBC becomes impractical for large GRNs, due to computational and memory requirements. To address this, we introduce a particle-based single-cell classification method that is highly scalable for large GRNs with much lower complexity than the optimal solution. Conclusion: The performance of the proposed particle-based method is demonstrated through numerical experiments using a POBDS model of the well-known T-cell large granular lymphocyte (T-LGL) leukemia network with noisy time-series gene-expression data. Electronic supplementary material: The online version of this article (10.1186/s12864-019-5720-3) contains supplementary material, which is available to authorized users.