Generative Adversarial Networks (GANs) have obtained remarkable success in many unsupervised learning tasks, and clustering is unarguably an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering-specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon: GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
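The latent sampling described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_latent`, the Gaussian choice for the continuous part, the default `sigma`, and the dimensions used in the example are all assumptions made here for concreteness.

```python
import random

def sample_latent(n_clusters, cont_dim, sigma=0.1, rng=None):
    """Draw one latent vector made of a continuous part and a one-hot part.

    Assumptions (not stated in the abstract): the continuous part is
    zero-mean Gaussian with standard deviation `sigma`, and the cluster
    index is drawn uniformly at random.
    """
    rng = rng or random.Random()
    k = rng.randrange(n_clusters)  # hypothetical uniform cluster choice
    one_hot = [1.0 if i == k else 0.0 for i in range(n_clusters)]
    continuous = [rng.gauss(0.0, sigma) for _ in range(cont_dim)]
    # Concatenate: continuous coordinates first, one-hot coordinates last.
    return continuous + one_hot, k

# Example: 30 continuous dimensions plus a one-hot code over 10 clusters.
z, k = sample_latent(n_clusters=10, cont_dim=30, rng=random.Random(0))
```

A generator would take `z` as input, and the inverse network mentioned in the abstract would be trained to recover both parts of `z` from the generated sample.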
We consider finite state channels where the state of the channel is its previous output. We refer to these as POST (Previous Output is the STate) channels. We first focus on POST(α) channels. These channels have binary inputs and outputs, where the state determines if the channel behaves as a Z or an S channel, both with parameter α. We show that the non-feedback capacity of the POST(α) channel equals its feedback capacity, despite the memory of the channel. The proof of this surprising result is based on showing that the induced output distribution, when maximizing the directed information in the presence of feedback, can also be achieved by an input distribution that does not utilize the feedback. We show that this is a sufficient condition for the feedback capacity to equal the non-feedback capacity for any finite state channel. We show that the result carries over from the POST(α) channel to a binary POST channel where the previous output determines whether the current channel will be binary with parameters (a, b) or (b, a). Finally, we show that, in general, feedback may increase the capacity of a POST channel.
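A small simulation may help make the POST(α) channel concrete. The abstract does not fix the conventions, so the sketch below assumes the following: when the previous output is 0 the channel acts as a Z channel (input 0 is noiseless, input 1 is flipped with probability α), and when the previous output is 1 it acts as the mirrored S channel. The function name `post_alpha_channel` and the `initial_state` parameter are hypothetical.

```python
import random

def post_alpha_channel(inputs, alpha, initial_state=0, rng=None):
    """Pass a binary sequence through a POST(alpha) channel (assumed convention).

    The state is the previous output. Under the assumed convention, an
    input equal to the state is delivered noiselessly, and an input that
    differs from the state is flipped with probability `alpha`.
    """
    rng = rng or random.Random()
    state = initial_state
    outputs = []
    for x in inputs:
        if x == state:
            y = x  # noiseless branch of the Z (state 0) or S (state 1) channel
        else:
            y = x ^ 1 if rng.random() < alpha else x  # noisy branch
        outputs.append(y)
        state = y  # previous output becomes the next state
    return outputs

# Example run with a fixed seed.
ys = post_alpha_channel([1, 0, 1, 1, 0], alpha=0.3, rng=random.Random(1))
```

With `alpha=0` the channel is noiseless, and with `alpha=1` the output under this convention never leaves the initial state, which gives two quick sanity checks on the simulation.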
Background: Next Generation Sequencing technologies have revolutionized many fields in biology by reducing the time and cost required for sequencing. As a result, large amounts of sequencing data are being generated. A typical sequencing data file may occupy tens or even hundreds of gigabytes of disk space, prohibitively large for many users. This data consists of both the nucleotide sequences and per-base quality scores that indicate the level of confidence in the readout of these sequences. Quality scores account for about half of the required disk space in the commonly used FASTQ format (before compression), and therefore the compression of the quality scores can significantly reduce storage requirements and speed up analysis and transmission of sequencing data. Results: In this paper, we present a new scheme for the lossy compression of the quality scores, to address the problem of storage. Our framework allows the user to specify the rate (bits per quality score) prior to compression, independent of the data to be compressed. Our algorithm can work at any rate, unlike other lossy compression algorithms. We envisage our algorithm as being part of a more general compression scheme that works with the entire FASTQ file. Numerical experiments show that we can achieve a better mean squared error (MSE) for small rates (bits per quality score) than other lossy compression schemes. For the organism PhiX, whose assembled genome is known and assumed to be correct, we show that it is possible to achieve a significant reduction in size with little compromise in performance on downstream applications (e.g., alignment). Conclusions: QualComp is an open source software package, written in C and freely available for download at https://sourceforge.net/projects/qualcomp.
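The distortion measure named in the abstract above, mean squared error between original and reconstructed quality scores, can be illustrated with a short sketch. This is not part of QualComp: the helper names are hypothetical, the Phred+33 ASCII encoding is an assumption about the input FASTQ file, and the "reconstructed" scores in the example are made up by hand rather than produced by any compressor.

```python
def phred33_to_scores(quality_line):
    """Decode one FASTQ quality line, assuming Phred+33 ASCII encoding."""
    return [ord(c) - 33 for c in quality_line]

def mean_squared_error(original, reconstructed):
    """Average squared difference between two equal-length score lists."""
    if len(original) != len(reconstructed):
        raise ValueError("score lists must have the same length")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)

# Example: a four-base read and a hand-made lossy reconstruction of it.
original = phred33_to_scores("IIII")   # each 'I' decodes to score 40
reconstructed = [38, 42, 40, 40]       # illustrative values only
mse = mean_squared_error(original, reconstructed)
```

Comparing schemes at a fixed rate then amounts to computing this quantity for each scheme's reconstruction of the same set of reads.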
Opportunistic scheduling is a key mechanism for improving the performance of wireless systems. However, this mechanism requires that transmitters are aware of channel conditions (or CSI, Channel State Information) to the various possible receivers. CSI is not automatically available at the transmitters; rather, it has to be acquired. Acquiring CSI consumes resources, and only the remaining resources can be used for actual data transmissions. We explore the resulting trade-off between acquiring CSI and exploiting channel diversity to the various receivers. Specifically, we consider a system consisting of a transmitter and a fixed number of receivers/users. An infinite buffer is associated with each receiver, and packets arrive in this buffer according to some stochastic process with fixed intensity. We study the impact of limited channel information on the stability of the system. We characterize its stability region, and show that an adaptive queue length-based policy can achieve stability whenever doing so is possible. We formulate a Markov Decision Process problem to characterize this queue length-based policy. In certain specific and yet relevant cases, we explicitly compute the optimal policy. In the general case, we provide a scheduling policy that achieves a fixed fraction of the system's stability region. Scheduling with limited information is a problem that naturally arises in cognitive radio systems, and our results can be used in these systems.