We observe a length-n sample generated by an unknown, stationary ergodic Markov process (model) over a finite alphabet A. Given any string w of symbols from A, we want estimates of the conditional probability distribution of symbols following w, as well as the stationary probability of w. Two distinct problems complicate estimation in this setting: (i) long memory, and (ii) slow mixing, which can happen even with only one bit of memory. Any consistent estimator in this setting can only converge pointwise over the class of all ergodic Markov models. Namely, given any estimator and any sample size n, the underlying model could be such that the estimator performs poorly on a sample of size n with high probability. But can we look at a length-n sample and identify whether an estimate is likely to be accurate? Since the memory is unknown a priori, a natural approach is to estimate a potentially coarser model with memory k_n = O(log n). As n grows, pointwise consistent estimates that hold eventually almost surely (e.a.s.) are known so long as the scaling of k_n is not superlogarithmic in n. Here, rather than e.a.s. convergence results, we want the best answers possible with a length-n sample. Combining results in universal compression with Aldous' coupling arguments, we obtain sufficient conditions on the length-n sample (even for slow mixing models) to identify when naive (i) estimates of the conditional probabilities and (ii) estimates related to the stationary probabilities are accurate, and we also bound the deviations of the naive estimates from the true values.
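As a rough illustration of what a "naive" estimate could look like here, the following Python sketch counts, for every context w of length k, how often each symbol follows w in the sample. It is only a minimal sketch under stated assumptions: the function names `memory_length` and `naive_estimates`, the base-2 logarithm, and the constant `alpha` are hypothetical choices made for the example, not taken from the abstract, and the code does not implement the accuracy conditions or deviation bounds the abstract refers to.

```python
import math
from collections import Counter, defaultdict


def memory_length(n, alpha=1.0):
    """Hypothetical choice of memory k_n = floor(alpha * log2(n)), at least 1.

    The abstract only says k_n = O(log n); the base of the logarithm and
    the constant alpha are assumptions made for this sketch.
    """
    return max(1, int(alpha * math.log2(n)))


def naive_estimates(sample, k):
    """Empirical (count-based) estimates from a single sample string.

    Returns two dictionaries keyed by length-k contexts w:
      cond[w][a] = fraction of occurrences of w that are followed by symbol a
      freq[w]    = fraction of positions whose preceding k symbols equal w
    """
    follow = defaultdict(Counter)
    for i in range(k, len(sample)):
        follow[sample[i - k:i]][sample[i]] += 1

    total = len(sample) - k
    cond = {}
    freq = {}
    for w, counts in follow.items():
        seen = sum(counts.values())
        cond[w] = {a: c / seen for a, c in counts.items()}
        freq[w] = seen / total
    return cond, freq


# Usage: a short binary sample with contexts of length 1.
cond, freq = naive_estimates("0101010101", 1)
```

The empirical frequencies in `freq` are only a stand-in for the stationary probabilities; for a slow mixing source they can be far from the true values, which is exactly the situation the abstract is concerned with.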
We consider estimation of binary channels with memory where the transition probabilities (channel parameters) from the input to output are determined by prior outputs (state of the channel). While the channel is unknown, we observe the joint input/output process of the channel: we have n i.i.d. input bits and their corresponding outputs. Motivated by applications related to the backplane channel, we want to estimate the channel parameters as well as the stationary probabilities for each state. Two distinct problems complicate estimation in this setting: (i) long memory, and (ii) slow mixing, which can happen even with only one bit of memory. In this setting, any consistent estimator can only converge pointwise over the model class. Namely, given any estimator and any sample size n, the underlying model could be such that the estimator performs poorly on a sample of size n with high probability. But can we look at a length-n sample and identify whether an estimate is likely to be accurate? Since the memory is unknown a priori, a natural approach, known to be consistent, is to estimate a potentially coarser model with memory k_n = α_n log n, where α_n is a function that grows as O(1). Note, however, that (i) the coarser model is estimated using only samples from the true model; and (ii) we want the best possible answers with a length-n sample, rather than just consistency. Combining results on universal compression and Aldous' coupling arguments, we obtain sufficient conditions (even for slow mixing models) to identify when naive (i) estimates of the channel parameters and (ii) estimates related to the stationary probabilities of the channel states are accurate, and we bound their deviations from the true values.
Community detection in a graph formalizes the intuitive task of grouping vertices into clusters such that vertices within a cluster are more tightly connected to each other than to vertices in other clusters. This paper approaches community detection by constructing Markov random walks on the graph and using the mixing properties of the walk to identify communities. We use coupling from the past as an algorithmic primitive to translate the mixing properties of the walk into the community structure of the graph. We analyze the performance of our algorithms on specific graph structures, including stochastic block models (SBM) and LFR random graphs.
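The intuition that a random walk mixes quickly inside a community and slowly across communities can be seen on a toy graph. The Python sketch below is not the coupling-from-the-past algorithm of the abstract; it simply propagates the distribution of a simple random walk for a few steps. The helper names `two_cliques` and `walk_distribution`, and the choice of two cliques joined by a single edge, are hypothetical and chosen only for illustration.

```python
def two_cliques(size):
    """Adjacency sets for two cliques of `size` vertices joined by one edge.

    Vertices 0..size-1 form the first clique, size..2*size-1 the second;
    the bridge connects vertex size-1 to vertex size.
    """
    adj = {u: set() for u in range(2 * size)}
    for base in (0, size):
        for u in range(base, base + size):
            for v in range(base, base + size):
                if u != v:
                    adj[u].add(v)
    adj[size - 1].add(size)
    adj[size].add(size - 1)
    return adj


def walk_distribution(adj, start, steps):
    """Distribution of a simple random walk after `steps` steps from `start`.

    At each step the walk moves to a uniformly chosen neighbour.
    """
    n = len(adj)
    dist = [0.0] * n
    dist[start] = 1.0
    for _ in range(steps):
        nxt = [0.0] * n
        for u, p in enumerate(dist):
            if p == 0.0:
                continue
            share = p / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        dist = nxt
    return dist


# Usage: probability mass left in the starting clique after three steps.
dist = walk_distribution(two_cliques(5), start=0, steps=3)
mass_in_start_clique = sum(dist[:5])
```

In this toy graph the stationary distribution puts half of its mass on each clique, so a short walk that still holds most of its mass in the starting clique signals a bottleneck between the two groups.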