Mixing time estimation in reversible Markov chains from a single sample path

Hsu, Daniel; Kontorovich, Aryeh; Levin, David A.; Peres, Yuval; Szepesvári, Csaba; Wolfer, Geoffrey

doi:10.1214/18-aap1457

Cited by 24 publications

(44 citation statements)

References 48 publications

“…Furthermore, this theorem allows $\gamma$ to approach 1 as $n$ increases, so long as $s \cdot \frac{1+\gamma}{1-\gamma} \cdot \frac{\log d}{n} \to 0$. Even though the spectral gap of the Markov chain is difficult to compute accurately in practice (Hsu, Kontorovich and Szepesvári, 2015), Theorem 1 also applies if one replaces $\gamma$ with an inaccurate overestimate $\gamma' \ge \gamma$.…”

Section: Assumptions and Theorems (mentioning)

confidence: 99%

Adaptive Huber regression on Markov-dependent data

Fan

Guo

Jiang

2022

Stochastic Processes and their Applications

High-dimensional linear regression has been intensively studied in the statistics community over the last two decades. For the convenience of theoretical analysis, classical methods usually assume independent observations and sub-Gaussian-tailed errors. However, neither assumption holds in many real high-dimensional time-series data sets. Recently [Sun, Zhou, Fan, 2019, J. Amer. Stat. Assoc., in press] proposed Adaptive Huber Regression (AHR) to address the issue of heavy-tailed errors. They found that the robustification parameter of the Huber loss should adapt to the sample size, the dimensionality, and the moments of the heavy-tailed errors. We progress in a vertical direction and justify AHR on dependent observations. Specifically, we consider an important dependence structure: Markov dependence. Our results show that Markov dependence affects both the adaptation of the robustification parameter and the estimation of the regression coefficients, in that the sample size should be discounted by a factor depending on the spectral gap of the underlying Markov chain.
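The discounting described in this abstract can be illustrated with a minimal sketch. It assumes the discount factor is $(1-\gamma)/(1+\gamma)$, read off from the condition quoted in the citation statement above, and that $\gamma \in [0, 1)$ measures the strength of the dependence; the function name `effective_sample_size` is hypothetical and is not taken from the cited paper.

```python
def effective_sample_size(n, gamma):
    """Discount n Markov-dependent samples by (1 - gamma) / (1 + gamma).

    Assumption: gamma in [0, 1) quantifies dependence (gamma = 0 means
    independent samples). The factor is inferred from the quoted condition
    s * (1 + gamma) / (1 - gamma) * log(d) / n -> 0, not from the paper's code.
    """
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return n * (1.0 - gamma) / (1.0 + gamma)


if __name__ == "__main__":
    # An overestimate gamma' >= gamma only shrinks the effective sample size,
    # so plugging it in is conservative.
    for g in (0.0, 0.5, 0.9):
        print(g, effective_sample_size(1000, g))
```

Under this assumption, replacing $\gamma$ with a larger value never inflates the effective sample size, which is consistent with the remark that an overestimate $\gamma' \ge \gamma$ may be used.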

“…Their method requires time $O(n + |\Omega|^3)$ and space $O(|\Omega|^2)$, and is hence mainly adapted to state spaces of moderate size. Our contribution is therefore complementary to [4], which is mainly concerned with statistical efficiency for moderate $\Omega$, while we are interested in computationally efficient approaches for large $\Omega$.…”

Section: Black-box Methods (mentioning)

confidence: 99%

“…One is not allowed to choose the starting state $X_0$. This is the model considered in [4]. Furthermore, since a sample path of length $n$ can be generated by $n$ calls to NextState, estimation in the USP model is more arduous than in the RTF model.…”

Section: The Unique Sample Path (USP) Model (mentioning)

confidence: 99%

“…To get an estimation error of $\epsilon$, UCPI requires at most $n_1 = O(\epsilon^{-\frac{2}{1-r}} |\Omega|^{\frac{r+1}{r-1}})$ samples, time $O(n_1)$ and memory $O((\ln n_1)^2)$. By comparison, the method of [4] requires at least $n_2 = O((\pi_m (1 - \lambda))^{-2})$ samples, with time $O(n_2 + |\Omega|^3)$ and memory $O(|\Omega|^2)$, where $\pi_m = \min_{x \in \Omega} \pi(x)$. It is noted that $(\pi_m)^{-1} > |\Omega|$, and that $\pi_m$ may be arbitrarily small, so that $n_2$ is not necessarily smaller than $n_1$.…”

Section: Computational Complexity (mentioning)

confidence: 99%

“…Finally, as shown by [4], in order to reach any fixed accuracy, $n$ should satisfy $n \ge O(1/\pi_m) \ge O(|\Omega|)$. So it is not possible to design an algorithm whose sample complexity is sub-linear in the state space size.…”

Section: Computational Complexity (mentioning)

confidence: 99%
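The comparison between $1/\pi_m$ and $|\Omega|$ used in the statements above follows from a one-line argument, sketched here under the only assumption that $\pi$ is a probability distribution on the finite set $\Omega$ and $\pi_m$ is its smallest entry:

```latex
\[
1 \;=\; \sum_{x \in \Omega} \pi(x)
  \;\ge\; |\Omega| \cdot \min_{x \in \Omega} \pi(x)
  \;=\; |\Omega| \, \pi_m
\quad\Longrightarrow\quad
\frac{1}{\pi_m} \;\ge\; |\Omega| .
\]
```

Equality holds only when $\pi$ is uniform on $\Omega$; for any other stationary distribution the inequality is strict.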

Computationally Efficient Estimation of the Spectral Gap of a Markov Chain

Combes

Touati

2019

Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems

We consider the problem of estimating from sample paths the absolute spectral gap $1 - \lambda$ of a reversible, irreducible and aperiodic Markov chain $(X_t)_{t \in \mathbb{N}}$ over a finite state space $\Omega$. We propose the UCPI (Upper Confidence Power Iteration) algorithm for this problem, a low-complexity algorithm which estimates the spectral gap in time $O(n)$ and memory space $O((\ln n)^2)$ given $n$ samples. This is in stark contrast with most known methods, which require at least memory space $O(|\Omega|)$, so that they cannot be applied to large state spaces. We also analyze how $n$ should scale to reach a target estimation error. Furthermore, UCPI is amenable to parallel implementation.

We make the following contributions:
(a) We propose UCPI (Upper Confidence Power Iteration), a computationally efficient algorithm to estimate the absolute spectral gap in time $O(n)$ and memory space $O((\ln n)^2)$ given $n$ samples. We analyze how $n$ should scale to reach a target estimation error in section 4.4.
(b) We prove that UCPI is consistent and analyze its convergence rate as a function of the number of samples.
(c) We show how UCPI is applicable to a broad set of assumptions, e.g. the case where a single sample path is available, the case where one can simulate transitions of the chain, etc.

Markov chains over large state spaces

Markov chains are typically encountered in the following setting. Consider $\pi$ the stationary distribution of $P$ and $f : \Omega \to [0, 1]$ some function. We would like to compute the quantity: If $\Omega$ is large, even if $\pi$ were known, summing over all elements $x \in \Omega$ might not be feasible, and one may instead use a simulation method by drawing sample paths of $(X_t)_t$ starting from some arbitrary initial distribution, since: So we may compute $Z$ by drawing many independent sample paths of length $k$ with $k$ large enough. However, to select the sample path length $k$ properly, one needs information about the mixing properties (i.e. the convergence
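To make the estimation task concrete, here is a minimal sketch of a naive plug-in estimate of the absolute spectral gap from a single sample path. It is not UCPI and not the estimator of the cited paper: it assumes a two-state chain (every irreducible two-state chain is reversible), for which the second eigenvalue of the transition matrix is $1 - p - q$, where $p$ and $q$ are the two switching probabilities. The function names and the parameters `p`, `q`, `seed` are hypothetical.

```python
import random


def simulate_two_state(p, q, n, seed=0):
    """Simulate a path of length n on {0, 1} with P(0->1) = p and P(1->0) = q."""
    rng = random.Random(seed)
    x = 0
    path = [x]
    for _ in range(n - 1):
        u = rng.random()
        if x == 0:
            x = 1 if u < p else 0
        else:
            x = 0 if u < q else 1
        path.append(x)
    return path


def plugin_absolute_gap(path):
    """Plug-in estimate of 1 - |lambda_2| from empirical transition frequencies."""
    counts = [[0, 0], [0, 0]]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    row0 = counts[0][0] + counts[0][1]
    row1 = counts[1][0] + counts[1][1]
    if row0 == 0 or row1 == 0:
        raise ValueError("a state has no observed outgoing transition")
    p_hat = counts[0][1] / row0
    q_hat = counts[1][0] / row1
    # For a two-state chain the eigenvalues are 1 and 1 - p - q.
    return 1.0 - abs(1.0 - p_hat - q_hat)


if __name__ == "__main__":
    path = simulate_two_state(p=0.3, q=0.2, n=200_000, seed=1)
    print(plugin_absolute_gap(path))  # true absolute gap is 1 - |1 - 0.5| = 0.5
```

For larger state spaces this approach needs the full empirical transition matrix and an eigenvalue computation, which is the memory and time cost that the abstract contrasts with UCPI.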

Information Geometry of Reversible Markov Chains

Wolfer

Watanabe

2021

Info. Geo.

Self Cite

We analyze the information geometric structure of time reversibility for parametric families of irreducible transition kernels of Markov chains. We define and characterize reversible exponential families of Markov kernels, and show that irreducible and reversible Markov kernels form both a mixture family and, perhaps surprisingly, an exponential family in the set of all stochastic kernels. We propose a parametrization of the entire manifold of reversible kernels, and inspect reversible geodesics. We define information projections onto the reversible manifold, and derive closed-form expressions for the e-projection and m-projection, along with Pythagorean identities with respect to information divergence, leading to some new notion of reversiblization of Markov kernels. We show the family of edge measures pertaining to irreducible and reversible kernels also forms an exponential family among distributions over pairs. We further explore geometric properties of the reversible family, by comparing them with other remarkable families of stochastic matrices. Finally, we show that reversible kernels are, in a sense we define, the minimal exponential family generated by the m-family of symmetric kernels, and the smallest mixture family that comprises the e-family of memoryless kernels.
