We exhibit an approximate equivalence between the Lasso estimator and the Dantzig selector. For both methods we derive parallel oracle inequalities for the prediction risk in the general nonparametric regression model, as well as bounds on the $\ell_p$ estimation loss for $1\le p\le 2$ in the linear model when the number of variables can be much larger than the sample size.
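For concreteness, the two estimators being compared can be stated as follows; the normalization and the common tuning parameter $\lambda$ shown here are illustrative conventions, not necessarily the paper's exact ones:

$$\hat\beta^{\mathrm{Lasso}} = \arg\min_{\beta \in \mathbb{R}^p} \frac{1}{n}\|y - X\beta\|_2^2 + 2\lambda\|\beta\|_1, \qquad \hat\beta^{\mathrm{Dantzig}} = \arg\min\Big\{ \|\beta\|_1 : \tfrac{1}{n}\big\|X^\top (y - X\beta)\big\|_\infty \le \lambda \Big\}.$$

The approximate equivalence is already plausible from these definitions: the KKT conditions of the Lasso force $\tfrac{1}{n}\|X^\top(y - X\hat\beta^{\mathrm{Lasso}})\|_\infty \le \lambda$, so the Lasso solution is always feasible for the Dantzig program at the same $\lambda$.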
convergence to the limit is not uniform. Furthermore, bootstrap and even subsampling techniques are plagued by the noncontinuity of limiting distributions. Nevertheless, in the low-dimensional setting, a modified bootstrap scheme has been proposed; [13] and [14] have recently proposed a residual-based bootstrap scheme. They provide consistency guarantees for the high-dimensional setting; we consider this method in an empirical analysis in Section 4.

Some approaches for quantifying uncertainty include the following. The work in [50] implicitly contains the idea of sample splitting and the corresponding construction of p-values and confidence intervals; the procedure has been improved by using multiple sample splitting and aggregation of dependent p-values from multiple sample splits [32]. Stability selection [31] and its modification [41] provide another route to estimate error measures for false positive selections in general high-dimensional settings. An alternative method for obtaining confidence sets appears in the recent work [29]. From another, mainly theoretical, perspective, the work in [24] presents necessary and sufficient conditions for recovery with the lasso $\hat\beta$ in terms of $\|\hat\beta - \beta^0\|_\infty$, where $\beta^0$ denotes the true parameter: bounds on the latter, which hold with probability at least, say, $1-\alpha$, could in principle be used to construct (very) conservative confidence regions. At a theoretical level, the paper [35] derives confidence intervals in $\ell_2$ for the case of two possible sparsity levels. Other recent work is discussed in Section 1.1 below.

We propose here a method which enjoys optimality properties when making assumptions on the sparsity and design matrix of the model. For a linear model, the procedure is the same as the one in [52] and closely related to the method in [23]. It is based on the lasso and "inverts" the corresponding KKT conditions. This yields a nonsparse estimator which has a Gaussian (limiting) distribution. We show, within a sparse linear model setting, that the estimator is optimal in the sense that it reaches the semiparametric efficiency bound. The procedure can be used, and is analyzed, for high-dimensional sparse linear and generalized linear models and for regression problems with general convex (robust) loss functions.

1.1. Related work. Our work is closest to [52], who proposed the semiparametric approach for distributional inference in a high-dimensional linear model. We take here a slightly different viewpoint, namely inverting the KKT conditions from the lasso, whereas relaxed projections are used in [52]. Furthermore, our paper extends the results in [52] by: (i) treating generalized linear models and general convex loss functions; and (ii) for linear models, giving conditions under which the procedure achieves the semiparametric efficiency bound, with an analysis that allows for rather general Gaussian, sub-Gaussian and bounded designs. A related approach to the one in [52] was proposed in [8], based on ridge regression, which is clearly suboptimal and inefficient...
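A minimal numerical sketch of this KKT-inversion construction may help fix ideas: an initial lasso fit is corrected by a one-step term built from nodewise lasso regressions, a standard way to form a relaxed inverse $\hat\Theta$ of the Gram matrix. The sklearn-based implementation, function names, and tuning defaults below are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of a de-sparsified ("KKT-inverted") lasso; tuning choices are
# illustrative rate-based defaults, not the paper's exact calibration.
import numpy as np
from sklearn.linear_model import Lasso

def desparsified_lasso(X, y, lam=None, lam_node=None):
    n, p = X.shape
    lam = lam if lam is not None else np.sqrt(np.log(p) / n)
    lam_node = lam_node if lam_node is not None else lam
    beta = Lasso(alpha=lam).fit(X, y).coef_      # initial (sparse) lasso fit

    # Nodewise lasso: regress each column on the others to build a relaxed
    # inverse Theta of the Gram matrix Sigma_hat = X^T X / n.
    Theta = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)
        gamma = Lasso(alpha=lam_node).fit(X[:, others], X[:, j]).coef_
        resid = X[:, j] - X[:, others] @ gamma
        tau2 = resid @ X[:, j] / n               # tau_j^2 normalization
        row = np.zeros(p)
        row[j] = 1.0
        row[others] = -gamma
        Theta[j] = row / tau2

    # One-step correction: b = beta + Theta X^T (y - X beta) / n.
    # b is non-sparse and, under suitable conditions, asymptotically Gaussian.
    return beta + Theta @ X.T @ (y - X @ beta) / n
```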
We argue that, due to the curse of dimensionality, there are major difficulties with any pure or smoothed likelihood-based method of inference in designed studies with randomly missing data when missingness depends on a high-dimensional vector of variables. We study in detail a semi-parametric superpopulation version of continuously stratified random sampling. We show that all estimators of the population mean that are uniformly consistent or that achieve an algebraic rate of convergence, no matter how slow, require the use of the selection (randomization) probabilities. We argue that, in contrast to likelihood methods which ignore these probabilities, inverse selection probability weighted estimators continue to perform well, achieving uniform $n^{1/2}$-rates of convergence. We propose a curse-of-dimensionality-appropriate (CODA) asymptotic theory for inference in non- and semi-parametric models in an attempt to formalize our arguments. We discuss whether our results constitute a fatal blow to the likelihood principle and study the attitude toward them that a committed subjective Bayesian would adopt. Finally, we apply our CODA theory to analyse the effect of the 'curse of dimensionality' in several interesting semi-parametric models, including a model for a two-armed randomized trial with randomization probabilities depending on a vector of continuous pretreatment covariates X. We provide substantive settings under which a subjective Bayesian would ignore the randomization probabilities in analysing the trial data. We then show that any statistician who ignores the randomization probabilities is unable to construct nominal 95 per cent confidence intervals for the true treatment effect that have both: (i) an expected length which goes to zero with increasing sample size; and (ii) a guaranteed expected actual coverage rate of at least 95 per cent over the ensemble of trials analysed by the statistician during his or her lifetime. However, we derive a new interval estimator, depending on the randomization probabilities, that satisfies (i) and (ii).
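To make the contrast concrete, here is a minimal sketch of the inverse-selection-probability-weighted (Horvitz-Thompson) estimator of the population mean, which uses exactly the known selection probabilities that likelihood-based methods ignore. The function name and interface are hypothetical illustrations.

```python
# Minimal sketch of the inverse-selection-probability-weighted estimator,
# assuming the selection probabilities pi(X_i) > 0 are known by design.
import numpy as np

def ipw_mean(y, observed, pi_x):
    """y: outcomes (arbitrary values where missing); observed: 0/1 selection
    indicators R_i; pi_x: known selection probabilities pi(X_i)."""
    y_safe = np.where(observed == 1, y, 0.0)     # unobserved terms contribute 0
    # Each observed outcome is weighted by 1/pi(X_i). Since
    # E[R Y / pi(X)] = E[Y], the average is unbiased for the population mean.
    return np.mean(observed * y_safe / pi_x)
```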
We consider hidden Markov models indexed by a binary tree, where the hidden state space is a general metric space. We study the maximum likelihood estimator (MLE) of the model parameters based only on the observed variables. In both stationary and non-stationary regimes, we prove strong consistency and asymptotic normality of the MLE under standard assumptions. These standard assumptions imply uniform exponential memorylessness properties of the initial distribution conditional on the observations. The proofs rely on ergodic theorems for Markov chains indexed by trees with neighborhood-dependent functions.
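As a point of reference for the likelihood that the MLE maximizes, here is a minimal sketch of the standard upward (leaves-to-root) pruning recursion on a binary tree. We assume a finite hidden state space for illustration (the paper allows a general metric space), a heap-style array layout in which node i has children 2i+1 and 2i+2, and hypothetical argument names.

```python
# Sketch of the observed-data log-likelihood for a tree-indexed HMM with a
# FINITE hidden state space (an illustrative simplification of the model).
import numpy as np

def tree_hmm_loglik(obs_lik, trans, init):
    """obs_lik: (num_nodes, K) emission likelihoods p(y_v | x_v = k);
    trans: (K, K) child-given-parent kernel trans[k, l] = p(x_child = l | x_parent = k);
    init: (K,) root distribution. Returns log p(y)."""
    n, K = obs_lik.shape
    up = np.zeros((n, K))            # up[v, k] = p(obs in subtree of v | x_v = k)
    for v in range(n - 1, -1, -1):   # heap order: children visited before parents
        up[v] = obs_lik[v].copy()
        for c in (2 * v + 1, 2 * v + 2):
            if c < n:                # marginalize the child's hidden state
                up[v] *= trans @ up[c]
    return np.log(init @ up[0])      # integrate out the root state
```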