2019
DOI: 10.1007/s00440-018-00896-9

The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square

Abstract: Logistic regression is used thousands of times a day to fit data, predict future outcomes, and assess the statistical significance of explanatory variables. When used for the purpose of statistical inference, logistic models produce p-values for the regression coefficients by using an approximation to the distribution of the likelihood-ratio test. Indeed, Wilks' theorem asserts that whenever we have a fixed number p of variables, twice the log-likelihood ratio (LLR) 2Λ is distributed as a χ²_k variable in the …
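For orientation, here is a minimal side-by-side of the two limits at stake, written under the abstract's notation (ℓ is the log-likelihood, k the number of coefficients tested); the rescaling factor α(κ) is characterized in the paper itself, and only its role is recorded here:

```latex
% Classical Wilks approximation: p fixed, n -> infinity
\[
  2\Lambda \;=\; 2\bigl\{\ell(\widehat{\beta}) - \ell(\widehat{\beta}_0)\bigr\}
  \;\xrightarrow{d}\; \chi^2_k .
\]
% The paper's high-dimensional regime: p/n -> kappa, 0 < kappa < 1/2
\[
  2\Lambda \;\xrightarrow{d}\; \alpha(\kappa)\,\chi^2_k ,
  \qquad \alpha(\kappa) > 1 ,
\]
% so classical chi-square quantiles are too small and the resulting
% p-values are anti-conservative.
```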

Cited by 95 publications (67 citation statements: 1 supporting, 66 mentioning, 0 contrasting)
References 57 publications

“…On the other hand, our analysis framework is based on a leave-one-out perturbation argument. This technique has been widely used to analyze high-dimensional problems with random designs, including but not limited to robust M-estimation [44,45], statistical inference for sparse regression [62], likelihood ratio test in logistic regression [108], phase synchronization [1,133], ranking from pairwise comparisons [30], community recovery [1], and covariance sketching [79]. In particular, this technique results in tight performance guarantees for the generalized power method [133], the spectral method [1,30], and convex programming approaches [30,44,108,133]; however, it has not been applied to analyze nonconvex optimization algorithms.…”
Section: Related Work (mentioning)
Confidence: 99%
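To make the leave-one-out idea concrete, here is a minimal, self-contained sketch (not code from any of the works cited above; the Gaussian design, the dimensions, and the plain Newton solver are all illustrative assumptions): the MLE computed on all n observations and the MLE computed with one observation deleted remain very close, which is the stability property these perturbation arguments exploit.

```python
import numpy as np

def logistic_mle(X, y, n_iter=50):
    """Unregularized logistic MLE via Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        grad = X.T @ (y - mu)                  # score vector
        W = mu * (1.0 - mu)                    # logistic variance weights
        H = X.T @ (X * W[:, None])             # observed information
        beta += np.linalg.solve(H, grad)
    return beta

rng = np.random.default_rng(0)
n, p = 2000, 100                               # kappa = p/n = 0.05 < 1/2
X = rng.standard_normal((n, p)) / np.sqrt(n)   # Gaussian design (assumption)
y = rng.integers(0, 2, size=n).astype(float)   # global null: labels independent of X

beta_full = logistic_mle(X, y)
beta_loo = logistic_mle(np.delete(X, 0, axis=0), np.delete(y, 0))  # drop one observation
print("||beta_full - beta_loo||_2 =", np.linalg.norm(beta_full - beta_loo))
```

The leave-one-out analyses referenced above turn this empirical closeness into high-probability bounds that hold uniformly over the deleted index.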
“…(108) holds with probability at least 1 − O(mn^{−10}) as long as m ≳ n log n. The ℓ₂ bound on x^0 − x^{0,(l)} …”
(mentioning)
Confidence: 99%
“…Finally, the leave-one-out arguments have been invoked to analyze other high-dimensional statistical inference problems, including robust M-estimators [EKBB+13, EK15] and maximum likelihood theory for logistic regression [SCC18]. In addition, [ZB17, CFMW17, AFWZ17] made use of the leave-one-out trick to derive entrywise perturbation bounds for eigenvectors resulting from certain spectral methods.…”
Section: Related Work (mentioning)
Confidence: 99%
“…Then Cover shows that as p and n grow large in such a way that p/n → κ, the data points asymptotically overlap (with probability tending to one) if κ < 1/2, whereas they are separated (also with probability tending to one) if κ > 1/2. In the former case, where the MLE exists, [17] refined Cover's result by calculating the limiting distribution of the MLE when the features x_i are Gaussian.…”
Section: Limitations (mentioning)
Confidence: 99%
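Cover's threshold is easy to probe numerically. The sketch below uses illustrative choices throughout (the LP feasibility test, n = 400, the two κ values, and 20 trials are assumptions, not taken from [17]) to check strict linear separability of randomly labeled Gaussian points on either side of κ = 1/2.

```python
import numpy as np
from scipy.optimize import linprog

def separable(X, y):
    """Is there a beta with y_i * x_i' beta >= 1 for all i? (LP feasibility)"""
    n, p = X.shape
    res = linprog(c=np.zeros(p),
                  A_ub=-(y[:, None] * X), b_ub=-np.ones(n),
                  bounds=[(None, None)] * p, method="highs")
    return res.status == 0          # 0 = feasible (separable), 2 = infeasible

rng = np.random.default_rng(1)
n = 400
for kappa in (0.3, 0.7):            # below / above the 1/2 threshold
    p = int(kappa * n)
    hits = sum(separable(rng.standard_normal((n, p)),
                         rng.choice([-1.0, 1.0], size=n))
               for _ in range(20))
    print(f"kappa = {kappa}: separable in {hits}/20 trials")
```

Under Cover's result one expects essentially all trials at κ = 0.7 to come out separable (so the MLE fails to exist) and essentially none at κ = 0.3.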
“…Hence, the results from [5,6] and [17] describe a phase transition in the existence of the MLE as the dimensionality parameter κ = p/n varies around the value 1/2. Therefore, a natural question is this:…”
Section: Limitations (mentioning)
Confidence: 99%