2020
DOI: 10.3150/19-bej1126

Prediction and estimation consistency of sparse multi-class penalized optimal scoring

Abstract: Sparse linear discriminant analysis via penalized optimal scoring is a successful tool for classification in high-dimensional settings. While the variable selection consistency of sparse optimal scoring has been established, the corresponding prediction and estimation consistency results have been lacking. We bridge this gap by providing probabilistic bounds on the out-of-sample prediction error and the estimation error of multi-class penalized optimal scoring, allowing for a diverging number of classes.

Cited by 7 publications (5 citation statements)
References 36 publications
“…, 1994) of multi-class discriminant analysis (Gaynanova et al., 2016; Gaynanova, 2020) for view $d$:
$$\operatorname*{minimize}_{\bm{W}_d \in \mathbb{R}^{p \times (K-1)}} \Big\lbrace \frac{1}{2n}\Vert \widetilde{\bm{Y}} - \bm{X}_d\bm{W}_d\Vert^2_F + \lambda \operatorname{Pen}(\bm{W}_d)\Big\rbrace,$$
where $\operatorname{Pen}(\bm{W}_d)$ is an optional penalty used to impose structural assumptions such as sparsity, and $\widetilde{\bm{Y}} \in \mathbb{R}^{n \times (K-1)}$ is the transformed class response. Let $\bm{Z} \in \mathbb{R}^{n \times K}$ be the class-indicator matrix, $n_k$ be the number of samples in class $k$, and $s_k = \sum_{i=1}^{k} n_i$.…”
Section: Proposed Methodology
confidence: 99%
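The optimization problem quoted above can be solved with standard first-order methods. The following is a minimal sketch, not the authors' implementation: it applies proximal gradient descent with a row-wise group-lasso choice of $\operatorname{Pen}(\bm{W}_d)$, which zeroes out whole variables across all $K-1$ score directions. The function names and the assumption that a transformed response `Y_tilde` is supplied are illustrative.

```python
import numpy as np

def prox_row_group_lasso(W, t):
    """Row-wise group soft-thresholding: shrinks each row's norm by t,
    zeroing out entire variables across all K-1 score directions."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(row_norms, 1e-12), 0.0)
    return W * scale

def penalized_optimal_scoring(X, Y_tilde, lam, n_iter=500):
    """Proximal gradient descent for
        (1/2n) ||Y_tilde - X W||_F^2 + lam * sum_j ||W_{j,:}||_2.
    X: n x p data matrix; Y_tilde: n x (K-1) transformed class response."""
    n, p = X.shape
    W = np.zeros((p, Y_tilde.shape[1]))
    step = n / np.linalg.norm(X, 2) ** 2  # 1/L, L = Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y_tilde) / n  # gradient of the smooth least-squares term
        W = prox_row_group_lasso(W - step * grad, step * lam)
    return W
```

In the cited papers, $\widetilde{\bm{Y}}$ is constructed from the class-indicator matrix $\bm{Z}$ and the class sizes $n_k$; any $n \times (K-1)$ matrix of centered, orthogonal score columns plays that role in the sketch above.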
“…Although estimation consistency has been established for LDA (Li and Jia, 2017; Gaynanova, 2020) and CCA (Gao et al., 2017), providing similar guarantees for JACA is not straightforward.…”
Section: Introduction
confidence: 99%
“…The new term $\log(n_1/n_2)$ converges to $\log(\pi_1/\pi_2)$ at the $n^{-1/2}$ rate as a consequence of Hoeffding's inequality; see, e.g., Lemma 11 in Gaynanova (2020).…”
Section: B Extension of Theorem 1 to Unequal Class Sizes
confidence: 98%
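The $n^{-1/2}$ rate quoted above is easy to check numerically. The sketch below is illustrative only (the class probability $\pi_1 = 0.3$ is an arbitrary choice): it draws class counts and tracks $|\log(n_1/n_2) - \log(\pi_1/\pi_2)|$ as $n$ grows; the scaled error $\sqrt{n} \cdot \mathrm{err}$ stays roughly constant, consistent with the rate.

```python
import numpy as np

rng = np.random.default_rng(0)
pi1 = 0.3                           # assumed class-1 probability (illustration only)
target = np.log(pi1 / (1.0 - pi1))  # log(pi_1 / pi_2)

for n in [10**2, 10**3, 10**4, 10**5]:
    n1 = rng.binomial(n, pi1, size=200)  # class-1 counts over 200 replicates
    err = np.abs(np.log(n1 / (n - n1)) - target).mean()
    print(f"n={n:>6}  mean error={err:.4f}  sqrt(n)*error={np.sqrt(n) * err:.3f}")
```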
“…This motivates the adoption of a least-squares approach to estimating $\beta_0$, based on the empirical counterpart of the objective function in (7), where an additional differential regularization term is introduced to incorporate information on the geometric domain $M$ and overcome the ill-posedness of the problem. In the multivariate setting, analogous least-squares formulations have also been adopted, for instance, in Mai, Zou and Yuan (2012) and Gaynanova (2020).…”
Section: Discriminant Analysis On the Parametrizing Linear Function S...
confidence: 99%
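For the multivariate least-squares formulation referenced in this passage (in the style of Mai, Zou and Yuan, 2012), a minimal two-class sketch follows: the labels are recoded numerically and a lasso-penalized least-squares fit recovers a sparse discriminant direction. The $\pm 1$ coding and the helper names are illustrative assumptions; a different two-valued coding amounts to rescaling the response, and hence the penalty level.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_discriminant(X, labels, lam, n_iter=500):
    """Sparse discriminant direction from a lasso least-squares fit.
    Minimizes (1/2n) ||y_c - X_c beta||^2 + lam ||beta||_1 with a +-1
    class coding (illustrative; other codings rescale y and hence lam)."""
    n, p = X.shape
    y = np.where(labels == labels.max(), 1.0, -1.0)
    Xc = X - X.mean(axis=0)  # center the predictors, so no intercept is needed
    yc = y - y.mean()
    beta = np.zeros(p)
    step = n / np.linalg.norm(Xc, 2) ** 2  # 1/L for the smooth least-squares term
    for _ in range(n_iter):
        grad = Xc.T @ (Xc @ beta - yc) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```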