2016
DOI: 10.1073/pnas.1616647113

Framework for making better predictions by directly estimating variables’ predictivity

Abstract: We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity that enables assessing variable sets for, preferably high, predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, due to its inflated bias for moderate sample size and its sensitivity to…

Citations: cited by 23 publications (37 citation statements)
References: 12 publications
“…It is tempting to discuss this in terms of ecology and the association between canopy N and LAI; however, it is more likely a statistical problem. An explanation is that statistically significant explanatory variables (e.g., spectral bands) that have an association with a target variable might not necessarily carry the most predictive power, and the most predictive variables are not necessarily the most significant ones [57], [58]. A key distinction that makes a variable significant or predictive lies in the properties of their underlying distribution.…”
Section: A Variable Selection Is Sensitive To Transformation and Scale
Citation type: mentioning
confidence: 99%
“…If − 1 variables are influential in the sense that any smaller subset of variables is less influential, then the removal of a variable to size − 2 will decrease the I-score. Thus, the I-score has a natural tendency to "peak" at variable set(s) that lead to high predictive power in the face of noisy variables under the current sample size [15]. For high-dimensional variable selection problem, one way to thin out the candidates, i.e., to reduce the search space is to apply the I-score to one explanatory variable at a time, and to focus on those which indicate strong marginal observable effects [14].…”
Section: Influential Score
Citation type: mentioning
confidence: 99%
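
The marginal screening step described in the statement above can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation: it assumes discrete explanatory variables and one commonly quoted normalized form of the I-score, I = Σ_j n_j² (Ȳ_j − Ȳ)² / (n s²), where the sum runs over the cells of the partition induced by the chosen variables. The function names `i_score` and `marginal_scores` and the toy data are hypothetical.

```python
from collections import defaultdict


def i_score(rows, y):
    """Assumed normalized I-score of the partition induced by `rows`.

    rows: sequence of tuples of discrete values (one tuple per observation)
    y:    sequence of numeric responses, same length as rows
    """
    n = len(y)
    y_bar = sum(y) / n
    var = sum((v - y_bar) ** 2 for v in y) / n  # population variance of y
    if var == 0:
        return 0.0  # constant response: nothing to explain
    cells = defaultdict(list)
    for key, v in zip(rows, y):
        cells[tuple(key)].append(v)  # group responses by partition cell
    total = 0.0
    for vals in cells.values():
        n_j = len(vals)
        total += n_j ** 2 * (sum(vals) / n_j - y_bar) ** 2
    return total / (n * var)


def marginal_scores(X, y):
    """Score each explanatory variable on its own (one column at a time)."""
    p = len(X[0])
    return [i_score([(row[k],) for row in X], y) for k in range(p)]


# Toy data (hypothetical): column 0 determines y, column 1 is unrelated to y.
X = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
y = [0, 0, 1, 1] * 5

scores = marginal_scores(X, y)   # one score per column
joint = i_score(X, y)            # score of both columns taken together
```

On this toy data the informative column scores higher alone than the two columns do jointly, which matches the quoted observation that the score tends to drop when a noisy variable is added to an influential set.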
“…It has also been shown, from a predictive modeling perspective, that test‐derived statistics are not reliable measures, even though they may have adequate power in detecting departures from independence (Lo et al, ). On the other hand, PR's I has been shown to correlate well with the actual extent of association between X variables and Y (in this case, class labels) (Lo, Chernoff, Zheng, & Lo, ), with the true association parameterized by how well X can predict Y.…”
Section: Measures Of Association
Citation type: mentioning
confidence: 99%