2019
DOI: 10.1002/sim.8063

Sample size considerations and predictive performance of multinomial logistic prediction models

Abstract: Multinomial Logistic Regression (MLR) has been advocated for developing clinical prediction models that distinguish between three or more unordered outcomes. We present a full‐factorial simulation study to examine the predictive performance of MLR models in relation to the relative size of outcome categories, number of predictors and the number of events per variable. It is shown that MLR estimated by Maximum Likelihood yields overfitted prediction models in small to medium sized data. In most cases, the calib…

citations
Cited by 97 publications
(88 citation statements)
references
References 56 publications
“…To select best goodness of fit, we compared the Bayesian Information Criteria (BIC) of the main effect model and with interaction variables included; the main effect model had a lower BIC indicating better fit. To reduce the risk of overfitting, events per variable (EPV) was also calculated based on the smallest number of observations in the outcome categories divided by the number of effective regression coefficients (De Jong et al, 2019). Multivariable multinomial logistic regression analyses were then performed to determine whether the strength of data sharing policies can be predicted from the aforementioned factors.…”
Section: Discussion
mentioning
confidence: 99%
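The EPV calculation described in this statement (smallest outcome category size divided by the number of effective regression coefficients) can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function name `events_per_variable`, the category labels, the counts, and the coefficient count of 8 are all hypothetical, and how "effective regression coefficients" are counted is left to the caller.

```python
from collections import Counter


def events_per_variable(outcomes, n_coefficients):
    """Return EPV: size of the smallest outcome category divided by the
    number of effective regression coefficients (supplied by the caller)."""
    if n_coefficients <= 0:
        raise ValueError("n_coefficients must be positive")
    counts = Counter(outcomes)
    if len(counts) < 2:
        raise ValueError("need at least two outcome categories")
    return min(counts.values()) / n_coefficients


# Hypothetical data: three unordered outcome categories of unequal size.
outcomes = ["weak"] * 40 + ["moderate"] * 90 + ["strong"] * 120

# Assumed for illustration: 8 effective regression coefficients.
epv = events_per_variable(outcomes, n_coefficients=8)
print(epv)  # smallest category (40) / 8 coefficients
```

Taking the count as an argument keeps the sketch neutral about whether coefficients are counted per predictor or per predictor and outcome contrast.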
“…In this article we provide practical guidance for calculating the sample size required for the development of clinical prediction models, which builds on our recent methodology papers [13, 14, 15, 16, 18]. We suggest that current minimum sample size rules of thumb are too simplistic and outline a more scientific approach that tailors sample size requirements to the specific setting of interest.…”
mentioning
confidence: 99%
“…The authors took the investigation further by comparing the baseline category model to a logistic regression model fitted to dichotomized ordinal responses which demonstrated that the baseline category model was a superior fit. At least 50 multinomial events per variable was recommended leading to the MLR predictive performance gradually improving as the number of multinomial events per variable increases [22]. Our study results show that this could be the possible reason for the MLR model estimated by maximum likelihood being the most unlikely choice among the 3 predictive mechanisms.…”
Section: Application To Surgically-treated Cervical Cancer Patients
mentioning
confidence: 70%
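To make the quoted threshold concrete, the arithmetic can be inverted: given a target number of events per variable, how many observations must the smallest outcome category contain? The sketch below assumes EPV is defined as the smallest category size divided by the number of regression coefficients. The function name `min_smallest_category_size` and the coefficient count of 6 are hypothetical; the target of 50 is taken from the statement above, not verified against the cited paper.

```python
import math


def min_smallest_category_size(target_epv, n_coefficients):
    """Smallest-category size needed so that
    (smallest category size / n_coefficients) >= target_epv."""
    if target_epv <= 0 or n_coefficients <= 0:
        raise ValueError("target_epv and n_coefficients must be positive")
    return math.ceil(target_epv * n_coefficients)


# Assumed for illustration: 6 regression coefficients, target EPV of 50.
required = min_smallest_category_size(target_epv=50, n_coefficients=6)
print(required)  # 50 * 6 = 300 observations in the smallest category
```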