2017
DOI: 10.1111/coin.12123

Logistic regression in large rare events and imbalanced data: A performance comparison of prior correction and weighting methods

Abstract: The purpose of this study is to use the truncated Newton method in prior correction logistic regression (LR). A regularization term is added to prior correction LR to improve its performance, which results in the truncated‐regularized prior correction algorithm. The performance of this algorithm is compared with that of weighted LR and the regular LR methods for large imbalanced binary class data sets. The results, based on the KDD99 intrusion detection data set, and 6 other data sets at both the prior correct…
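
The abstract names prior correction and weighting without giving their formulas. As a rough illustration only, the sketch below uses the standard textbook forms of the two adjustments for a sample drawn by class (for example, after undersampling the majority class). It assumes these are the forms the paper builds on; the abstract does not confirm this. The function names and the symbols `tau` (population fraction of events) and `ybar` (sample fraction of events) are illustrative choices, and the truncated Newton solver and the regularization term are not shown.

```python
import math


def prior_corrected_intercept(b0_hat, tau, ybar):
    """Shift a fitted LR intercept from the sample event rate to the population rate.

    Assumed textbook form: b0 = b0_hat - ln[((1 - tau) / tau) * (ybar / (1 - ybar))].
    tau  : assumed-known fraction of events (class 1) in the population
    ybar : fraction of events in the sample used for fitting
    """
    if not (0.0 < tau < 1.0 and 0.0 < ybar < 1.0):
        raise ValueError("tau and ybar must lie strictly between 0 and 1")
    return b0_hat - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))


def case_weights(labels, tau):
    """Per-observation weights for weighted LR under the same assumptions.

    Events get tau / ybar and non-events get (1 - tau) / (1 - ybar),
    so the weights sum to the number of observations.
    """
    ybar = sum(labels) / len(labels)
    if not 0.0 < ybar < 1.0:
        raise ValueError("labels must contain both classes")
    w1 = tau / ybar
    w0 = (1.0 - tau) / (1.0 - ybar)
    return [w1 if y == 1 else w0 for y in labels]
```

When the sample event rate equals the population rate, the intercept is left unchanged and every weight equals 1, so both adjustments reduce to regular LR in that case.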

Cited by 32 publications
(18 citation statements)
References 38 publications
“…Although including intraoperative adverse events and conversion to open surgery may improve the accuracy of the prediction models, such models would not be useful in the preoperative assessment for patients or for case mix comparisons. Furthermore, because the algorithms try to minimize the total error rate out of all classes, irrespective of which class the errors come from, they are not appropriate for imbalanced data, such as what we used in our study [44].…”
Section: Discussion (mentioning)
confidence: 99%
“…They determined that their RankRC method was able to outperform several SVM methods and was more efficient with processing speed and space required. Maalouf et al [44] present a truncated Newton method in prior correction logistic regression (LR) including an additional regularization term to improve performance. They also employ the KDD Cup 1999 dataset, along with six others.…”
Section: Related Work (mentioning)
confidence: 99%
“…Indeed, the higher-order moments and the variance of heavy-tailed distributions are not well-defined, and statistical methods based on assumptions of bounded variance leads to biased estimates. The literature on heavy-tail regression problem has developed methods based on prior correction or weighing data points [28,29]. However, most regression methods show limited performance in learning non-linear decision boundaries and underpredict high-selling books.…”
Section: Learning Algorithms (mentioning)
confidence: 99%