2015
DOI: 10.3934/bdia.2016.1.129

On balancing between optimal and proportional categorical predictions

Abstract: A bias-variance dilemma in categorical data mining and analysis is the fact that a prediction method can aim at either maximizing the overall point-hit accuracy without constraint or with the constraint of minimizing the distribution bias. However, one can hardly achieve both at the same time. A scheme to balance these two prediction objectives is proposed in this article. An experiment with a real data set is conducted to demonstrate some of the scheme's characteristics. Some basic properties of the scheme ar…
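The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes a single categorical response with known category probabilities and compares two textbook prediction rules: always predicting the mode (the "optimal" rule for point-hit accuracy) and predicting each category with its own probability (the "proportional" rule, which reproduces the distribution on average). The function names and the example probabilities are hypothetical, and this is not the balancing scheme proposed in the article.

```python
from typing import Sequence


def modal_accuracy(p: Sequence[float]) -> float:
    """Expected hit rate when the most likely category is always predicted."""
    return max(p)


def proportional_accuracy(p: Sequence[float]) -> float:
    """Expected hit rate when category i is predicted with probability p[i]."""
    return sum(x * x for x in p)


def modal_predicted_distribution(p: Sequence[float]) -> list:
    """Distribution of predictions under the modal rule: all mass on the mode."""
    mode = max(range(len(p)), key=lambda i: p[i])
    return [1.0 if i == mode else 0.0 for i in range(len(p))]


# Hypothetical category probabilities, chosen only for illustration.
p = [0.6, 0.3, 0.1]
print("modal accuracy:       ", modal_accuracy(p))
print("proportional accuracy:", proportional_accuracy(p))
print("modal predictions:    ", modal_predicted_distribution(p))
```

Under these assumptions the modal rule never hits less often than the proportional rule, but its predictions collapse onto one category, while the proportional rule matches the category distribution at the cost of a lower hit rate.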

Cited by 4 publications (2 citation statements)
References 23 publications (25 reference statements)
“…The prediction accuracy naturally attracts most of the attention and has been studied for hundreds of years. Categorical data analysis alone has the rate of point-hit accuracy, of distribution bias and of the balanced one between them [9]. Huang, Shi and Wang [12] suggested that the measure of association is fundamental to obtain the prediction accuracy rate and that this measure will increase as more explanatory variables added in that probabilistic model [12].…”
mentioning
confidence: 99%
“…The choosing of a variance measure depends on the objective of the data analysis, the predictive model to be used, or even on the analyst's preference: the Gini, entropy and Chi-square are typical preferences. In this article, we choose the proportional and the modal prediction oriented variances, the GK-tau and GK-lambda, as in [11,12,13,8,18], for their statistical interpretability.…”
mentioning
confidence: 99%
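The two measures named in the statement above, GK-lambda (modal prediction) and GK-tau (proportional prediction), can be computed from a contingency table. The sketch below uses the standard Goodman-Kruskal definitions and assumes rows index the explanatory variable and columns index the response. The function names and the example table are hypothetical and are not taken from the cited papers.

```python
from typing import List


def gk_lambda(table: List[List[int]]) -> float:
    """Goodman-Kruskal lambda: relative gain in modal-prediction hits
    when the row (explanatory) category is known."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    best_marginal = max(col_totals)                    # hits without the row
    best_conditional = sum(max(row) for row in table)  # hits given the row
    # Undefined (division by zero) if all observations share one response.
    return (best_conditional - best_marginal) / (n - best_marginal)


def gk_tau(table: List[List[int]]) -> float:
    """Goodman-Kruskal tau: relative gain in proportional-prediction hits
    when the row (explanatory) category is known."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    marginal = sum(c * c for c in col_totals) / n      # expected hits, no row
    conditional = sum(
        sum(c * c for c in row) / sum(row) for row in table if sum(row) > 0
    )                                                  # expected hits, given row
    # Undefined (division by zero) if all observations share one response.
    return (conditional - marginal) / (n - marginal)


# Hypothetical 2x2 table of counts, for illustration only.
counts = [[30, 10],
          [10, 50]]
print("GK-lambda:", gk_lambda(counts))
print("GK-tau:   ", gk_tau(counts))
```

Both measures are 0 when knowing the row category does not improve the corresponding prediction rule and 1 when the row category determines the response exactly.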