2009
DOI: 10.1177/0013164409344533

The Effects of Initially Misclassified Data on the Effectiveness of Discriminant Function Analysis and Finite Mixture Modeling

Abstract: Classification procedures are common and useful in behavioral, educational, social, and managerial research. Supervised classification techniques such as discriminant function analysis assume training data are perfectly classified when estimating parameters or classifying. In contrast, unsupervised classification techniques such as finite mixture models (FMM) do not require, or even use if available, knowledge of group status to estimate parameters or classify. This study investigates the impact of two type…

Cited by 9 publications (13 citation statements)
References 34 publications
“…Indeed, several studies have shown that systematic misclassification in the two group case is detrimental to classification accuracy of traditional supervised classification methods (i.e., Discriminant analysis) (Lachenbruch, 1966, 1974, 1979; McLachlan, 1972; Chhikara and McKeon, 1984; Holden and Kelley, 2010). There is little research, however, investigating the impact of systematic training data misclassification when three true groups are present, or for newer classification and data mining techniques.…”
Section: Discussion Of Prediction Methods (mentioning)
confidence: 99%
“…As has been demonstrated previously, statistical classification methods are greatly affected by the characteristics of the data under study. Previous research indicates classification accuracy generally increases with increased sample size (Holden and Kelley, 2010; Holden et al., 2011; Pai et al., 2012), discrepancy in group size (Lei and Koehly, 2003; deCraen et al., 2006; Holden and Kelley, 2010; Holden et al., 2011), group separation (Blashfield, 1976; Lei and Koehly, 2003; Holden and Kelley, 2010; Holden et al., 2011), and number of variables used in the classification (Breckenridge, 2000). Assumption violations (Lei and Koehly, 2003; Rausch and Kelley, 2009), outliers and presence of multicollinearity (Pai et al., 2012) generally lead to decreased classification accuracy.…”
Section: Introduction (mentioning)
confidence: 96%
“…However, when the accuracy of the training data is in question, the accuracy of supervised classification methods is debatable. Authors have previously discussed the accuracy of classification with misclassified training data for discriminant function methods and LR (Chhikara & McKeon, 1984; Grayson, 1987; Holden & Kelley, 2010; Lachenbruch, 1966, 1974, 1979; Lei & Koehly, 2003; McLachlan, 1972); however, the topic of training data misclassification has yet to be studied for these newer classification methods. The results of the current study indicate that alternative methods of classification, particularly CART, may provide better classification accuracy, especially for complex models.…”
Section: Directions For Future Research (mentioning)
confidence: 99%
“…In the unequal scenario for two groups, the ratio was 75/25, while with three groups the unequal ratio was 60/20/20. Previous research (e.g., Holden and Kelley, 2010; Holden et al., 2011) found that unequal group sizes had an impact on performance of group prediction methods, and therefore was included in the current study.…”
Section: Methods (mentioning)
confidence: 99%