This research developed and tested a filter algorithm that serves to reduce the feature space in healthcare datasets. The algorithm binarizes the dataset, and then separately evaluates the risk ratio of each predictor with the response, and outputs ratios that represent the association between a predictor and the class attribute. The value of the association translates to the importance rank of the corresponding predictor in determining the outcome. Using Random Forest and Logistic regression classification, the performance of the developed algorithm was compared against the regsubsets and varImp functions, which are unsupervised methods of variable selection. Equally, the proposed algorithm was compared with the supervised Fisher score and Pearson’s correlation feature selection methods. Different datasets were used for the experiment, and, in the majority of the cases, the predictors selected by the new algorithm outperformed those selected by the existing algorithms. The proposed filter algorithm is therefore a reliable alternative for variable ranking in data mining classification tasks with a dichotomous response.
In this research, the logistic regression model was employed to develop a classifier that measures psychological capital of workers in organization. Psychological capital (PsyCap) is the positive state of an individual, comprising of self‐efficacy, optimism, hope, and resilience. Employees with high psychological capital contribute positively to objectives and business strategy of an organization. An experimental dataset comprising of the psychological capital information of 329 employees in an organization was used to fit a data mining classification model. To ensure model accuracy, 220 observations were used as training set, whereas 109 were set aside to validate the model. Various statistical tests for goodness of fit and predictive accuracy were deployed to test model performance. The model has the ability to classify an individual's psychological capital into either high or low class with a predictive accuracy of 93%. The classification model is expected to serve as a tool in human resource management when measuring psychological capital of employees during recruitment interviews and promotion appraisals.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.