This study applies response surface methodology (RSM) to the hyperparameter fine-tuning of three machine learning (ML) algorithms: artificial neural network (ANN), support vector machine (SVM), and deep belief network (DBN). The purpose is to demonstrate that RSM maintains ML algorithm performance while reducing the number of runs required to reach effective hyperparameter settings, compared with the commonly used grid search (GS). The ML algorithms are applied to a case study dataset from a food producer in Thailand. The objective is to predict the quality of a raw material, measured on a numerical scale. K-fold cross-validation is performed to ensure that the ML algorithm performance is robust to how the data are partitioned into training, validation, and testing sets. The mean absolute error (MAE) on the validation set is used as the measure of prediction accuracy. The reliability of the hyperparameter values from GS and RSM is evaluated using confirmation runs. Statistical analysis shows that (1) the prediction accuracy of the three ML algorithms tuned by GS and RSM is similar, (2) hyperparameter settings from GS are 80% reliable for ANN and DBN, whereas settings from RSM are 90% and 100% reliable for ANN and DBN, respectively, and (3) RSM reduces the number of runs required relative to GS by 97.79%, 97.81%, and 80.69% for ANN, SVM, and DBN, respectively.
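As a rough illustration of why RSM needs fewer runs than GS, the following minimal sketch builds a central composite design (CCD) in coded units, fits a second-order response surface by least squares, and solves for its stationary point. The abstract does not state which experimental design the study used, so the CCD, the function names, and the `toy_validation_mae` stand-in for a real training run are all assumptions made for illustration only.

```python
import itertools
import numpy as np

def grid_points(levels_per_factor):
    """Full grid search: every combination of the candidate levels."""
    return list(itertools.product(*levels_per_factor))

def central_composite_design(k, n_center=1):
    """Coded CCD: 2^k factorial corners, 2k axial points, n_center centre points."""
    alpha = (2 ** k) ** 0.25  # axial distance of a rotatable design
    corners = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.vstack([s * alpha * np.eye(k)[i]
                       for i in range(k) for s in (-1.0, 1.0)])
    center = np.zeros((n_center, k))
    return np.vstack([corners, axial, center])

def quadratic_features(X):
    """Columns: intercept, linear terms, squared terms, two-way interactions."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i + 1, k)]
    return np.column_stack(cols)

def toy_validation_mae(X):
    """Hypothetical stand-in for the validation MAE of a trained model."""
    return 1.0 + (X[:, 0] - 0.3) ** 2 + 0.5 * (X[:, 1] + 0.2) ** 2

# Two coded hyperparameters: 11 CCD runs versus a 10 x 10 grid of 100 runs.
design = central_composite_design(k=2, n_center=3)
grid = grid_points([range(10), range(10)])

# Fit the second-order model y = b0 + b'x + x'Bx by least squares.
y = toy_validation_mae(design)
beta, *_ = np.linalg.lstsq(quadratic_features(design), y, rcond=None)

# Stationary point of the fitted surface: gradient b + 2Bx = 0.
b = beta[1:3]
B = np.array([[beta[3], beta[5] / 2.0],
              [beta[5] / 2.0, beta[4]]])
x_star = -0.5 * np.linalg.solve(B, b)
```

In practice each design row would be decoded to actual hyperparameter values and evaluated by training the model under K-fold cross-validation. The run counts here belong to this toy setup and are not the savings reported in the study.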
This paper presents a stratified random sampling plan for estimating the accuracy of bill processing for health care bills submitted to third-party payers in health care systems. Bill processing accuracy is estimated with two measures: percent accuracy and total dollar accuracy. Difficulties in constructing a sampling plan arise when the population strata structure is unknown and when the two measures require different sampling schemes. To use sampling resources efficiently, the sampling plan is designed to estimate both measures from the same sample. The sampling plan features a simple but efficient strata construction method, called the rectangular method, and two accuracy estimation methods, one for each measure. The sampling plan is tested on actual populations from an insurance company. The accuracy estimates obtained are then used to compare the rectangular method with other potential clustering methods for strata construction, and to compare the accuracy estimation methods with other eligible methods. Computational study results show the effectiveness of the proposed sampling plan.
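The idea of estimating two measures from one stratified sample can be sketched with the standard textbook stratified estimators. This is not the paper's rectangular method or its estimation methods, which the abstract does not describe. The stratum names, population sizes, and sample values below are hypothetical.

```python
import random

def draw_stratified_sample(strata, n_per_stratum, seed=0):
    """Simple random sample without replacement inside each stratum."""
    rng = random.Random(seed)
    return {h: rng.sample(units, n_per_stratum[h]) for h, units in strata.items()}

def stratified_mean(stratum_sizes, samples):
    """Weighted mean: sum over strata of (N_h / N) * sample mean of stratum h."""
    total_size = sum(stratum_sizes.values())
    return sum(size / total_size * (sum(samples[h]) / len(samples[h]))
               for h, size in stratum_sizes.items())

def stratified_total(stratum_sizes, samples):
    """Expansion estimator: sum over strata of N_h * sample mean of stratum h."""
    return sum(size * (sum(samples[h]) / len(samples[h]))
               for h, size in stratum_sizes.items())

# Hypothetical population: 800 low-dollar bills and 200 high-dollar bills.
stratum_sizes = {"low": 800, "high": 200}

# One sample, two measurements per sampled bill (hypothetical values):
# a 0/1 flag for "processed correctly" and the dollars paid correctly.
correct_flags = {"low": [1, 1, 1, 0], "high": [1, 0]}
correct_dollars = {"low": [10.0, 20.0, 30.0, 40.0], "high": [500.0, 700.0]}

percent_accuracy = 100.0 * stratified_mean(stratum_sizes, correct_flags)
dollar_total = stratified_total(stratum_sizes, correct_dollars)

# Drawing which bills to audit, here from integer bill identifiers.
strata = {"low": list(range(800)), "high": list(range(800, 1000))}
audit = draw_stratified_sample(strata, {"low": 4, "high": 2})
```

Both estimates reuse the same sampled bills, which is the property the sampling plan aims for. How strata boundaries and sample sizes should be chosen is exactly what the paper's methods address and is left out of this sketch.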