Abstract: In many mixture-process experiments, restricted randomization occurs, and split-plot designs are commonly employed to handle these situations. The objective of this study was to obtain an optimal split-plot design for performing a mixture-process experiment. A split-plot design combining a simplex centroid design for three mixture components with a 2² factorial design for the process factors was assumed. Two alternative arrangements of design points in a split-plot design were compared. Design-Expert® version 10 software was used to construct I- and D-optimal split-plot designs. The A-, D-, and E-optimality criteria were used to compare the efficiency of the constructed designs, and fraction-of-design-space plots were used to evaluate the prediction properties of the two designs. The arrangement with more subplots than whole plots was found to be more efficient and to give more precise parameter estimates in terms of the A-, D-, and E-optimality criteria. The I-optimal split-plot design was preferred because it offered better prediction properties and precise estimation of the coefficients. We thus recommend the use of split-plot designs in experiments involving mixture formulations to measure the interaction effects of both the mixture components and the processing conditions. Where precision of the results for the mixtures is the priority, and where there are more mixture blends than sets of process conditions, we recommend that the mixture experiment be set up at each point of a factorial design. Where the interest is in the prediction aspects of the system, we recommend the I-optimal split-plot design, since it has low prediction variance over much of the design space and also gives reasonably precise parameter estimates.
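As a rough illustration of how the A-, D-, and E-criteria named in this abstract can be computed, the Python sketch below evaluates them for a toy split-plot layout. It assumes the usual generalized least squares information matrix M = X'V^{-1}X with V = I + eta * ZZ', where eta is the ratio of whole-plot to subplot error variance. The function name `optimality_criteria`, the toy 2² model matrix, and the value eta = 1 are hypothetical and are not taken from the study, which used Design-Expert.

```python
import numpy as np

def optimality_criteria(X, Z, eta=1.0):
    """Return D-, A-, and E-criterion values for a split-plot design.

    X   : n x p model matrix (hypothetical toy model here)
    Z   : n x b indicator matrix assigning runs to whole plots
    eta : assumed ratio of whole-plot to subplot error variance
    """
    n = X.shape[0]
    V = np.eye(n) + eta * Z @ Z.T          # error covariance, up to a scale factor
    M = X.T @ np.linalg.solve(V, X)        # information matrix X' V^{-1} X
    eig = np.linalg.eigvalsh(M)            # eigenvalues of the symmetric matrix M
    return {
        "D": float(np.prod(eig)),          # det(M): larger is better
        "A": float(np.sum(1.0 / eig)),     # trace(M^{-1}): smaller is better
        "E": float(eig.min()),             # smallest eigenvalue: larger is better
    }

# Toy layout: 4 runs in 2 whole plots; columns are intercept,
# a whole-plot factor, and a subplot factor (all values illustrative).
X = np.array([[1, -1, -1],
              [1, -1,  1],
              [1,  1, -1],
              [1,  1,  1]], dtype=float)
Z = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]], dtype=float)

print(optimality_criteria(X, Z, eta=1.0))
```

Two candidate arrangements can be compared by building `X` and `Z` for each and calling the function with the same `eta`; a larger D or E value, or a smaller A value, points to more precise parameter estimates under these assumptions.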
The aim of this paper was to apply Principal Component Analysis (PCA) and a hierarchical regression model to Kenyan macroeconomic variables. The study adopted a mixed research design (descriptive and correlational). Data on 18 macroeconomic variables were extracted from the Kenya National Bureau of Statistics and the World Bank for the period 1970 to 2019. R software was used for all the data analysis. PCA was used to reduce the dimensionality of the data, with the original data matrix reduced to eigenvectors and eigenvalues. A hierarchical regression model was fitted on the extracted components, and R² was used to determine whether the components were a good fit for predicting economic growth. The results showed that the first component explained 73.605% of the overall variance and was highly correlated with 15 of the original variables. The second principal component explained approximately 10.03% of the total variance, with two variables loading strongly and positively on it. About 6.22% of the overall variance was explained by the third component, which was highly correlated with only one of the original variables. The first, second, and third models had F statistics of 2385.689, 1208.99, and 920.737, respectively, each with a p-value of 0.0001 (below the 5% level), implying that the models were significant. The third model had the lowest mean square error, 17.296, and was therefore described as the best predictive model. Since component 1 had the highest variance explained, and model 1 had a lower p-value than the other models, principal component 1 was more reliable in explaining economic growth. It was therefore concluded that the macroeconomic variables associated with the monetary economy, the trade and openness of the economy with government activities, the consumption factor of the economy, and the investment factor of the economy predict economic growth in Kenya.
The study recommends using PCA when dealing with more than 15 variables and using the hierarchical regression model-building technique to determine the partial variance change among the independent variables in regression modeling.
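The workflow described in this abstract (eigen-decomposition of the data, then regression models that add one component at a time) can be sketched as follows. This is a minimal Python illustration on synthetic data, not the authors' R analysis: the function names `pca_components` and `r_squared`, the random data, and the choice to standardize the variables before the decomposition are all assumptions.

```python
import numpy as np

def pca_components(data):
    """PCA via eigen-decomposition of the correlation matrix.

    Returns the proportion of variance explained by each component
    (largest first) and the component scores.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # standardize
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)       # returned in ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = z @ eigvecs
    return explained, scores

def r_squared(scores, y, k):
    """R^2 of an ordinary least squares fit of y on the first k components."""
    design = np.column_stack([np.ones(len(y)), scores[:, :k]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))

# Synthetic stand-in for the macroeconomic data (purely illustrative).
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 5))
y = data[:, 0] + 0.5 * data[:, 1] + rng.normal(scale=0.1, size=50)

explained, scores = pca_components(data)
print("proportion of variance explained:", np.round(explained, 3))
for k in (1, 2, 3):  # hierarchical step: add one component per model
    print(f"model {k}: R^2 = {r_squared(scores, y, k):.3f}")
```

The change in R² from one model to the next is the quantity a hierarchical regression inspects to judge what each added component contributes.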
Financial institutions hold a large amount of data on their borrowers, which can be used to predict the probability that a borrower will default on a loan. Models that have been used to predict individual loan defaults include linear discriminant analysis models and extreme value theory models. These models are parametric, since they assume that the response being investigated takes a particular functional form. However, the functional form used to estimate the response may be very different from the actual functional form of the response. The purpose of this research was to analyze individual loan defaults in Kenya using the logistic regression model. The data were obtained from Equity Bank of Kenya for the period 2006 to 2016. A random sample of 1000 loan applicants whose loans had been approved by Equity Bank of Kenya during this period was drawn. The data covered credit history, purpose of the loan, loan amount, nature of the savings account, employment status, sex of the applicant, age of the applicant, security used when acquiring the loan, and the applicant's area of residence (rural or urban). The study employed a quantitative research design, since it deals with individual loan defaults as group characteristics of borrowers. The data were pre-processed in R (with a random seed set) and then split into a training data set and a test data set. The training data were used to fit the logistic regression model using a supervised machine learning approach. R statistical software was used for the analysis. The test data set was used to cross-validate the fitted logistic model, which was then used to predict individual loan defaults. The study focused on the analysis of individual loan defaults in Kenya using the logistic regression model in machine learning.
On the training data set, the logistic regression model correctly predicted 303 defaults and 122 non-defaults and misclassified 56 and 69 loans. The model had an accuracy of 0.7727 on the training data and 0.7333 on the test data, and a precision of 0.8440 and 0.8244 on the training and test data, respectively. The performance of the model on both data sets was illustrated by plotting training errors and test errors against sample size on the same axes. The plot showed that the performance of the model improves as the sample size increases. The study recommended the use of logistic regression with a supervised machine learning approach for loan default prediction in financial institutions, and further research on ensemble methods for loan default prediction in order to increase prediction accuracy.
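The reported training-set accuracy and precision can be reproduced from the four counts given in this abstract. The short Python check below assumes that "default" is the positive class and that the 56 misclassified loans are false positives and the 69 are false negatives; that labeling is an inference from the figures, not something the abstract states.

```python
# Counts from the abstract (training data); the TP/TN/FP/FN labels are assumed.
tp = 303  # defaults predicted as defaults
tn = 122  # non-defaults predicted as non-defaults
fp = 56   # non-defaults predicted as defaults (assumed)
fn = 69   # defaults predicted as non-defaults (assumed)

total = tp + tn + fp + fn
accuracy = (tp + tn) / total   # share of all loans classified correctly
precision = tp / (tp + fp)     # share of predicted defaults that were defaults

print(f"n = {total}, accuracy = {accuracy:.4f}, precision = {precision:.4f}")
# prints: n = 550, accuracy = 0.7727, precision = 0.8440
```

Under this labeling both values match the abstract, which suggests 550 of the 1000 sampled loans were used for training.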
Abstract: This study sought to estimate the finite population total using spline functions. The patterns emerging from the spline smoother were compared with those obtained from the model-based, the model-assisted, and the non-parametric estimators. To measure the performance of each estimator, three aspects were considered: the average bias, the efficiency (measured by the average mean square error), and the robustness (measured by the rate of change of efficiency). We used six populations: four natural and two simulated. The findings showed that the model-based estimator works very well in terms of efficiency, while the model-assisted estimator is almost unbiased when the model is linear and homoscedastic. However, these estimators break down when the underlying model assumptions are violated. The kernel estimator (Nadaraya-Watson) was found to be the most robust of the five estimators considered. Of the two spline functions considered, the periodic spline performed better. The spline functions provided good results whether or not the design points were uniformly spaced. We also found that, under certain conditions, a smoothing spline estimator and a kernel estimator are equivalent. The study recommends that both the ratio estimator and the local polynomial estimator be used only within the confines of a linear homoscedastic model. The Nadaraya-Watson and the periodic spline estimators, both of which are non-parametric, are highly robust; of the two, the Nadaraya-Watson estimator is the more robust.
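To make the kernel approach in this abstract concrete, the Python sketch below estimates a finite population total by adding the observed sample values to Nadaraya-Watson predictions for the non-sampled units. This prediction form, the Gaussian kernel, the bandwidth, the function names `nw_predict` and `estimate_total`, and the toy numbers are all assumptions made for illustration; the abstract does not specify the exact estimator or tuning used.

```python
import math

def nw_predict(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    # Note: the weights can underflow to zero if x0 is very far from every
    # sampled x relative to the bandwidth; this sketch does not guard for that.
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def estimate_total(sample_x, sample_y, nonsample_x, bandwidth):
    """Observed sample sum plus kernel predictions for non-sampled units."""
    observed = sum(sample_y)
    predicted = sum(
        nw_predict(x, sample_x, sample_y, bandwidth) for x in nonsample_x
    )
    return observed + predicted

# Toy population: auxiliary variable x known for all units, y only in the sample.
sample_x = [1.0, 2.0, 4.0, 5.0]
sample_y = [2.1, 3.9, 8.2, 9.8]
nonsample_x = [1.5, 3.0, 4.5]

print(estimate_total(sample_x, sample_y, nonsample_x, bandwidth=1.0))
```

Because each prediction is a weighted average of sampled y values, no linear or homoscedastic model is imposed, which is consistent with the robustness the abstract attributes to the kernel estimator.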