This paper explores models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimization are considered simultaneously during model training. Five commonly used FS methods, namely weight by Gini, weight by Chi-square, hierarchical variable clustering, weight by correlation, and weight by information, are applied to alleviate the effect of redundant features. Two hyper-parameter optimization approaches, random search (RS) and the Bayesian tree-structured Parzen Estimator (TPE), are applied in XGBoost. The effect of the different FS and hyper-parameter optimization methods on model performance is investigated with the Wilcoxon signed-rank test. The performance of XGBoost is compared with that of the traditionally used logistic regression (LR) model in terms of classification accuracy, area under the curve (AUC), recall, and F1 score obtained from 10-fold cross-validation. Results show that hierarchical clustering is the optimal FS method for LR, while weight by Chi-square achieves the best performance in XGBoost. Both TPE and RS optimization in XGBoost outperform LR significantly. TPE optimization is superior to RS: it yields a significantly higher accuracy and a marginally higher AUC, recall, and F1 score. Furthermore, XGBoost with TPE tuning shows lower variability than with RS tuning. Finally, the ranking of feature importance produced by XGBoost aids model interpretation. XGBoost with Bayesian TPE hyper-parameter optimization therefore serves as a practical yet powerful approach for business risk modeling.
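The paired comparison described above (per-fold scores of two tuning methods, compared with the Wilcoxon signed-rank test) can be illustrated with a small sketch. It is not the authors' code: the function name `wilcoxon_signed_rank`, the variables `tpe_acc` and `rs_acc`, and the fold accuracies are hypothetical placeholders, and the exact p-value is obtained by enumerating all sign patterns, which is only practical for a small number of folds such as 10.

```python
from itertools import product


def wilcoxon_signed_rank(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples a and b."""
    # Paired differences, rounded to suppress floating-point noise;
    # zero differences carry no sign information and are dropped.
    diffs = [round(x - y, 6) for x, y in zip(a, b)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    stat = min(w_plus, w_minus)

    # Exact null distribution: under H0 each rank is positive or negative
    # with probability 1/2, so enumerate all 2**n sign assignments.
    total = sum(ranks)
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= stat:
            as_extreme += 1
    return stat, as_extreme / 2 ** n


# Hypothetical 10-fold accuracies for two tuning methods (illustrative only,
# not results from the paper).
tpe_acc = [0.81, 0.79, 0.83, 0.80, 0.82, 0.84, 0.78, 0.81, 0.80, 0.83]
rs_acc = [0.80, 0.77, 0.80, 0.76, 0.77, 0.78, 0.71, 0.73, 0.71, 0.73]

stat, p_value = wilcoxon_signed_rank(tpe_acc, rs_acc)
print(f"W = {stat}, exact two-sided p = {p_value:.4f}")
```

With more folds or repeated cross-validation, a normal approximation or a library routine would be the usual choice instead of full enumeration.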
Recent results in homotopy and solution paths demonstrate that certain well-designed greedy algorithms, with a range of values of the algorithmic parameter, can provide solution paths to a sequence of convex optimization problems. On the other hand, in regression many existing criteria in subset selection (including $C_p$, AIC, BIC, MDL, RIC, etc.) involve optimizing an objective function that contains a counting measure. The two optimization problems are formulated as (P1) and (P0) in the present paper. The latter is generally combinatoric and has been proven to be NP-hard. We study the conditions under which the two optimization problems have common solutions. Hence, in these situations a stepwise algorithm can be used to solve the seemingly unsolvable problem. Our main result is motivated by recent work in sparse representation, while two others emerge from different angles: a direct analysis of sufficiency and necessity and a condition on the mostly correlated covariates. An extreme example connected with least angle regression is of independent interest.
Comment: Published at http://dx.doi.org/10.1214/009053606000001334 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
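The abstract names the problems (P0) and (P1) without writing them out. A plausible sketch, assuming the usual penalized least-squares forms, is given here; the notation (response $y$, design matrix $\Phi$, coefficient vector $x$, penalty weights $\lambda_0$ and $\lambda_1$) is an assumption and may differ from the paper's.

```latex
\begin{align*}
\text{(P0)}\quad & \min_{x}\; \|y - \Phi x\|_2^2 + \lambda_0 \|x\|_0,
  \qquad \|x\|_0 = \#\{\, i : x_i \neq 0 \,\}, \\
\text{(P1)}\quad & \min_{x}\; \|y - \Phi x\|_2^2 + \lambda_1 \|x\|_1,
  \qquad \|x\|_1 = \sum_i |x_i| .
\end{align*}
```

In this reading, the counting measure $\|x\|_0$ makes (P0) combinatorial, whereas (P1) is convex and is the kind of problem whose solution path a stepwise or homotopy algorithm can trace.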
Cell-penetrating peptides (CPPs) are capable of transporting molecules to which they are tethered across cellular membranes. Unsurprisingly, CPPs have attracted attention for their potential drug delivery applications, but several technical hurdles remain to be overcome. Chief among them is the so-called ‘endosomal escape problem,’ i.e., the propensity of CPP-cargo molecules to be endocytosed but remain entrapped in endosomes rather than reaching the cytosol. Previously, a CPP fused to calmodulin that bound calmodulin binding site-containing cargos was shown to efficiently deliver cargos to the cytoplasm, effectively overcoming the endosomal escape problem. The CPP-adaptor, “TAT-CaM,” evinces delivery at nM concentrations and more rapidly than we had previously been able to measure. To better understand the kinetics and mechanism of CPP-adaptor-mediated cargo delivery, a real-time cell-penetration assay was developed in which a flow chamber containing cultured cells was installed on the stage of a confocal microscope to allow for observation ab initio. Also examined in this study was an improved CPP-adaptor that uses naked mole rat (Heterocephalus glaber) calmodulin in place of human calmodulin and results in superior internalization, likely due to its lesser net negative charge. Adaptor-cargo complexes were delivered into the flow chamber, and fluorescence intensity in the midpoint of baby hamster kidney cells was measured as a function of time. Delivery of 400 nM cargo was observed within seven minutes, and fluorescence continued to increase linearly as a function of time. Cargo-only control experiments showed that the minimal uptake that occurred independently of the CPP-adaptor resulted in punctate localization consistent with endosomal entrapment. A distance analysis of the cell-penetration experiments showed that CPP-adaptor-delivered cargo dispersed more widely throughout cells than an analogous covalently bound CPP-cargo. Small-molecule endocytosis inhibitors did not have significant effects upon delivery. The real-time assay is an improvement upon static endpoint assays and should be informative in a broad array of applications.