In credit risk assessment, lenders may have limited or no data on the historical lending outcomes of credit applicants. This typically and disproportionately affects Micro, Small, and Medium Enterprises (MSMEs), for which credit may be restricted or too costly because their Probability of Default (PD) is difficult to predict. However, if data from related credit risk domains is available, transfer learning may be applied to train models successfully, e.g., from the credit card lending and debt consolidation (CD) domains to predict in the small business lending domain. In this article, we report successful results from an approach that uses transfer learning to predict the probability of default, based on the novel concept of Progressive Shift Contribution (PSC) from source to target domain. Toward real-world application of this approach by lenders, we further address two key questions. The first is how to explain transfer learning models, and the second is how to adjust features when the source and target domains differ. To address the first question, we apply Shapley values to investigate how and why transfer learning improves model accuracy; to address the second, we propose and test a domain adaptation approach. The results show that adaptation improves model accuracy beyond the improvement from transfer learning. We extend this by proposing and testing a combined strategy of feature selection and adaptation that converts values of source domain features to better approximate values of target domain features. Our approach includes a strategy to choose features for adaptation and an algorithm to adapt the values of these features. In this setting, transfer learning appears to improve model accuracy by increasing the contribution of less predictive features. Although the percentage improvements are small, such improvements in real-world lending could be of significant economic importance.
Abstract. Knowledge-based systems (KBS) are not necessarily based on well-defined ontologies. In particular, it is possible to build very successful KBS for classification problems in which the classes or conclusions are entered by experts as free-text sentences, with little constraint on textual consistency and little systematic organisation of the conclusions. This paper investigates how relations between such 'classes' may be discovered from existing knowledge bases. We base our approach on KBS built with Ripple Down Rules (RDR). RDR is a knowledge acquisition and knowledge maintenance method that allows KBS to be built very rapidly and simply by correcting errors, and that does not require a strong ontology. Our experimental results are based on a large real-world medical RDR KBS. The motivation for our work is to allow an ontology in a KBS to 'emerge' during development, rather than requiring the ontology to be established before the KBS is developed. It follows earlier work on using Formal Concept Analysis (FCA) to discover ontologies in RDR KBS.