This paper studies risk classification and prediction, an essential research direction that aims to identify and predict risks in various applications. To identify and predict risks, numerous researchers build models that uncover the hidden information behind a label (for example, positive credit or negative credit). Fuzzy logic is robust in dealing with ambiguous data and thus benefits classification and prediction. However, the optimal way to apply fuzzy logic depends on the characteristics of the data and on the objectives, and finding it is extraordinarily difficult. This paper therefore proposes a general membership function model for fuzzy sets (GMFMFS) in the fuzzy decision tree and extends it to the fuzzy random forest method. The proposed methods can be applied to identify and predict credit risks with near-optimal fuzzy sets. In addition, we analyze the feasibility of GMFMFS and prove that the GMFMFS-based linear membership function can be extended to a nonlinear membership function without a significant increase in computational complexity. The GMFMFS-based fuzzy decision tree is tested on a real US credit dataset, the SUSY dataset from the UCI repository, and large-scale synthetic datasets. The experimental results further demonstrate the effectiveness and potential of the GMFMFS-based fuzzy decision tree with both linear and nonlinear membership functions.
KEYWORDS
fuzzy decision tree, fuzzy random forest, membership function, risk classification and prediction