Incidents of financial fraud, committed through a variety of means, have been frequent in recent years. Identifying financial fraud more efficiently, and thereby maintaining order in the capital market, is a problem that scholars across many fields are discussing and urgently seeking to resolve. This study constructs a financial fraud identification model based on the stacking ensemble learning algorithm. In addition to financial and nonfinancial variables, it introduces the text of the management discussion and analysis (MD&A) section of annual reports, using sentiment polarity, emotional tone, and text readability as text variables. The results show that, when financial, nonfinancial, and text variables are all considered, the stacking ensemble learning model identifies fraud significantly better than each single classifier model. The model's identification performance also improves after the text variables are added. The model is therefore expected to provide a new and more effective method of identifying financial fraud.
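To make the stacking idea concrete, the following is a minimal sketch in plain Python. It is not the model built in this study. The three synthetic features merely stand in for one financial, one nonfinancial, and one text variable, and the base learners (one-feature threshold scorers), the logistic meta-learner, and all names such as `fit_stacking` and `make_synthetic` are illustrative assumptions. The sketch shows only the general mechanism: base learners produce out-of-fold scores, and a meta-learner is trained on those scores.

```python
import math
import random


def make_stump(feature_index, rows, labels):
    """Base learner (assumed): score one feature against a class-mean midpoint."""
    pos = [r[feature_index] for r, y in zip(rows, labels) if y == 1]
    neg = [r[feature_index] for r, y in zip(rows, labels) if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    # Soft score in (0, 1): sigmoid of the signed distance from the threshold.
    return lambda row: 1 / (1 + math.exp(-sign * (row[feature_index] - threshold)))


def fit_logistic(rows, labels, epochs=800, lr=1.0):
    """Meta-learner (assumed): logistic regression fit by batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, y in zip(rows, labels):
            z = bias + sum(w * x for w, x in zip(weights, row))
            err = 1 / (1 + math.exp(-z)) - y
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return lambda row: 1 / (1 + math.exp(-(bias + sum(w * x for w, x in zip(weights, row)))))


def fit_stacking(rows, labels, n_folds=5):
    """Train base learners per fold, then a meta-learner on out-of-fold scores."""
    n_features = len(rows[0])
    meta_rows = [None] * len(rows)
    for fold in range(n_folds):
        train_idx = [i for i in range(len(rows)) if i % n_folds != fold]
        held_idx = [i for i in range(len(rows)) if i % n_folds == fold]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        stumps = [make_stump(j, train_rows, train_labels) for j in range(n_features)]
        # Each held-out row is scored only by learners that never saw it.
        for i in held_idx:
            meta_rows[i] = [s(rows[i]) for s in stumps]
    meta = fit_logistic(meta_rows, labels)
    # Refit the base learners on all training data for prediction time.
    base = [make_stump(j, rows, labels) for j in range(n_features)]
    return lambda row: meta([s(row) for s in base])


def make_synthetic(n, rng):
    """Synthetic stand-in data; label 1 plays the role of a fraudulent firm."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        # Hypothetical financial, nonfinancial, and text features.
        rows.append([rng.gauss(1.0 * y, 1.0), rng.gauss(0.8 * y, 1.0), rng.gauss(0.6 * y, 1.0)])
        labels.append(y)
    return rows, labels


rng = random.Random(0)
train_rows, train_labels = make_synthetic(400, rng)
test_rows, test_labels = make_synthetic(200, rng)

model = fit_stacking(train_rows, train_labels)
predictions = [1 if model(r) >= 0.5 else 0 for r in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold scores, rather than on scores from base learners that have already seen the same rows, is the usual way to keep it from simply rewarding whichever base learner overfits most. The accuracy printed here describes only the synthetic data and says nothing about the results reported in the study.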