The qualitative content of companies' financial statements provides useful information that can increase the accuracy of bankruptcy prediction models. In this research, a dataset of 924,903 financial statements from 355,704 German companies, classified as solvent, financially distressed, or bankrupt using the Amadeus database from Bureau van Dijk, was examined. The results provide empirical evidence that a corpus-linguistic approach applying evidential strategy analysis to financial statements helps to distinguish between companies' financial situations. They show that, depending on their solvency, companies use different approaches and confidence assessments when evaluating their financial statements and vary their use of evidential strategies accordingly. This motivates a proposed procedure for quantifying evidential strategies and generating features from them that can be used to improve corporate bankruptcy prediction. The results stem from an interdisciplinary adaptation of linguistic findings and offer future research another means of analysis in the area of text mining.
How can useful information extracted from unstructured data contribute to better prediction of corporate failure or bankruptcy? In this research, we examine a data set of 2,163,147 financial statements of German companies classified into three groups: solvent, financially distressed, and bankrupt. By classifying text features in terms of granularity and linguistic level of analysis, we show the potential and limitations of approaches developed in this way. This study offers a first approach to evaluating and classifying the likelihood of success of text mining approaches for extracting features that enhance the training database of AI-based solutions and improve the corporate failure prediction models built on them. Our results indicate that adapting additional information sources for the financial evaluation of companies is indeed worthwhile, but that approaches adapted to the context should be used instead of unspecific, general text mining approaches.