In this paper, we propose a novel intelligent methodology for constructing a Bankruptcy Prediction Computation Model, aimed at accurately analysing a company's financial status. Grounded in semantic data analysis and management, our methodology places a Semantic Database System at the core of the system. It comprises three layers: an Ontology of Bankruptcy Prediction, a Semantic Search Engine, and a Semantic Analysis Graph Database system. The Ontological layer defines the basic concepts of financial risk management as well as the objects that serve as sources of knowledge for predicting a company's bankruptcy. The Graph Database layer utilises a powerful semantic data technology, which serves as a semantic data repository for our model. The article provides a detailed description of the construction of the Ontology and its informal conceptual representation. We also present a working prototype of the Graph Database system, constructed using the Neo4j application, and show the connections between well-known financial ratios. We argue that this methodology, which utilises state-of-the-art semantic data management mechanisms, enables data processing and the relevant computations more efficiently than approaches based on traditional relational databases. This gives us solid grounds to build a system capable of tackling data of any complexity level.
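To make the idea of connecting financial ratios in a graph concrete, the following sketch models a tiny in-memory graph of well-known ratios and the statement items they are derived from. This is an illustration only, not the Neo4j prototype described above: the node names are standard accounting relationships, and the adjacency-list structure is a stand-in for the Graph Database layer.

```python
# Illustrative in-memory graph (a stand-in for the Neo4j Graph DB layer):
# nodes are financial ratios and statement items; edges record which items
# each ratio is derived from. All names here are standard accounting
# relationships chosen for illustration.
from collections import defaultdict

graph = defaultdict(list)  # node -> list of (relation, node)

def link(source, relation, target):
    graph[source].append((relation, target))

# Current Ratio = Current Assets / Current Liabilities
link("Current Ratio", "DERIVED_FROM", "Current Assets")
link("Current Ratio", "DERIVED_FROM", "Current Liabilities")
# Working Capital = Current Assets - Current Liabilities
link("Working Capital", "DERIVED_FROM", "Current Assets")
link("Working Capital", "DERIVED_FROM", "Current Liabilities")
# Return on Assets = Net Income / Total Assets
link("Return on Assets", "DERIVED_FROM", "Net Income")
link("Return on Assets", "DERIVED_FROM", "Total Assets")

def shared_inputs(ratio_a, ratio_b):
    """Two ratios are connected when they share an underlying statement item."""
    inputs = lambda r: {t for _, t in graph[r]}
    return inputs(ratio_a) & inputs(ratio_b)

print(shared_inputs("Current Ratio", "Working Capital"))
```

In the actual prototype, the same connection would be expressed as relationships between ratio nodes in Neo4j and traversed with Cypher queries rather than Python set operations.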
This paper studies a Bankruptcy Prediction Computational Model (BPCM model): a comprehensive methodology for evaluating a company's bankruptcy level, which combines the storing, structuring and pre-processing of raw financial data using semantic methods with machine learning analysis techniques. Raw financial data are interconnected, diverse, often potentially inconsistent, and open to duplication. The main goal of our research is to develop data pre-processing techniques in which ontologies play a central role. We show how ontologies are used to extract and integrate information from different sources, prepare data for further processing, and enable communication in natural language. Our Ontology of Bankruptcy Prediction (OBP Ontology), which provides a conceptual framework for a company's financial analysis, is built in the widely established Protégé environment. An OBP Ontology can be effectively described with a Graph database (DB). A Graph DB expands the capabilities of traditional databases by tackling the interconnected nature of economic data and providing graph-based structures to store information, allowing the effective selection of the most relevant input features for the machine learning algorithm. To create and manage the BPCM Graph DB, we use the Neo4j environment and the Neo4j query language, Cypher, to perform feature selection on the structured data. The selected key features are used for a supervised Neural Network with a Sigmoid activation function. The programming of this component is performed in Python. We illustrate the approach and the advantages of semantic data pre-processing by applying it to a representative use case.
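The final prediction step described above, a supervised neural network with a Sigmoid activation trained on key features selected from the Graph DB, can be sketched in plain NumPy. The toy data, the single-layer architecture, and the hyperparameters below are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch of a supervised sigmoid network (here a single sigmoid unit,
# i.e. logistic regression) trained by gradient descent. The synthetic data
# stands in for Graph-DB-selected key features; label 1 = bankrupt.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                         # 3 selected features
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)  # synthetic labels

w = np.zeros(3)
b = 0.0
lr = 0.5
for _ in range(500):                     # batch gradient descent
    p = sigmoid(X @ w + b)               # predicted bankruptcy probability
    grad_w = X.T @ (p - y) / len(y)      # gradient of the cross-entropy loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"training accuracy: {accuracy:.2f}")
```

The sigmoid output is naturally read as a bankruptcy probability, which is why this activation is a common choice for a binary financial-distress label.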
In this paper, we present a Generic Predictive Computational Model (GPCM) and apply it by building a Use Case for FTSE 100 index forecasting. This involves the mining of heterogeneous data based on semantic methods (ontology), graph-based methods (knowledge graphs, graph databases), and advanced Machine Learning methods. The main focus of our research is data pre-processing aimed at a more efficient selection of input features. The GPCM pipeline's cycles involve the propagation of the (initially raw) data to the Graph Database, structured by an ontology, and regular updates of the features' weights in the Graph Database via a feedback loop from the Machine Learning Engine. The Graph Database queries output the most valuable features, which in turn serve as the input for the Machine Learning-based prediction. The end product of this process is fed back to the Graph Database to update the weights. We report on practical experiments evaluating the effectiveness of the GPCM application in forecasting the FTSE 100 index. The underlying dataset contains multiple parameters related to predicting time-series data, where Long Short-Term Memory (LSTM) is known to be one of the most efficient machine learning methods. The most challenging task here has been to overcome the known restrictions of LSTM, which is capable of analysing one input parameter only. We solved this problem by combining several parallel LSTMs, a Concatenation unit, which merges the LSTMs' outputs (into a time-series matrix), and a Linear Regression Unit, which produces the final result.
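The combination stage described above can be sketched as follows: several parallel single-input predictors (standing in for the parallel LSTMs), a concatenation step that merges their outputs into a time-series matrix, and a final linear-regression unit producing the forecast. Plain least squares replaces the trained LSTM units here; the three synthetic series and the target weights are illustrative assumptions, not the authors' data or implementation.

```python
# Sketch of the parallel-units -> Concatenation -> Linear Regression pipeline.
# Each "parallel unit" is reduced to an identity map over its own univariate
# series; in the GPCM these would be trained LSTMs.
import numpy as np

rng = np.random.default_rng(1)
T = 120                                         # time steps
series = [rng.normal(size=T) for _ in range(3)]  # three univariate inputs
target = 0.6 * series[0] - 0.3 * series[1] + 0.1 * series[2]

# Each parallel unit emits a one-column output stream.
unit_outputs = [s.reshape(-1, 1) for s in series]

# Concatenation unit: merge the outputs into a time-series matrix (T x 3).
features = np.concatenate(unit_outputs, axis=1)

# Linear Regression unit: a least-squares fit produces the final result.
coef, *_ = np.linalg.lstsq(features, target, rcond=None)
prediction = features @ coef

print("recovered weights:", np.round(coef, 2))
```

Because each LSTM handles one input parameter, the concatenation step is what restores a multivariate view of the data before the final regression.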