Analysis of financial data is critical because the results or conclusions drawn from it can have a substantial impact on business processes at both personal and enterprise scale. The primary source of financial data is the business process itself, and the data is often collected by automation tools deployed at various points of the business process data flow. The data is entered primarily by the stakeholders of the process, and at various levels of the process it is modified, translated, and sometimes completely transformed, which introduces impurities or anomalies. These impurities, such as outliers and missing values, strongly affect the final decisions made after processing these datasets. Hence, appropriate pre-processing of financial data is needed. A good number of parallel research outcomes address these problems. Nonetheless, the majority of the solutions either have high time complexity or are not sufficiently accurate. This work therefore proposes an automated framework that identifies and imputes outliers using an iterative clustering method, identifies and imputes missing values using a differential count based binary iterations method, and finally secures data storage using regression based key generation. The proposed framework achieved nearly 100% accuracy in detecting outliers and missing values, with considerably improved time complexity.
Given an anonymous text, automatically attributing it to one of a group of known writers is called "Authorship Attribution" (AA). It is a classification problem: feature extraction techniques are applied first, and a model is then trained on a collection of texts whose authors are known. Numerous features, such as lexical, semantic, structural, and n-gram features, can be used to identify the stylistic characteristics of writers. The authors of this research propose a novel approach to this problem based on sequential pattern mining over part-of-speech (PoS) tags. This paper introduces and discusses the concept of a Part-of-Speech Skip-Gram (PoSSG), which differs from the traditional n-gram. A sequential pattern mining algorithm is applied to obtain PoSSG patterns, which are then used for authorship attribution tasks. Experimental studies on two different datasets, novels extracted from Project Gutenberg and Stamatatos06 Author Identification: C10-Attribution, confirm that this approach of mining PoSSG patterns facilitates author identification.
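The idea of a skip-gram over PoS tags can be illustrated with a minimal sketch. The abstract does not give the exact PoSSG definition or name the mining algorithm, so the code below rests on assumptions: it uses the common k-skip-n-gram definition (n tags in order, with at most k tags skipped in total), and it replaces a full sequential pattern mining algorithm with a simple document-frequency count. The function names `pos_skip_grams` and `frequent_patterns`, and the parameters `n`, `k`, and `min_support`, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pos_skip_grams(tags, n=2, k=1):
    """Return all k-skip-n-grams of a PoS tag sequence (assumed definition).

    Each gram keeps the original tag order and skips at most k tags in total.
    """
    grams = []
    for i in range(len(tags)):
        # Positions that may follow tags[i] without exceeding k skips.
        window = range(i + 1, min(i + n + k, len(tags)))
        for rest in combinations(window, n - 1):
            grams.append((tags[i],) + tuple(tags[j] for j in rest))
    return grams


def frequent_patterns(tagged_docs, n=2, k=1, min_support=2):
    """Keep grams that occur in at least min_support documents.

    This is a plain support count, a stand-in for a real sequential
    pattern mining algorithm, which the abstract does not specify.
    """
    support = Counter()
    for tags in tagged_docs:
        support.update(set(pos_skip_grams(tags, n, k)))
    return {gram: count for gram, count in support.items() if count >= min_support}
```

With the tag sequence `["DT", "JJ", "NN", "VBZ"]`, `n=2`, and `k=1`, the sketch yields both adjacent pairs such as `("DT", "JJ")` and skipped pairs such as `("DT", "NN")`. The per-author frequencies of such patterns could then serve as features for a classifier.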
The remarkable increase in competition within the insurance sector has resulted in an overwhelming number of insurance products being available in the market. With the rapid development of recommendation systems, accurately predicting suitable insurance policies from users' lifestyle choices has become increasingly important. The main problem with traditional systems is data sparseness. This paper proposes a recommender system to predict insurance products for new and existing customers. The main goal of the proposed system is to generate personalized recommendations based on the user's lifestyle practices. By providing accurate personalized recommendations, the customer experience with insurers can be improved.