Abstract: Rapid progress in digital data acquisition techniques has led to huge volumes of data. More than 80 percent of today's data is unstructured or semi-structured. Discovering appropriate patterns and trends for analyzing text documents in such massive volumes of data is a major challenge. Text mining is the process of extracting interesting and nontrivial patterns from large collections of text documents. Different techniques and tools exist to mine text and discover valuable information for future prediction and decision making. Selecting an appropriate text mining technique reduces the time and effort required to extract valuable information. This paper briefly discusses and analyzes text mining techniques and their applications in diverse fields of life. Moreover, the issues in the field of text mining that affect the accuracy and relevance of results are identified.
Abstract: The rapid growth in the size of data sets makes it challenging to extract and analyze information in a timely manner for better prediction and decision making. A data warehouse supports strategic decision making by serving as a repository for historical and current data. The Extraction, Transformation and Loading (ETL) process gathers data from different sources and integrates it into the data warehouse. This paper proposes a multi-agent framework that enhances the efficiency of the ETL process. Each agent performs a specific task assigned to it, which makes it easier to identify errors at the different stages of the ETL process; this was difficult and time consuming in the traditional ETL process. The multi-agent framework identifies data sources, then extracts, integrates, transforms, and loads data into the data warehouse. A monitoring agent remains active during this process and generates alerts if an issue arises at any stage.
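The stages named in this abstract can be illustrated with a minimal sketch. This is not the framework proposed in the paper: the names `Monitor`, `run_pipeline`, and the sample sources are hypothetical, each "agent" is reduced to a plain function, and the monitoring agent is assumed to simply record an alert whenever a stage raises an exception.

```python
from dataclasses import dataclass, field

# All names here are hypothetical illustrations, not the paper's design.

@dataclass
class Monitor:
    """Monitoring agent: records an alert whenever a stage fails."""
    alerts: list = field(default_factory=list)

    def run(self, stage, func, *args):
        try:
            return func(*args)
        except Exception as exc:
            self.alerts.append(f"{stage}: {exc}")
            return None

def extract(source):
    # Extraction agent: pull raw rows from one source.
    return list(source["rows"])

def integrate(batches):
    # Integration agent: merge rows from every source that was extracted.
    return [row for batch in batches if batch is not None for row in batch]

def transform(rows):
    # Transformation agent: normalise names and cast amounts to float.
    return [{"name": r["name"].strip().title(), "amount": float(r["amount"])}
            for r in rows]

def load(rows, warehouse):
    # Loading agent: append the cleaned rows to the warehouse.
    warehouse.extend(rows)
    return len(rows)

def run_pipeline(sources, warehouse):
    monitor = Monitor()
    batches = [monitor.run(f"extract[{s['id']}]", extract, s) for s in sources]
    merged = monitor.run("integrate", integrate, batches)
    cleaned = monitor.run("transform", transform, merged) if merged is not None else None
    if cleaned is not None:
        monitor.run("load", load, cleaned, warehouse)
    return monitor.alerts

sources = [
    {"id": "crm", "rows": [{"name": " alice ", "amount": "10.5"}]},
    {"id": "erp", "rows": None},  # deliberately broken source
    {"id": "web", "rows": [{"name": "bob", "amount": "3"}]},
]
warehouse = []
alerts = run_pipeline(sources, warehouse)
```

In this sketch the broken `erp` source produces a single alert tagged with its stage, while the rows from the other two sources still reach the warehouse. That is the kind of stage-level error identification the abstract describes.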
Predictive analytics has become an essential area of research in health informatics. The availability of multi-source and multi-modal data in healthcare has made disease prediction, diagnosis, and medication more effective and reliable. However, analysis and decision making have become challenging tasks, particularly when data comes in multiple formats and from different sources. In this study, different frameworks are proposed to handle multi-nature data at different levels for predictive analytics. Dimensionality reduction techniques are applied to extract relevant features and enhance the analysis. The potential benefits of multi-modal data for improving the performance of predictive analytics at different fusion levels are discussed. Moreover, a notable improvement in prediction accuracy is observed in the experimental evaluation of the proposed frameworks. Furthermore, the issues encountered in the dimensionality reduction and fusion approaches are highlighted.
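The abstract does not say which fusion levels it covers. As a rough illustration, two levels commonly distinguished in the literature are feature-level fusion (combining features before prediction) and decision-level fusion (combining per-modality predictions). The sketch below assumes these two levels; the function names and toy values are hypothetical and not taken from the study.

```python
def feature_level_fusion(*modalities):
    """Concatenate per-modality feature vectors into one vector."""
    fused = []
    for features in modalities:
        fused.extend(features)
    return fused

def decision_level_fusion(scores, weights=None):
    """Combine per-modality prediction scores by a weighted average."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Hypothetical toy data: two modalities describing the same patient.
lab_features = [0.25, 0.5]
image_features = [0.75]

# Feature-level: one combined vector would feed a single model.
fused = feature_level_fusion(lab_features, image_features)

# Decision-level: each modality's model has already produced a score.
risk = decision_level_fusion([0.5, 1.0], weights=[1.0, 3.0])
```

Feature-level fusion lets a single model learn interactions across modalities but increases dimensionality, which is one reason dimensionality reduction is often paired with it. Decision-level fusion keeps each modality's model independent.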