This paper describes text data analysis in support of management decision making. We examine in detail the process of collecting text data for further analysis and the use of visualization to make human effort more efficient during data collection and preprocessing. A modification of the algorithm for creating an "n-gram cloud" visualization is proposed that makes the visualization accessible to people with visual impairments. A method for visualizing n-gram vector representation models (word embeddings) is also proposed. Based on this research, part of a software package was implemented that creates interactive visualizations in a browser and supports interaction with them.
This paper describes text data analysis in the course of managerial decision making. The process of collecting textual data for further analysis, as well as the use of visualization for human control over the correctness of data collection, is considered in depth. A modification of the algorithm for creating an "n-gram cloud" visualization is proposed, which can help make the visualization accessible to people with visual impairments. A method for visualizing n-gram vector representation models (word embeddings) is also proposed. Based on this research, part of a software package was implemented that creates interactive visualizations in a browser and supports interaction with them.
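As a rough illustration of the counting step behind an "n-gram cloud", the sketch below tallies n-grams and maps each count to a font size. It is not the modified algorithm proposed in the paper, which the abstract does not specify. The function names, the tokenization rule, and the 12 pt and 48 pt size bounds are all assumptions made for this example.

```python
import re
from collections import Counter


def ngram_counts(text, n=2):
    """Count word n-grams in text (simple lowercase word tokenization, an assumption)."""
    tokens = re.findall(r"\w+", text.lower())
    # Slide a window of length n over the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def font_sizes(counts, min_pt=12, max_pt=48):
    """Linearly scale counts to font sizes; min_pt and max_pt are hypothetical bounds."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        gram: min_pt + (count - lo) * (max_pt - min_pt) / span
        for gram, count in counts.items()
    }


sample = "data analysis supports decision making and data analysis supports planning"
counts = ngram_counts(sample, n=2)
sizes = font_sizes(counts)
```

Keeping a fixed lower bound on the font size is one simple way to stop rare n-grams from becoming unreadably small; whether the paper's accessibility modification works this way is not stated in the abstract.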
The paper considers the problem of integrating, processing and mining poorly structured data from medical information systems in order to support managerial decisions in healthcare. It describes the problems of medical data, such as insufficient structure, a large number of abbreviations characteristic of specific nosologies, and the difficulty of automatic semantic interpretation of some fields. The authors demonstrate an approach to finding and expanding abbreviations in texts, based on a hybrid of machine and human processing. The method increased the number of abbreviations found by automatic methods by 21%, expanded up to 55% of cases in automated mode (with a probability of correctness above 70%), and significantly reduced the time specialists spend processing the remaining abbreviations. Further research will address other problems associated with the processing and specifics of medical data, such as a large number of spelling errors and specific grammatical constructions. Using a hybrid approach to preprocessing poorly structured data will increase the efficiency of management decisions in healthcare by reducing the time experts spend on their creation and support. The hybrid approach to preprocessing Russian-language text data can be applied in other subject areas, although the technique may need to be adjusted to the specifics of the processed data.
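The split between machine and human processing described above can be sketched as a simple routing step: candidate expansions whose estimated probability of correctness exceeds 70% are accepted automatically, and the rest are queued for a specialist. This is a minimal sketch under stated assumptions, not the authors' implementation. The `Candidate` type, the `route` function, and the sample abbreviations and probabilities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    abbreviation: str
    expansion: str
    probability: float  # estimated probability that the expansion is correct


def route(candidates, threshold=0.7):
    """Split candidates into auto-accepted and human-review lists.

    The 0.7 default mirrors the 70% figure in the abstract; how the
    probability itself is estimated is outside this sketch.
    """
    auto, manual = [], []
    for candidate in candidates:
        if candidate.probability > threshold:
            auto.append(candidate)
        else:
            manual.append(candidate)
    return auto, manual


# Made-up illustrative data, not taken from the paper.
candidates = [
    Candidate("BP", "blood pressure", 0.93),
    Candidate("MI", "myocardial infarction", 0.55),
    Candidate("HR", "heart rate", 0.70),
]
auto, manual = route(candidates)
```

Because the comparison is strict, a candidate at exactly 0.70 goes to human review, which matches the wording "above 70%".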