Introduction: Huge volumes of data are generated in cyberspace and from the internal information of various organizations. Machine learning methods are used to obtain output data with a clear structure, divide it into meaningful parts, and develop classification rules. Most inductive methods model intermediate and high-level abstract categories in a multidimensional space that are difficult to interpret. Purpose: Development of a “white box” machine learning model which explains the chosen solution using conventional production rules, along with cognitive visualizers for characterizing classes of objects. Methods: Formation of a binary decision matrix containing information about the combinations of selected informative feature values which imply the specified classes. Results: A binary decision matrix is formed automatically from the results of cluster and discriminant analyses. The learning procedure is reduced to setting interval thresholds and matrix elements, which makes it easy to implement a semantic interpretation of a decision rule. An object is recognized by elementwise conjunction of the matrix cells to which its attribute values point, and by selection of the single cell corresponding to the class code. To interpret a rule, a universal algorithm for processing a binary matrix has been developed, which applies user-entered attribute values. The dimension of the viewed space is specified by adjustment rings on the recognition visualizer. The azimuth of the activated diagram cell with the greatest dimensionality indicates that an object with the given features belongs to a target class. To characterize classes, visualizers have been developed that demonstrate both the distinctive properties of a class and the properties that several classes share. In many cases, object type recognition stops when the depth of the scanned feature space is significantly less than with a full search.
Practical relevance: The proposed methods of cognitive analysis and data visualization provide not only data classification and the determination, ranking, and selection of significant features, but also the development of rules which reveal the cause-and-effect relationship between a combination of factors and the type of decision made.
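The recognition step described above (interval thresholds, elementwise conjunction of matrix rows, early stopping) can be illustrated with a minimal sketch. The feature names, thresholds, class labels, and matrix contents below are hypothetical values chosen for illustration; they are not taken from the paper, and the layout of the matrix is an assumption.

```python
from bisect import bisect_right

# Hypothetical interval thresholds per feature (illustrative values only).
THRESHOLDS = {
    "length": [2.0, 5.0],  # intervals: < 2, [2, 5), >= 5
    "width": [1.0],        # intervals: < 1, >= 1
}

# Hypothetical binary decision matrix: for each feature interval, a row of
# 0/1 flags marking which classes are compatible with that interval.
CLASSES = ["A", "B", "C"]
MATRIX = {
    "length": [[1, 1, 0], [0, 1, 1], [0, 0, 1]],
    "width": [[1, 0, 1], [0, 1, 1]],
}


def recognize(obj):
    """Return (class label or None, number of features examined)."""
    mask = [1] * len(CLASSES)
    depth = 0
    for feature, cuts in THRESHOLDS.items():
        # Locate the interval the attribute value points to.
        interval = bisect_right(cuts, obj[feature])
        row = MATRIX[feature][interval]
        # Elementwise conjunction with the rows selected so far.
        mask = [m & r for m, r in zip(mask, row)]
        depth += 1
        # Stop early once at most one candidate class remains.
        if sum(mask) <= 1:
            break
    label = CLASSES[mask.index(1)] if sum(mask) == 1 else None
    return label, depth


print(recognize({"length": 1.0, "width": 0.5}))
print(recognize({"length": 6.0, "width": 0.5}))
```

In this sketch the second object is resolved after a single feature, which mirrors the remark that recognition often stops before the full feature space has been scanned. An object whose conjunction leaves several classes set is reported as unresolved (`None`).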
Introduction: The artificial intelligence development strategy involves the use of deep machine learning algorithms to solve various problems. Neural network models trained on specific data sets are difficult to interpret because of the “black box” approach, in which knowledge is formed as a set of interneuronal connection weights. Purpose: Development of a discrete knowledge model which explicitly represents information processing patterns encoded by connections between neurons. Methods: Adaptive quantization of a feature space using a genetic algorithm, and construction of a discrete model for a multidimensional OLAP cube with binary measures. Results: A genetic algorithm extracts a discrete knowledge carrier from a trained neural network. An individual's chromosome encodes a combination of values of all quantization levels for the measurable object properties. The head gene group defines the feature space structure, while the other genes set up the quantization of the multidimensional space, with each gene responsible for one quantization threshold of a given variable. A discrete model of a multidimensional OLAP cube with binary measures explicitly represents the relationships between combinations of object feature values and classes. Practical relevance: For neural network prediction models based on a training sample, genetic algorithms make it possible to find an effective value of the feature space volume for combinations of input feature values not represented in the training sample, whose volume is usually limited. The proposed discrete model builds unique images of each class based on rectangular maps which use a mesh structure of gradations. The maps reflect the most significant integral indicators of classes, which determine the location and size of a class in a multidimensional space.
Based on a convolution of the constructed class images, a complete system of production decision rules is recorded for the preset feature gradations.
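As a rough illustration of the chromosome layout and the binary cube, the sketch below decodes a chromosome into thresholds and fills a cube from labeled samples. It rests on assumptions that the abstract does not confirm: the head genes are treated as a 0/1 mask selecting active features, every feature gets the same number of threshold genes, and the cube is stored as a dictionary from cells to sets of classes (class membership in a set plays the role of a binary measure). The function names are hypothetical.

```python
from bisect import bisect_right


def decode(chromosome, n_features, cuts_per_feature):
    """Split a chromosome into head genes (feature mask) and thresholds."""
    head = chromosome[:n_features]
    body = chromosome[n_features:]
    thresholds = {}
    for i in range(n_features):
        if head[i]:  # assumption: head gene 1 means the feature is used
            genes = body[i * cuts_per_feature:(i + 1) * cuts_per_feature]
            thresholds[i] = sorted(genes)  # one gene per threshold
    return thresholds


def build_cube(samples, labels, thresholds):
    """Map each occupied cell (tuple of gradation indices) to its classes."""
    cube = {}
    for x, y in zip(samples, labels):
        cell = tuple(bisect_right(cuts, x[i]) for i, cuts in thresholds.items())
        cube.setdefault(cell, set()).add(y)
    return cube


# Hypothetical chromosome: 3 head genes, then 2 threshold genes per feature.
chromosome = [1, 0, 1, 0.5, 0.2, 9, 9, 3.0, 1.0]
thresholds = decode(chromosome, n_features=3, cuts_per_feature=2)

samples = [(0.1, 7, 0.5), (0.3, 7, 2.0), (0.9, 7, 4.0), (0.1, 8, 0.9)]
labels = ["a", "b", "c", "a"]
cube = build_cube(samples, labels, thresholds)
print(thresholds)
print(cube)
```

A cell whose set holds exactly one class can be read directly as a production rule of the form "if the features fall in these gradations, then this class". A fitness function for the genetic algorithm is deliberately left out, since the abstract does not describe one.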
Introduction: The accumulation of data on project management processes and standard solutions has given relevance to research on the use of knowledge engineering methods for a multi-criteria search for options that set optimal values of project environment parameters. Purpose: Development of a method for searching and visualizing groups of projects that can be evaluated based on the concept of dominance and interpreted in terms of project variables and performance indicators. Methods: The sample is enriched, while maintaining the implicit link between the project variables and performance indicators, using a predictive neural network model. A set of genetic algorithms is used to detect the Pareto front in the multidimensional criterion space. The ontology of projects is determined after clustering options in the solution space and transforming the cluster structure into the criterion space. The search in the multidimensional space for the zone of greatest curvature of the Pareto front, which determines the equilibrium design solutions, is automated, and the solutions are visualized and interpreted using a tree map. Results: A tree map can be constructed for a criterion space of any dimension, and its structure corresponds topologically to projections of shared cluster images from the multidimensional space onto a plane. For various types of transformations and correlations between performance indicators and project variables, it is shown that the areas of greatest curvature of the Pareto front are determined either by the contents of a whole cluster or by the part of the variants representing the “best” cluster. If an undivided rectangle of a cluster is adjacent to the upper right corner of a tree map, then its representatives in the criterion space are well separated from the rest of the clusters and, when performance indicators are maximized, are closest to the ideal point. All representatives of such a cluster are effective solutions.
If the winning cluster contains dominant options inside the decision tree, then the “best” cluster is represented by the remaining options that set the optimal settings for the project variables. Practical relevance: The proposed methods of searching and visualizing groups of projects can be used when choosing the conditions for resource and organizational-economic modeling of the project environment, ensuring the optimization of risk, cost, functional, and time criteria.
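The dominance concept and the ideal point mentioned above can be shown with a small sketch. It uses a brute-force non-dominated filter rather than the genetic algorithms named in the abstract, assumes all performance indicators are maximized, and takes the ideal point as the per-criterion maximum over the front. The sample points and function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every criterion and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def closest_to_ideal(front):
    """Return the front member nearest to the ideal point (Euclidean)."""
    n = len(front[0])
    ideal = tuple(max(p[i] for p in front) for i in range(n))
    return min(front, key=lambda p: sum((i - x) ** 2 for i, x in zip(ideal, p)))


# Hypothetical projects scored on two performance indicators.
points = [(1, 5), (3, 4), (4, 3.5), (5, 1), (2, 2), (3, 3)]
front = pareto_front(points)
print(front)
print(closest_to_ideal(front))
```

The quadratic filter is only practical for small samples. For an enriched sample of the kind described in the abstract, an evolutionary multi-objective search would take its place.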