Rough set theory is a relatively new mathematical approach to vagueness and uncertainty in data. It is a well-understood formal framework for building data mining models in the form of logic rules, which can then be used to classify new cases. The indiscernibility relation and the approximations based on it form the mathematical basis of the theory, and the classical topological definitions of rough approximations rely on this relation. Rough approximations can, however, also be defined algebraically. This paper generalizes the algebraic approach suggested earlier by the authors. We use a set of discrete characteristic functions that take values from finite sets (not necessarily Boolean), together with operations on them, including comparison and Boolean operations; we call this the approximation language. We use the terms "exact upper approximation" and "exact lower approximation" to stress that, although many approximations can exist, it is always possible to select those that cannot be improved in terms of the approximation language. We consider the generation of logic rules from the exact approximations for arbitrary discrete characteristic functions taking values from finite sets. The logic rules are obtained naturally from the predicate formulae for the exact approximations. The approach generates logic rules quickly and efficiently, since producing the logic formulae requires only comparison operations on discrete values and Boolean operations on binary values.
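The notions of lower and upper approximation can be illustrated with a minimal sketch. It uses the classical indiscernibility-class construction rather than the authors' algebraic method, whose details are not given here. The function name `approximations` and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict

def approximations(objects, features, target):
    """Return (lower, upper) approximations of `target` as sets of object ids.

    objects  : dict mapping object id -> dict of feature name -> discrete value
    features : iterable of feature names used to compare objects
    target   : set of object ids forming the set to approximate
    """
    features = list(features)
    # Objects with identical values on all chosen features are indiscernible.
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[f] for f in features)
        classes[key].add(obj_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target
            lower |= members
        if members & target:    # class overlaps the target
            upper |= members
    return lower, upper

# Hypothetical toy data (not from the source).
patients = {
    1: {"temp": "high", "cough": "yes"},
    2: {"temp": "high", "cough": "yes"},
    3: {"temp": "normal", "cough": "yes"},
    4: {"temp": "normal", "cough": "no"},
    5: {"temp": "high", "cough": "no"},
}
sick = {1, 3, 5}
lower, upper = approximations(patients, ["temp", "cough"], sick)
print(lower, upper)
```

Objects 1 and 2 are indiscernible but only object 1 belongs to the target set, so both fall into the upper approximation and neither into the lower one.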
One of the classical data mining problems is classifying new objects on the basis of available information when that information does not allow them to be identified unambiguously as elements of some set. In such cases rough set theory is often an effective solution. The theory operates with such concepts as "indiscernible" elements and relations. A rough set is characterized by its lower and upper approximations, for which the authors earlier suggested an original algebraic method. That method uses only logic operations, which makes the search for logic rules very quick and efficient. The upper and lower approximations of a rough set describe the elements of the set as completely as the available information permits. It is therefore interesting and important to find irreducible sets of features that describe a rough set with the same "precision" as the full set of features (so-called reducts). This problem is difficult and at present has no good solutions. Our paper continues the authors' earlier research and suggests a method for finding reducts by eliminating non-salient features in the reverse order of their importance. The suggested procedure avoids exhaustive search by extracting a predefined number of the most significant reducts. We consider arbitrary features taking their values from finite sets.
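The elimination idea can be sketched as a single backward pass over the features. This is an illustrative reading, not the authors' procedure: the importance ranking is taken as an input, a feature is dropped only when the approximations stay exactly the same, and only one reduct is returned. The names `approximations` and `greedy_reduct` and the toy data are hypothetical.

```python
from collections import defaultdict

def approximations(rows, features, target):
    """Lower and upper approximations of `target` under the given features."""
    classes = defaultdict(set)
    for obj_id, values in rows.items():
        classes[tuple(values[f] for f in features)].add(obj_id)
    lower = set().union(*(c for c in classes.values() if c <= target))
    upper = set().union(*(c for c in classes.values() if c & target))
    return lower, upper

def greedy_reduct(rows, ranked_features, target):
    """Drop features from least to most important while approximations hold.

    ranked_features: feature names ordered from most to least important
    (the ranking is an input here, which is an assumption of this sketch).
    """
    reference = approximations(rows, ranked_features, target)
    kept = list(ranked_features)
    for feature in reversed(ranked_features):
        trial = [f for f in kept if f != feature]
        # Keep the removal only if both approximations are unchanged.
        if approximations(rows, trial, target) == reference:
            kept = trial
    return kept

# Hypothetical toy data: feature "c" duplicates feature "a".
rows = {
    1: {"a": 0, "b": 0, "c": 0},
    2: {"a": 0, "b": 1, "c": 0},
    3: {"a": 1, "b": 0, "c": 1},
    4: {"a": 1, "b": 1, "c": 1},
}
target = {1, 4}
reduct = greedy_reduct(rows, ["a", "b", "c"], target)
print(reduct)
```

Because the pass tries each feature once, its cost grows linearly with the number of features instead of exponentially, which is the point of avoiding exhaustive search.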
Modern data mining methods make it possible to discover non-trivial dependencies in large information arrays. Since these methods are used to process and analyze huge volumes of information, reducing the number of features needed to describe a discrete object is one of the most important problems. One of the classical problems in intelligent data analysis is classifying new objects on the basis of a priori information. This information might not allow us to classify an object exactly as belonging to a certain set. In such cases rough set theory may be an effective solution, as it operates with the concept of "indiscernible" elements and with ambiguous information. In this paper we introduce the concept of a local reduct: a reduced set of features that describes a particular subset of the original set with the same precision as the full set of features. We suggest a method for finding reduced sets of features that adequately describe a rough set without losing necessary information (so-called reducts), and for assessing the importance of each feature. The method is based on the algebraic approach to finding rough set approximations developed earlier by the authors. The main idea is as follows: if the algebraic approximations of a rough set do not change substantially when features are excluded, the resulting reduced set of features can be used instead of the original full set. Moreover, the greater the change in the approximations caused by eliminating a particular feature, the more important that feature is.
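One possible way to quantify "the greater the change, the more important the feature" is to count how many objects enter or leave the approximations when a feature is removed. The scoring rule below is an assumption chosen for illustration; the source does not specify the measure it uses. The names `approximations` and `feature_importance` and the toy data are hypothetical.

```python
from collections import defaultdict

def approximations(rows, features, target):
    """Lower and upper approximations of `target` under the given features."""
    classes = defaultdict(set)
    for obj_id, values in rows.items():
        classes[tuple(values[f] for f in features)].add(obj_id)
    lower = set().union(*(c for c in classes.values() if c <= target))
    upper = set().union(*(c for c in classes.values() if c & target))
    return lower, upper

def feature_importance(rows, features, target):
    """Score each feature by how much removing it alters the approximations."""
    ref_lower, ref_upper = approximations(rows, features, target)
    scores = {}
    for feature in features:
        rest = [f for f in features if f != feature]
        lower, upper = approximations(rows, rest, target)
        # Symmetric differences count objects whose membership changed.
        scores[feature] = len(ref_lower ^ lower) + len(ref_upper ^ upper)
    return scores

# Hypothetical toy data (not from the source).
rows = {
    1: {"x": 0, "y": 0, "z": 0},
    2: {"x": 0, "y": 1, "z": 0},
    3: {"x": 1, "y": 0, "z": 0},
    4: {"x": 1, "y": 1, "z": 1},
}
target = {3, 4}
scores = feature_importance(rows, ["x", "y", "z"], target)
print(scores)
```

In this example only the removal of "x" changes the approximations, so "y" and "z" score zero and are candidates for exclusion.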
A method is proposed for filtering the set of association rules obtained from a search for logical dependencies. The number of association rules found at given support and confidence levels can be quite large and needs to be reduced. The method makes it possible to work with so-called "interesting" rules, whose support and confidence levels differ significantly from the expected ones. The expected parameters are calculated under the assumption that the features in the left-hand side of the rule are independent. It is shown how the support and confidence levels of "interesting" association rules change when the features in the analyzed data are dependent.
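A minimal sketch of comparing observed and expected support follows. The abstract does not give its formulas, so this is only one possible reading: the expected support of a rule's left-hand side is taken as the product of the supports of its individual features, and the ratio of observed to expected support serves as the deviation. The function names and the transaction data are hypothetical, and confidence is not covered.

```python
from math import prod

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def expected_lhs_support(transactions, lhs):
    """Support the left-hand side would have if its features were independent."""
    return prod(support(transactions, [item]) for item in lhs)

def deviation(transactions, lhs):
    """Ratio of observed to expected support for the left-hand side.

    Assumes every item in `lhs` occurs at least once, so the expected
    support is non-zero.
    """
    return support(transactions, lhs) / expected_lhs_support(transactions, lhs)

# Hypothetical transactions (not from the source).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"milk"},
]
observed = support(transactions, ["bread", "butter"])
expected = expected_lhs_support(transactions, ["bread", "butter"])
ratio = deviation(transactions, ["bread", "butter"])
print(observed, expected, ratio)
```

A ratio far from 1 suggests that the left-hand-side features are dependent, which under this reading is what would mark a rule as a candidate "interesting" rule.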