Abstract: Relations between similarity-based systems, which evaluate similarity to a set of prototypes, and fuzzy rule-based systems, which aggregate values of membership functions, are investigated. Similarity measures based on information theory and probabilistic distance functions lead to a new type of membership function applicable to symbolic data. Fuzzy membership functions, on the other hand, lead to a new type of distance function. Several such novel functions are presented. This approach opens new ways to generate fuzzy rules based either on individual features or on the feature combinations used to evaluate similarity. The transition between prototype-based rules using similarity and fuzzy rules is illustrated using artificial data in two dimensions. To illustrate the usefulness of prototype-based rules, very simple rules are derived for leukemia gene expression data.
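The link between the two kinds of systems can be seen in one familiar special case. The sketch below is an illustration only, not the construction used in the paper: it assumes Gaussian membership functions aggregated with the product, and a prototype similarity defined as the exponential of a negative scaled squared Euclidean distance. All function names are hypothetical.

```python
import math

def gaussian_membership(x, center, sigma):
    # Assumed membership function: a Gaussian centred on the prototype coordinate.
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def fuzzy_rule_activation(x, centers, sigmas):
    # Fuzzy-rule view: aggregate per-feature memberships with the product.
    activation = 1.0
    for xi, ci, si in zip(x, centers, sigmas):
        activation *= gaussian_membership(xi, ci, si)
    return activation

def prototype_similarity(x, prototype, sigmas):
    # Prototype view: similarity as a decreasing function of a scaled
    # squared Euclidean distance to the prototype.
    d2 = sum(((xi - pi) / si) ** 2 for xi, pi, si in zip(x, prototype, sigmas))
    return math.exp(-d2 / 2.0)

x = [1.0, 2.0]
prototype = [0.0, 0.0]
sigmas = [1.0, 2.0]
same = math.isclose(
    fuzzy_rule_activation(x, prototype, sigmas),
    prototype_similarity(x, prototype, sigmas),
)
```

Under these assumptions the two views give the same number, because a product of exponentials is the exponential of a sum. Other membership functions correspond to other distance functions, and the reverse.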
This paper evaluates the application of ensembles of instance selection algorithms to improve the quality of dataset size reduction. To ensure diversity among the submodels, selection of feature subsets was considered. In the experiments, the Condensed Nearest Neighbor (CNN) and Edited Nearest Neighbor (ENN) algorithms were evaluated as base instance selection methods. The results show that various trade-offs between data compression and classification accuracy can be obtained, depending on the acceptance threshold and feature ratio parameters. In some cases it was possible to achieve both higher compression and higher accuracy than with an individual instance selection algorithm.
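A minimal sketch of such an ensemble follows. The abstract does not say how the members are combined, so several choices here are assumptions: each member runs ENN on a random feature subset whose size is set by `feature_ratio`, and an instance is kept when the fraction of members that retain it reaches `acceptance_threshold`. The function names and default values are hypothetical.

```python
import random
from collections import Counter

def knn_predict(X, y, query, features, k, skip_index=None):
    # Majority label among the k nearest rows, using only the given features.
    dists = []
    for i, row in enumerate(X):
        if i == skip_index:
            continue
        d = sum((row[f] - query[f]) ** 2 for f in features)
        dists.append((d, i))
    dists.sort()
    votes = Counter(y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]

def enn_select(X, y, features, k=3):
    # ENN: keep an instance only if its k nearest neighbours
    # (excluding itself) predict its own label.
    return {
        i for i in range(len(X))
        if knn_predict(X, y, X[i], features, k, skip_index=i) == y[i]
    }

def ensemble_select(X, y, n_members=10, feature_ratio=0.5,
                    acceptance_threshold=0.5, k=3, seed=0):
    # Assumed combination rule: count how many members keep each instance
    # and accept those whose share of votes reaches the threshold.
    rng = random.Random(seed)
    n_features = len(X[0])
    subset_size = max(1, round(feature_ratio * n_features))
    votes = [0] * len(X)
    for _ in range(n_members):
        features = rng.sample(range(n_features), subset_size)
        for i in enn_select(X, y, features, k):
            votes[i] += 1
    return [i for i, v in enumerate(votes)
            if v / n_members >= acceptance_threshold]

# Two well-separated clusters plus one mislabeled point (index 8).
X = [[0, 0], [0, 1], [1, 0], [1, 1],
     [10, 10], [10, 11], [11, 10], [11, 11],
     [0.5, 0.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
kept = ensemble_select(X, y, n_members=5, feature_ratio=1.0,
                       acceptance_threshold=0.5, k=3)
```

In this sketch, raising `acceptance_threshold` discards more instances and so increases compression, while lowering `feature_ratio` makes the members differ more from one another. CNN could be substituted for `enn_select` without changing the voting step.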