Attribute reduction from decision tables is one of the crucial topics in data mining. The problem is NP-hard, and many approximation algorithms based on the filter or filter-wrapper approaches have been designed to find reducts. The intuitionistic fuzzy set (IFS) has been regarded as an effective tool for this problem because it assigns two degrees, membership and non-membership, to each data element. Viewing attributes through these two counterparts, as in the IFS, is expected to increase the quality of classification and reduce the size of the reducts. With this motivation, this paper proposes a new filter-wrapper algorithm based on the IFS for attribute reduction from decision tables. The contributions include a new intuitionistic fuzzy distance between partitions, accompanied by theoretical analysis. The filter-wrapper algorithm is designed on that distance, with a new stopping condition based on the concept of delta-equality. Experiments are conducted on benchmark datasets from the UCI Machine Learning Repository.
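To make the membership and non-membership degrees concrete, the following is a minimal sketch of a distance between two intuitionistic fuzzy sets. It uses the standard normalized Hamming distance that includes the hesitation degree, which is an assumption made for illustration; it is not the partition distance proposed in the paper, which the abstract does not define. The names `hesitation` and `normalized_hamming_distance` are hypothetical.

```python
from typing import Sequence, Tuple

# One element of an intuitionistic fuzzy set: (membership, non_membership).
IFSElement = Tuple[float, float]


def hesitation(mu: float, nu: float) -> float:
    """Return the hesitation degree pi = 1 - mu - nu after validating the pair."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
        raise ValueError("need 0 <= mu, nu <= 1 and mu + nu <= 1")
    return 1.0 - mu - nu


def normalized_hamming_distance(
    a: Sequence[IFSElement], b: Sequence[IFSElement]
) -> float:
    """Normalized Hamming distance between two IFSs over the same universe.

    Illustrative stand-in only, not the distance defined in the paper.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both sets must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a = hesitation(mu_a, nu_a)
        pi_b = hesitation(mu_b, nu_b)
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    # Dividing by 2n keeps the result in [0, 1].
    return total / (2 * len(a))
```

Under this definition, a set compared with itself has distance 0, and a fully accepted element compared with a fully rejected one has distance 1.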
Expensive user-defined functions (UDFs) pose unique challenges for database management systems at query time. This is mostly due to the black-box nature of these functions, the inability to optimize their internals, and the potential inefficiency of common optimization heuristics, e.g., "selection-push-down". Moreover, the increasing diversity of modern scientific applications that depend on DBMSs and, at the same time, make extensive use of expensive UDFs calls for efficient techniques to support these functions. In this paper, we propose the "FunctionGuard" system, which leverages disk-based persistent caching in novel ways to achieve cross-query optimizations for expensive UDFs. The unique features of FunctionGuard include: (1) dynamic extraction of dependencies between the UDFs and the data sources, and identification of the potentially cacheable functions, (2) cache-aware query optimization through newly introduced query operators, (3) proactive cache refreshing that partially migrates the cost of the expensive calls from query time to idle and under-utilized times, and (4) integration with state-of-the-art techniques that generate efficient query plans in the presence of expensive functions. The system is implemented within the PostgreSQL DBMS, and the results show the effectiveness of the proposed algorithms and optimizations.
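The idea of reusing expensive function results across queries can be sketched outside a DBMS as follows. This is a minimal illustration, not FunctionGuard's implementation: the class `PersistentUdfCache`, the `source_version` counter standing in for dependency tracking, and the function `slow_square` are all hypothetical names. The sketch uses SQLite from the Python standard library; passing a file path instead of the default `":memory:"` would keep the cache on disk between runs.

```python
import json
import sqlite3
from typing import Any, Callable


class PersistentUdfCache:
    """Cache UDF results keyed by function name and arguments.

    A cached row is reused only if the data source version it was
    computed against matches the current one (a simple stand-in for
    dependency tracking between functions and data sources).
    """

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS udf_cache ("
            "func TEXT, args TEXT, source_version INTEGER, result TEXT, "
            "PRIMARY KEY (func, args))"
        )
        self.calls = 0  # number of times the real function was executed

    def call(self, func: Callable[..., Any], args: tuple, source_version: int) -> Any:
        key = json.dumps(args)
        row = self._db.execute(
            "SELECT source_version, result FROM udf_cache "
            "WHERE func = ? AND args = ?",
            (func.__name__, key),
        ).fetchone()
        if row is not None and row[0] == source_version:
            return json.loads(row[1])  # cache hit, still valid
        # Cache miss or stale entry: run the expensive function and store it.
        self.calls += 1
        result = func(*args)
        self._db.execute(
            "INSERT OR REPLACE INTO udf_cache VALUES (?, ?, ?, ?)",
            (func.__name__, key, source_version, json.dumps(result)),
        )
        self._db.commit()
        return result


def slow_square(x: int) -> int:
    """Placeholder for an expensive UDF."""
    return x * x
```

Calling `cache.call(slow_square, (3,), source_version=1)` twice executes the function once; raising `source_version` marks the entry stale and forces a recomputation. Proactive refreshing, as described in the abstract, would amount to re-running stale entries during idle time rather than at query time.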