In this paper we propose a lattice-based approach for extracting semantics from datacubes: borders of version spaces for supervised classification, the closed cube lattice to summarize the semantics of datacubes w.r.t. COUNT, SUM, and the covering graph of the quotient cube as a visualization tool for minimal multidimensional associations. To this end, we introduce two novel concepts: the cube transversals and the cube closures over the cube lattice of a categorical database relation. We propose a levelwise merging algorithm for mining minimal cube transversals with a single database scan. We introduce the cube connection, show that it is a Galois connection, and derive a closure operator over the cube lattice. Using cube transversals and closures, we define a new characterization of boundary sets which provide a condensed representation of version spaces used to enhance supervised classification. The algorithm designed for computing such borders improves on the complexity of previous proposals. We also introduce the concept of closed cube lattice and show that it is isomorphic both to the Galois lattice and to the quotient cube w.r.t. COUNT, SUM. Proposed in [16], the quotient cube is a succinct summary of a datacube preserving the Rollup/Drilldown semantics. We show that the quotient cube w.r.t. COUNT, SUM and the closed cube lattice have similar expressive power, but the latter has the smallest possible size. Finally, we focus on the multidimensional association issue and introduce the covering graph of the quotient cube, which provides the user with a visualization tool for minimal multidimensional associations.
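As an illustration of the closure idea mentioned above, here is a minimal Python sketch. It assumes one common reading of a cube closure: the closure of a multidimensional tuple is obtained by merging all rows of the relation that the tuple generalizes, keeping a value where the rows agree and `ALL` elsewhere. The names `ALL`, `generalizes`, `tuple_sum`, `cube_closure` and the toy relation are hypothetical and are not taken from the paper.

```python
ALL = "ALL"  # placeholder meaning "any value" on a dimension (assumed notation)

def generalizes(t, u):
    """True if t is at least as general as u: on every dimension,
    t either holds ALL or the same value as u."""
    return all(a == ALL or a == b for a, b in zip(t, u))

def tuple_sum(t, u):
    """Most specific tuple generalizing both t and u."""
    return tuple(a if a == b else ALL for a, b in zip(t, u))

def cube_closure(t, relation):
    """Merge every row of the relation that t generalizes.
    Returns None when no row matches (standing in for a bottom element)."""
    covered = [u for u in relation if generalizes(t, u)]
    if not covered:
        return None
    closure = covered[0]
    for u in covered[1:]:
        closure = tuple_sum(closure, u)
    return closure

# Toy relation over three categorical dimensions (illustrative data only).
r = [("S1", "P1", "D1"), ("S1", "P1", "D2"), ("S2", "P2", "D1")]
print(cube_closure(("S1", ALL, ALL), r))
print(cube_closure((ALL, "P1", ALL), r))
```

In this toy relation the two queried tuples cover the same rows, so they share one closure and one COUNT value; keeping only closed tuples is the kind of condensation the abstract refers to.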
Constrained multidimensional patterns differ from the well-known frequent patterns from both a conceptual and a logical point of view, because they are provided with a common structure and support various types of constraints. Classical data mining techniques are based on the power set lattice of binary attributes and, even when extended, are not suitable for discovering constrained multidimensional patterns. In this paper we propose a foundation for various multidimensional data mining problems by introducing a new algebraic structure, called the cube lattice, which characterizes the search space to be explored. We take into consideration monotone and/or antimonotone constraints enforced when mining multidimensional patterns. In addition, we propose condensed representations of the constrained cube lattice, which is a convex space. Finally, we emphasize the advantages of the cube lattice over the power set lattice of binary attributes used for multidimensional data mining.
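To make the search-space comparison concrete, the following Python sketch enumerates a cube-lattice-style search space for a toy relation and filters it with a minimum-count constraint, which is antimonotone with respect to specialization. It is a sketch under stated assumptions: each dimension takes its observed values plus a wildcard `ALL`, the bottom element is omitted, and the names `cube_lattice`, `count`, `frequent` and the data are hypothetical rather than the paper's.

```python
from itertools import product

ALL = "ALL"  # wildcard value on a dimension (assumed notation)

def cube_lattice(relation):
    """All tuples built from each dimension's observed values plus ALL
    (the bottom element is left out of this sketch)."""
    domains = [sorted(set(col)) + [ALL] for col in zip(*relation)]
    return list(product(*domains))

def count(t, relation):
    """Number of rows that tuple t generalizes."""
    return sum(
        all(a == ALL or a == b for a, b in zip(t, u)) for u in relation
    )

def frequent(relation, min_count):
    """Tuples satisfying the antimonotone constraint COUNT >= min_count."""
    return [t for t in cube_lattice(relation) if count(t, relation) >= min_count]

# Toy relation: three dimensions with two observed values each.
r = [("S1", "P1", "D1"), ("S1", "P1", "D2"), ("S2", "P2", "D1")]

lattice_size = len(cube_lattice(r))                       # (2+1)^3 candidates
binary_items = sum(len(set(col)) for col in zip(*r))      # one item per value
powerset_size = 2 ** binary_items                         # candidates over binary items
print(lattice_size, powerset_size, len(frequent(r, 2)))
```

Encoding each attribute value as a binary item yields many candidate sets that combine two values of the same dimension and therefore cannot match any row; the multidimensional space excludes them by construction.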
Data mining tools are nowadays becoming more and more popular in the semiconductor manufacturing industry, especially in yield-oriented enhancement techniques. This is because conventional approaches fail to extract hidden relationships between numerous complex process control parameters. In order to highlight correlations between such parameters, we propose in this paper a complete knowledge discovery in databases (KDD) model. The mining heart of the model uses a new method derived from association rules programming, and is based on two concepts: decision correlation rules and contingency vectors.