Maurice A. W. Houtsma scite author profile

Maurice A. W. Houtsma

4 Publications

167 Citation Statements Received

82 Citation Statements Given


Affiliations

University of Twente, IBM Research - Almaden

Publications


Set-oriented mining for association rules in relational databases


We describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. Algorithm SETM uses only simple database primitives, viz., sorting and merge-scan join. Algorithm SETM is simple, fast, and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black box algorithms. The set-oriented nature of Algorithm SETM facilitates the development of extensions.
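As a rough illustration of what "expressed as SQL queries" can look like, the sketch below counts co-occurring item pairs with a single self-join and a GROUP BY, using Python's built-in sqlite3 module. It is not the SETM algorithm from the paper; the table name `sales`, its columns, the function name `frequent_pairs`, and the sample data are all assumptions made for this example.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Return (item_a, item_b, support) for pairs found in >= min_support transactions.

    Hypothetical helper: the schema and names are illustrative, not from the paper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (trans_id INTEGER, item TEXT)")
    # One row per (transaction, distinct item).
    rows = [
        (tid, item)
        for tid, items in enumerate(transactions)
        for item in sorted(set(items))
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    # Self-join on the transaction id; a.item < b.item keeps each pair once.
    query = """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM sales AS a
        JOIN sales AS b
          ON a.trans_id = b.trans_id AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
    """
    result = conn.execute(query, (min_support,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "b"]]
    print(frequent_pairs(baskets, 2))
```

Larger itemsets could be built by repeating the join against the previous round's result, which is the general set-oriented idea the abstract describes; the details of how the paper does this are not reproduced here.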


Parallel hierarchical evaluation of transitive closure queries

Houtsma

Cacace

Ceri


A survey of parallel execution strategies for transitive closure and logic programs

Cacace

Ceri

Houtsma

1993

Distrib Parallel Databases


An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. 
Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods.
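The iterative style of transitive closure evaluation mentioned in the abstract can be sketched in a few lines of Python. This is a minimal sequential, semi-naive version written for illustration only: each round joins just the newly found pairs with the base edges. The function name `transitive_closure` and the edge-list representation are assumptions, and the parallel and fragmentation strategies surveyed in the paper are not shown.

```python
def transitive_closure(edges):
    """Return the set of all (src, dst) pairs reachable through one or more edges.

    Semi-naive iteration: only pairs discovered in the previous round (delta)
    are extended, so no pair is re-derived from older results.
    """
    # Index base edges by source node for the join step.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    closure = set(edges)
    delta = set(edges)
    while delta:
        # Join delta(src, mid) with edges(mid, nxt) to get candidate pairs.
        derived = {
            (src, nxt)
            for src, mid in delta
            for nxt in successors.get(mid, ())
        }
        # Keep only pairs not seen before; stop when nothing new appears.
        delta = derived - closure
        closure |= delta
    return closure


if __name__ == "__main__":
    print(sorted(transitive_closure([(1, 2), (2, 3), (3, 4)])))
```

The loop terminates on cyclic graphs as well, because a round that produces no unseen pair leaves `delta` empty.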


Set-oriented data mining in relational databases

Houtsma

Swami

1995

Data & Knowledge Engineering


