Shantanu Joshi scite author profile

Shantanu Joshi

5 Publications

69 Citation Statements Received

81 Citation Statements Given

Affiliations

Oracle (United States), University of Florida

Publications

A disk-based join with probabilistic guarantees

Jermaine, Dobra, Arumugam, et al. 2005

One of the most common operations in analytic query processing is the application of an aggregate function to the result of a relational join. We describe an algorithm for computing the answer to such a query over large, disk-based input tables. The key innovation of our algorithm is that at all times, it provides an online, statistical estimator for the eventual answer to the query, as well as probabilistic confidence bounds. Thus, a user can monitor the progress of the join throughout its execution and stop the join when satisfied with the estimate's accuracy, or run the algorithm to completion with a total time requirement that is not much longer than other common join algorithms. This contrasts with other online join algorithms, which either do not offer such statistical guarantees or can only offer guarantees so long as the input data can fit into core memory.

Materialized Sample Views for Database Approximation

Joshi, Jermaine, 2006

We consider the problem of creating a sample view of a database table. A sample view is an indexed, materialized view that permits efficient sampling from an arbitrary range query over the view. Such "sample views" are very useful to applications that require random samples from a database: approximate query processing, online aggregation, data mining, and randomized algorithms are a few examples. Our core technical contribution is a new file organization called the ACE Tree that is suitable for organizing and indexing a sample view. One of the most important aspects of the ACE Tree is that it supports online random sampling from the view. That is, at all times, the set of records returned by the ACE Tree constitutes a statistically random sample of the database records satisfying the relational selection predicate over the view. Our paper presents experimental results that demonstrate the utility of the ACE Tree.
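The "online random sampling" property in this abstract can be illustrated with a naive baseline. The sketch below is not the ACE Tree; it only shows the guarantee being described: every prefix of the returned stream is a uniform random sample (without replacement) of the records matching a range predicate. The function name `online_range_sample` and its parameters are hypothetical, chosen for illustration.

```python
import random


def online_range_sample(records, key, low, high, rng=None):
    """Yield records with low <= key(record) <= high in uniformly random order.

    Naive baseline, NOT the ACE Tree: it scans every record up front and
    shuffles the matches, so any prefix of the output is a uniform random
    sample without replacement of the matching records.
    """
    rng = rng or random.Random()
    matching = [r for r in records if low <= key(r) <= high]
    rng.shuffle(matching)  # uniform random permutation of the matches
    yield from matching


if __name__ == "__main__":
    # Hypothetical table: 20 rows with a numeric "price" column.
    table = [{"id": i, "price": i * 10} for i in range(20)]
    stream = online_range_sample(table, lambda r: r["price"], 50, 120)
    first_three = [next(stream) for _ in range(3)]
    print(first_three)  # a random 3-record sample of the matching rows
```

Because this baseline performs a full scan before returning anything, it does not reflect the efficiency the abstract attributes to an indexed sample view; it only makes the sampling guarantee concrete.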

Sampling-based estimators for subset-based queries

Joshi, Jermaine, 2008, The VLDB Journal

Robust Stratified Sampling Plans for Low Selectivity Queries

Joshi, Jermaine, 2008

The Sort-Merge-Shrink join

Jermaine, Dobra, Arumugam, et al. 2006, ACM Trans. Database Syst.

One of the most common operations in analytic query processing is the application of an aggregate function to the result of a relational join. We describe an algorithm called the Sort-Merge-Shrink (SMS) Join for computing the answer to such a query over large, disk-based input tables. The key innovation of the SMS join is that if the input data are clustered in a statistically random fashion on disk, then at all times, the join provides an online, statistical estimator for the eventual answer to the query as well as probabilistic confidence bounds. Thus, a user can monitor the progress of the join throughout its execution and stop the join when satisfied with the estimate's accuracy or run the algorithm to completion with a total time requirement that is not much longer than that of other common join algorithms. This contrasts with other online join algorithms, which either do not offer such statistical guarantees or can only offer guarantees so long as the input data can fit into main memory.
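The idea of an online estimator with confidence bounds can be sketched for the simplest case: a SUM over a single table whose records arrive in random order. This is not the SMS join itself, which handles joins over disk-based tables; it is a minimal illustration that assumes a random arrival order (mirroring the abstract's random-clustering condition) and uses a normal approximation for the bounds. The name `online_sum_estimates` and the parameter `z` are hypothetical.

```python
import math
import random


def online_sum_estimates(values, z=1.96):
    """Yield (n_seen, estimate, half_width) after each record is read.

    Assumes `values` is in uniformly random order, so the first n records
    form a simple random sample without replacement of all N records.
    The bounds use a normal approximation (z=1.96 for roughly 95%).
    """
    total_n = len(values)
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's update)
    for n, x in enumerate(values, start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        estimate = total_n * mean  # scale the sample mean up to the full table
        if n > 1:
            var = m2 / (n - 1)
            fpc = 1.0 - n / total_n  # finite population correction
            half_width = z * total_n * math.sqrt(var / n * fpc)
        else:
            half_width = math.inf  # no variance estimate from one record
        yield n, estimate, half_width


if __name__ == "__main__":
    data = [float(i) for i in range(1, 1001)]
    random.shuffle(data)
    for n, est, hw in online_sum_estimates(data):
        if n in (10, 100, 1000):
            print(f"after {n} records: {est:.1f} +/- {hw:.1f}")
```

A user could stop consuming the generator once `half_width` is small enough. If the scan runs to completion, the finite population correction drives the interval width to zero and the estimate equals the exact sum, up to floating-point rounding.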

