Paul Breimyer scite author profile

Paul Breimyer

3 Publications

14 Citation Statements Received

90 Citation Statements Given

Affiliations

MIT Lincoln Laboratory, North Carolina State University, Oak Ridge National Laboratory

Publications

Incremental all pairs similarity search for varying similarity thresholds

Awekar

Samatova

Breimyer

2009

All Pairs Similarity Search (APSS) is a ubiquitous problem in many data mining applications and involves finding all pairs of records with similarity scores above a specified threshold. In this paper, we introduce the problem of Incremental All Pairs Similarity Search (IAPSS), where APSS is performed multiple times over the same dataset by varying the similarity threshold. To the best of our knowledge, this is the first work that addresses the IAPSS problem. All existing solutions for APSS perform redundant computations by invoking APSS independently for each threshold value. In contrast, our solution to the IAPSS problem avoids redundant computations by storing the history of previous APSS invocations and using index splitting. While offering obvious benefits, the computation and I/O intensive nature of the IAPSS solution raises two key research challenges: (1) to develop efficient I/O techniques to manage computation history and (2) to efficiently identify and prune redundant computations. We address these challenges through the proposed (a) history binning technique that clusters record pairs based on similarity values and performs I/O during the similarity computation, and (b) splitting of inverted index that maps each dimension to a list of records that have a non-zero projection along that dimension. As a result, we evaluate the effectiveness of our techniques by demonstrating speed-ups in the order of 2X to over 10^5X over the state-of-the-art APSS algorithm for four real-world large-scale datasets.

Enabling distributed command and control with standards-based geospatial collaboration

Ciaccio

Pullen

Breimyer

2011

An outlook into ultra-scale visualization of large-scale biological data

Samatova

Breimyer

Hendrix

et al. 2008

As bioinformatics has evolved from a reductionistic approach to a complementary multi-scale integrative approach, new challenges in ultra-scale visualization have arisen. Even though visualization is a critical component to large-scale biological data analysis, the ultra-scale nature of systems biology has given rise to novel problems in visualization that are not addressed by existing methods. Visualization is a rich and actively researched domain, and there are many open research questions pertaining to the increasing demands of visualization in bioinformatics. In this paper, we present several broadly important ultra-scale visualization challenges and discuss specific examples of ultra-scale applications in systems biology.
