Dennis Wegener scite author profile

The enormous growth of data in a variety of applications has increased the need for high-performance data mining in distributed environments. However, standard data mining toolkits do not by themselves support the use of computing clusters. The success of MapReduce for analyzing large data has raised general interest in applying this model to other data-intensive applications. Unfortunately, current research has not led to an integration of GUI-based data mining toolkits with MapReduce systems built on distributed file systems. This paper defines novel principles for modeling and designing the user interface, the storage model, and the computational model necessary for the integration of such systems. Additionally, it introduces a novel system architecture for interactive, GUI-based data mining of large data on MapReduce clusters that overcomes the limitations of data mining toolkits. As an empirical demonstration, we show an implementation based on Weka and Hadoop.

GridR: An R-Based Grid-Enabled Tool for Data Analysis in ACGT Clinico-Genomics Trials

Wegener, Sengstag, Sfakianakis et al. 2007

The ACGT project in retrospect: Lessons learned and future outlook

Bucur, Rüping, Sengstag et al. 2011

Procedia Computer Science

GridR: An R-based tool for scientific data analysis in grid environments

Wegener, Sengstag, Sfakianakis et al. 2009

Future Generation Computer Systems

Grid-enabling data mining applications with DataMiningGrid: An architectural perspective

Toolkit-Based High-Performance Data Mining of Large Data on MapReduce Clusters
