Abstract-For the last three decades, relational databases have been used in many kinds of organizations, such as education, health, and business, and in many other applications. Traditional databases perform very well and are designed to handle structured data, with the ACID (Atomicity, Consistency, Isolation, Durability) properties ensuring data integrity. In the current era, organizations store more data for decision making besides structured data, e.g. videos, images, and blogs. Similarly, social media and scientific applications generate large amounts of semi-structured data of varied nature. Relational databases cannot properly process and efficiently manage such large amounts of data. To overcome this problem, another paradigm, NoSQL databases, was introduced to manage and process massive amounts of unstructured data efficiently. NoSQL databases are divided into four categories, and each category is used according to the nature and needs of the specific problem. In this paper we compare the Oracle relational database and a NoSQL graph database using optimized queries and physical database tuning techniques. The comparison is twofold. First, we compare various kinds of queries, such as simple queries, apply database tuning techniques to the Oracle relational database, such as sub databases, and run these queries in our chosen environments. Second, we perform predictive analysis on the results obtained from our experiments.
Abstract-This paper presents a novel high-speed clustering scheme for high-dimensional data streams. Data stream clustering has gained importance in different applications, for example, network monitoring, intrusion detection, and real-time sensing. Clustering high-dimensional stream data is inherently complex: both the evolving nature of the stream and its high dimensionality make the task non-trivial. To tackle this problem, clustering is performed on a projected subspace of the high-dimensional space and on a limited window of data per unit of time. We propose a High Speed and Dimensions data stream clustering scheme (HSDStream), which employs exponential moving averages to reduce memory usage and speed up the processing of the projected subspace data stream. It works in three steps: i) initialization, ii) real-time maintenance of core and outlier micro-clusters, and iii) on-demand offline generation of the final clusters. The proposed algorithm is tested against high-dimensional density-based projected clustering (HDDStream) for cluster purity, memory usage, and cluster sensitivity. Experimental results are obtained on the corrected KDD intrusion detection dataset. These results show that HSDStream outperforms HDDStream in all performance metrics, especially memory usage and processing speed.
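The role of exponential moving averages in keeping a micro-cluster summary small can be illustrated with a minimal sketch. This is not the HSDStream algorithm itself, whose details the abstract does not give; the class name `EmaMicroCluster`, the smoothing factor `alpha`, and the variance threshold `max_var` are all hypothetical names chosen for illustration. The sketch only shows that a per-dimension mean and variance can be updated from each arriving point in constant memory, and that low-variance dimensions can then be picked as a projected subspace.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EmaMicroCluster:
    """Hypothetical fixed-size summary: per-dimension EMA mean and variance."""
    mean: List[float]
    var: List[float]
    weight: float = 1.0  # decayed count of absorbed points (assumption)

    def update(self, point: List[float], alpha: float = 0.1) -> None:
        """Absorb one point; older points fade by a factor (1 - alpha)."""
        for i, x in enumerate(point):
            delta = x - self.mean[i]
            # Standard incremental EMA updates for mean and variance.
            self.mean[i] += alpha * delta
            self.var[i] = (1.0 - alpha) * (self.var[i] + alpha * delta * delta)
        self.weight = (1.0 - alpha) * self.weight + 1.0

    def preferred_dimensions(self, max_var: float) -> List[int]:
        """Dimensions whose EMA variance is at most max_var (assumed criterion)."""
        return [i for i, v in enumerate(self.var) if v <= max_var]


# Usage: a two-dimensional cluster that varies only along dimension 0.
mc = EmaMicroCluster(mean=[0.0, 0.0], var=[0.0, 0.0])
mc.update([1.0, 0.0], alpha=0.5)
subspace = mc.preferred_dimensions(max_var=0.1)
```

Because the summary stores only two numbers per dimension plus a weight, its size does not grow with the number of points seen, which is the kind of memory saving the abstract attributes to exponential moving averages.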