2018
DOI: 10.1016/j.bdr.2018.04.004

Big Data Systems Meet Machine Learning Challenges: Towards Big Data Science as a Service

Abstract: Recently, we have been witnessing huge advancements in the scale of data we routinely generate and collect in nearly everything we do, as well as in our ability to exploit modern technologies to process, analyze and understand this data. The intersection of these trends is what is now called Big Data Science. Cloud computing represents a practical and cost-effective solution for supporting Big Data storage and processing, and for sophisticated analytics applications. We analyze in detail the building …

Cited by 117 publications
(70 citation statements)
references
References 43 publications
“…There is a substantial difference between the recall percentages of two-class SVM and two-class locally deep SVM (89.6% and 99.3% respectively). Two-class SVM's capability to detect false alarms has not been impressive, since its FAR is reported to be as high as 16.5%. On the other hand, locally deep SVM has been comparatively better at reducing false alarms due to the application of a sigmoid kernel.…”
Section: Logistic Regression (mentioning)
confidence: 99%
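
The recall and false alarm rate (FAR) figures quoted in this statement are standard confusion-matrix ratios. A minimal sketch of how such figures are typically computed is shown here, assuming the usual definitions recall = TP / (TP + FN) and FAR = FP / (FP + TN); the citing paper's exact definitions are not given in the excerpt. The function names and the counts are hypothetical and chosen only for illustration; they are not the cited study's data.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual attacks that the classifier flagged."""
    return tp / (tp + fn)


def false_alarm_rate(fp: int, tn: int) -> float:
    """Fraction of normal traffic that was wrongly flagged as an attack."""
    return fp / (fp + tn)


# Hypothetical confusion-matrix counts (illustration only, not from the paper).
tp, fn = 90, 10   # attacks detected / attacks missed
fp, tn = 20, 80   # normal records flagged / normal records passed

print(f"recall = {recall(tp, fn):.1%}")
print(f"FAR    = {false_alarm_rate(fp, tn):.1%}")
```

A classifier can score well on one ratio and poorly on the other, which is why the statement reports both when comparing the two SVM variants.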
“…Performance and scalability are the two major considerations for conducting a network intrusion detection study. Big data processing platforms like Pig [13], Spark machine learning [14] and Azure machine learning [15] are the preferred choices in the modern scenario given their ability to uphold memory requirements and implementation essentials [16]. Going by these considerations, it is imperative to introduce radical advancements to intrusion detection infrastructure.…”
Section: Introduction (mentioning)
confidence: 99%
“…The concept of big data has generated ever-increasing pressure for scalable data processing solutions, leading to the recent implementation of several data processing and management systems. A summary of the major features of big data analysis frameworks is presented in Table 1 [29]. The increasing data analysis requirements in virtually all application areas have necessitated the design and building of a new set of big data science tools which can seamlessly analyze huge data volumes, extract vital information, and establish important patterns and knowledge from such datasets [30].…”
Section: Big Data Analysis Framework (mentioning)
confidence: 99%
“…In this literature review, we would like to focus on data science frameworks as follows: [17] captured data science, data source, data scale, data story, and data scientists; [18] explained data science expertise, Venn diagrams, goals and deliverables, process, skills and education, data analysts and data engineers, and the data scientist's toolbox; [19] described the definition of data science, a comparison of data science with data analysis, the process of data science, tools, skills, scope, advantages, and how data science differs from big data; [20] explained data insight, data product, the skill set requirement, analytics and machine learning; [21] emphasized the life cycle of data science and the data scientist profile; [22] presented structured and unstructured data, business intelligence and data science, life cycle, and model planning and building tools; [23] reviewed statistics and associated data science methods in bioimage informatics; [24] emphasized software for supporting big data science for data scientists and big data analytics frameworks based on clouds;…”
Section: B Data Science (mentioning)
confidence: 99%