Proceedings of the International Conference on Software and System Processes 2020
DOI: 10.1145/3379177.3388909

From Ad-Hoc Data Analytics to DataOps

Abstract: The collection of high-quality data provides a key competitive advantage to companies in their decision-making process. It helps them understand customer behavior and enables the use and deployment of new technologies based on machine learning. However, the process of collecting the data, then cleaning and processing it for use by data scientists and applications, is often manual, non-optimized, and error-prone. This increases the time the data takes to deliver value for the business. To reduce this time compa…

Citations: cited by 56 publications (19 citation statements)
References: 10 publications
“…Furthermore, the proposed PDS structure should comply with DataSecOps: a methodology for improving data quality and analysis within the lifecycle of data use. This methodology is oriented and integrated for data collection and analysis [16] [17]. DataSecOps includes security internalization in the DataOps data lifecycle and applies security policies and governance to the stages of data collection, access, and analysis.…”
Section: Proposal of the MyData Platform and the PDS Model
Citation type: mentioning
confidence: 99%
“…Some bad practices in data manipulation can end up in a misleading interpretation of the achieved results. The use of tools that allow the selection of the pertinent steps in an ad-hoc designed pipeline helps to reduce programming errors [26]. The Pipeline object of the scikit-learn module allows combining several transformers and an estimator to create a combined estimator [25].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
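
The pipeline idea in the statement above (several transformers chained in front of one estimator) can be sketched in plain Python. This is only a toy illustration under stated assumptions: `Standardize`, `Clip`, `ThresholdClassifier`, and `ToyPipeline` are hypothetical names invented here, not scikit-learn's API and not code from the cited papers. In practice the scikit-learn `Pipeline` object mentioned in the statement would fill this role.

```python
from statistics import mean, pstdev


class Standardize:
    """Hypothetical transformer: rescale values to zero mean, unit variance."""

    def fit(self, xs):
        self.mu = mean(xs)
        self.sigma = pstdev(xs) or 1.0  # avoid division by zero on constant input
        return self

    def transform(self, xs):
        return [(x - self.mu) / self.sigma for x in xs]


class Clip:
    """Hypothetical transformer: clamp values into [lo, hi]; nothing to learn."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def fit(self, xs):
        return self

    def transform(self, xs):
        return [min(max(x, self.lo), self.hi) for x in xs]


class ThresholdClassifier:
    """Hypothetical estimator: predict 1 above the midpoint of the class means."""

    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (mean(pos) + mean(neg)) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]


class ToyPipeline:
    """Chain transformers, then an estimator, behind one fit/predict interface."""

    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator

    def fit(self, xs, ys):
        # Each transformer is fitted on the output of the previous one.
        for t in self.transformers:
            xs = t.fit(xs).transform(xs)
        self.estimator.fit(xs, ys)
        return self

    def predict(self, xs):
        # At prediction time the already-fitted steps are replayed in order.
        for t in self.transformers:
            xs = t.transform(xs)
        return self.estimator.predict(xs)


# Illustrative, made-up data: one numeric feature and a binary label.
train_x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
train_y = [0, 0, 0, 1, 1, 1]
model = ToyPipeline([Standardize(), Clip(-3.0, 3.0)], ThresholdClassifier())
model.fit(train_x, train_y)
predictions = model.predict([2.5, 10.5])
```

Because every step is fitted and applied in a fixed order inside one object, the same preprocessing is guaranteed at training and prediction time, which is the kind of programming error the statement says such tools help reduce.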
“…According to reference [43], DataOps can be defined as "an approach that accelerates the delivery of high-quality results by automation and orchestration of data life cycle stages". It brings speed and agility to the end-to-end process of data pipelines, from collection to delivery.…”
Section: DataOps
Citation type: mentioning
confidence: 99%
“…Much more recently, new ways of working have begun to emerge and gain traction for companies to leverage on what these new technologies aim to deliver. Practices such as DevOps [118] [119] [120], DataOps [43] and MLOps [121] [44] are being introduced with the intention of advancing product automation as well as quality in terms of development, data and ML operations. Penners and Dyck [122] propose DevOps as a collaboration of teams working in development and IT operations within a software-intensive organization to deliver faster software changes [118].…”
Section: Digitalization: New Technologies and Ways of Working
Citation type: mentioning
confidence: 99%