The United Nations’ Sustainable Development Goals (SDGs) set out to improve the quality of life of people in developed, emerging, and developing countries by covering social and economic aspects, with a focus on environmental sustainability. At the same time, data-driven technologies influence our lives in all areas and have caused fundamental economic and societal changes. This study presents a comprehensive literature review on how data-driven approaches have enabled or inhibited the successful achievement of the 17 SDGs to date. Our findings show that data-driven analytics and tools contribute to achieving the 17 SDGs, e.g., by making information more reliable, supporting better-informed decision-making, implementing data-based policies, prioritizing actions, and optimizing the allocation of resources. Based on a qualitative content analysis, results were aggregated into a conceptual framework, including the following categories: (1) uses of data-driven methods (e.g., monitoring, measurement, mapping or modeling, forecasting, risk assessment, and planning purposes), (2) resulting positive effects, (3) arising challenges, and (4) recommendations for action to overcome these challenges. Despite positive effects and versatile applications, problems such as data gaps, data biases, high energy consumption of computational resources, ethical concerns, privacy, ownership, and security issues stand in the way of achieving the 17 SDGs.
The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model-development-related issues. These issues need to be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues for achieving robustness. Furthermore, it also emphasizes the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework supports model improvement and reusability by minimizing robustness issues.
The implementation of robust, stable, and user-centered data analytics and machine learning models is confronted by numerous challenges in production and manufacturing. Therefore, a systematic approach is required to develop, evaluate, and deploy such models. The data-driven knowledge discovery framework provides an orderly partition of the data-mining processes to ensure the practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model-development-related issues. These issues should be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework; in our case, this takes the form of the cross-industry standard process for data mining (CRISP-DM). This framework is designed to ensure active cooperation between different phases to adequately address data- and model-related issues. In this paper, we review several extensions of CRISP-DM models and various data-robustness- and model-robustness-related problems in machine learning, a field that currently lacks proper cooperation between data experts and business experts because of the limitations of data-driven knowledge discovery models.