Data Science is an emerging field of science, which requires a multidisciplinary approach and should be built with a strong link to emerging Big Data and data driven technologies, and consequently needs rethinking and redesign of both traditional educational models and existing courses. The education and training of Data Scientists currently lacks a commonly accepted, harmonized instructional model that reflects by design the whole lifecycle of data handling in modern, data driven research and the digital economy. This paper presents the EDISON Data Science Framework (EDSF) that is intended to create a foundation for the Data Science profession definition. The EDSF includes the following core components: Data Science Competence Framework (CF-DS), Data Science Body of Knowledge (DS-BoK), Data Science Model Curriculum (MC-DS), and Data Science Professional profiles (DSP profiles). The MC-DS is built based on CF-DS and DS-BoK, where Learning Outcomes are defined based on CF-DS competences and Learning Units are mapped to Knowledge Units in DS-BoK. In its own turn, Learning Units are defined based on the ACM Classification of Computer Science (CCS2012) and reflect typical courses naming used by universities in their current programmes. The paper provides example how the proposed EDSF can be used for designing effective Data Science curricula and reports the experience of implementing EDSF by the Champion Universities that cooperate with the EDISON project.
Data Science is an emerging field of science, which requires a multidisciplinary approach and is based on the Big Data and data intensive technologies that both provide a basis for effective use of the data driven research and economy models. Modern data driven research and industry require new types of specialists that are capable to support all stages of the data lifecycle from data production and input to data processing and actionable results delivery, visualisation and reporting, which can be jointly defined as the Data Science professions family. The education and training of Data Scientists currently lacks a commonly accepted, harmonized instructional model that reflects all multidisciplinary knowledge and competences that are required from the Data Science practitioners in modern, data driven research and the digital economy. The educational model and approach should also solve different aspects of the future professionals that includes both theoretical knowledge and practical skills that must be supported by corresponding education infrastructure and educational labs environment. In modern conditions with the fast technology change and strong skills demand, the Data Science education and training should be customizable and delivered in multiple form, also providing sufficient data labs facilities for practical training. This paper discussed both aspects: building customizable Data Science curriculum for different types of learners and proposing a hybrid model for virtual labs that can combine local university facility and use cloud based Big Data and Data analytics facilities and services on demand. The proposed approach is based on using the EDISON Data Science Framework (EDSF) developed in the EU funded Project EDISON and CYCLONE cloud automation systems being developed in another EU funded project CYCLONE.
Real-time monitoring of multiphase fluid flows with distributed fibre optic sensing has the potential to play a major role in industrial flow measurement applications. One such application is the optimization of hydrocarbon production to maximize short-term income, and prolong the operational lifetime of production wells and the reservoir. While the measurement technology itself is well understood and developed, a key remaining challenge is the establishment of robust data analysis tools that are capable of providing real-time conversion of enormous data quantities into actionable process indicators. This paper provides a comprehensive technical review of the data analysis techniques for distributed fibre optic technologies, with a particular focus on characterizing fluid flow in pipes. The review encompasses classical methods, such as the speed of sound estimation and Joule-Thomson coefficient, as well as their data-driven machine learning counterparts, such as Convolutional Neural Network (CNN), Support Vector Machine (SVM), and Ensemble Kalman Filter (EnKF) algorithms. The study aims to help end-users establish reliable, robust, and accurate solutions that can be deployed in a timely and effective way, and pave the wave for future developments in the field.
The digital revolution made available vast amounts of data both in industry and in the research landscape. The ability to manipulate and extract knowledge and value from this data represents a new profession called the Data Scientist: expected to be the most visible job in future years.The EDISON project has been established in order to support Universities, Research Centers, Industry and Research Infrastructure organisations to cope with the potential shortfall of Data Scientists, to define the framework of competences as well as the body of knowledge for this profession.In this paper the EDISON team describes how it intends to nurture the profession of Data Scientist to cope with the expected increase in demand. The strategy proposed is based on both the analysis of the demand side (Industries, Research Centers and Research Infrastructure organisations) and the supply side (Universities and training centers) bridging between the providers and employers by cooperating on the establishment of a Competence Framework and a Body of Knowledge for the Data Scientist Professional. The project will exploit piloting initiatives in cooperation with pioneer universities and also involve external experts as evangelists.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.