Data Science is an emerging field of science that requires a multidisciplinary approach and should be built with a strong link to emerging Big Data and data-driven technologies; consequently, it requires rethinking and redesigning both traditional educational models and existing courses. The education and training of Data Scientists currently lacks a commonly accepted, harmonized instructional model that reflects by design the whole lifecycle of data handling in modern, data-driven research and the digital economy. This paper presents the EDISON Data Science Framework (EDSF), which is intended to create a foundation for the definition of the Data Science profession. The EDSF includes the following core components: Data Science Competence Framework (CF-DS), Data Science Body of Knowledge (DS-BoK), Data Science Model Curriculum (MC-DS), and Data Science Professional profiles (DSP profiles). The MC-DS is built on CF-DS and DS-BoK: Learning Outcomes are defined based on CF-DS competences, and Learning Units are mapped to Knowledge Units in DS-BoK. In turn, Learning Units are defined based on the ACM Computing Classification System (CCS2012) and reflect the course names typically used by universities in their current programmes. The paper provides an example of how the proposed EDSF can be used for designing effective Data Science curricula and reports the experience of implementing EDSF at the Champion Universities that cooperate with the EDISON project.
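The relationship between the EDSF components described in the abstract above can be pictured as a small data model. The following is only an illustrative sketch, not the EDSF's own schema: all class names, field names, codes (such as `COMP-X1` and `KU-X1`), and the sample course are hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Competence:
    """A competence as it might be listed in CF-DS (hypothetical fields)."""
    code: str
    description: str


@dataclass(frozen=True)
class KnowledgeUnit:
    """A Knowledge Unit as it might be listed in DS-BoK (hypothetical fields)."""
    code: str
    title: str


@dataclass
class LearningOutcome:
    """A Learning Outcome defined on top of one CF-DS competence."""
    statement: str
    competence: Competence


@dataclass
class LearningUnit:
    """A Learning Unit named after a CCS2012 topic and mapped to Knowledge Units."""
    title: str
    ccs2012_topic: str
    knowledge_units: List[KnowledgeUnit] = field(default_factory=list)


@dataclass
class Course:
    """A curriculum element combining Learning Outcomes and Learning Units."""
    name: str
    outcomes: List[LearningOutcome]
    units: List[LearningUnit]

    def covered_competences(self) -> Set[str]:
        # Competence codes reached through the course's Learning Outcomes.
        return {o.competence.code for o in self.outcomes}

    def covered_knowledge_units(self) -> Set[str]:
        # Knowledge Unit codes reached through the course's Learning Units.
        return {ku.code for lu in self.units for ku in lu.knowledge_units}


# Placeholder data; the codes and titles are invented for illustration only.
comp = Competence("COMP-X1", "Placeholder analytics competence")
ku1 = KnowledgeUnit("KU-X1", "Placeholder knowledge unit 1")
ku2 = KnowledgeUnit("KU-X2", "Placeholder knowledge unit 2")

course = Course(
    name="Sample course",
    outcomes=[LearningOutcome("Apply a placeholder method", comp)],
    units=[LearningUnit("Sample unit", "Placeholder CCS2012 topic", [ku1, ku2])],
)

print(sorted(course.covered_competences()))
print(sorted(course.covered_knowledge_units()))
```

With a model of this kind, a curriculum designer could check which competences and Knowledge Units a set of courses covers and where gaps remain, which is one way to read the "customizable curriculum" goal stated in these abstracts.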
Data Science is an emerging field of science that requires a multidisciplinary approach and is based on Big Data and data-intensive technologies, which together provide a basis for the effective use of data-driven research and economy models. Modern data-driven research and industry require new types of specialists who are capable of supporting all stages of the data lifecycle, from data production and input to data processing and the delivery of actionable results, visualisation and reporting; together these can be defined as the Data Science professions family. The education and training of Data Scientists currently lacks a commonly accepted, harmonized instructional model that reflects all the multidisciplinary knowledge and competences required of Data Science practitioners in modern, data-driven research and the digital economy. The educational model and approach should also address different aspects of training future professionals, including both theoretical knowledge and practical skills, which must be supported by a corresponding education infrastructure and educational lab environment. Under modern conditions of fast technology change and strong demand for skills, Data Science education and training should be customizable and delivered in multiple forms, and should provide sufficient data lab facilities for practical training. This paper discusses both aspects: building a customizable Data Science curriculum for different types of learners, and proposing a hybrid model for virtual labs that can combine local university facilities with on-demand use of cloud-based Big Data and data analytics facilities and services. The proposed approach is based on the EDISON Data Science Framework (EDSF), developed in the EU-funded EDISON project, and on the CYCLONE cloud automation systems being developed in another EU-funded project, CYCLONE.
Infrastructure as a Service (IaaS) is one of the provisioning models for Clouds as defined in the NIST Clouds definition. Although widely used, current IaaS implementations and solutions do not have a common, well-defined architecture model. The paper attempts to define a generic architecture for IaaS based on the authors' current research on a novel architectural framework for Infrastructure Services On-Demand (ISOD) provisioning that allows combined provisioning of network and IT resources. The paper proposes the Composable Services Architecture (CSA), which extends the SOA-based Enterprise Service Bus (ESB) architecture for dynamically configurable virtualised services. The proposed CSA includes the Services Delivery Framework (CSA SDF) as another important component, which defines the services provisioning workflow and the supporting infrastructure for lifecycle management of provisioned services. The CSA SDF extends existing lifecycle management frameworks with additional stages such as "Registration and Synchronisation" and "Provisioning Session Binding", which target scenarios such as recovery or re-planning/migration of provisioned services and provide the mechanisms necessary for consistent security services provisioning as an important component of the infrastructure provisioned on demand. The paper also describes GEMBus (GEANT Multidomain Bus), which is considered as a CSA middleware platform. The presented architecture is the result of an ongoing cooperative effort between two EU projects, GEANT3 JRA3 Composable Services and GEYSERS.
This paper presents results of the ongoing development of CYCLONE as a platform for scientific applications in a heterogeneous multi-cloud/multi-provider environment. The paper explains the general use case that motivates the CYCLONE architecture and provides a detailed analysis of the bioinformatics use cases that define specific requirements for the CYCLONE infrastructure components. Special attention is given to the federated access control and security infrastructure, which must provide consistent security and data protection for the distributed bioinformatics data processing infrastructure and for distributed, cross-organisational collaborating teams of scientists. The paper describes the implementation of selected use cases using the SlipStream cloud automation and management platform, with an example application recipe. The paper also addresses requirements for providing a dedicated intercloud network infrastructure, which is currently not addressed by cloud providers (either public or scientific/community).