In Data Science we are concerned with the integration of relevant sciences in observed and empirical contexts. This results in the unification of analytical methodologies, and of observed and empirical data contexts. Given the dynamic nature of this convergence, we describe the origins and many evolutions of the Data Science theme. This article covers the following: the rapidly growing provision of post-graduate university courses in Data Science; a preliminary study of employability requirements; and how eminent past work in the social sciences and other areas, certainly mathematics, can be of immediate and direct relevance and benefit for innovative methodology, and for facing and addressing the ethical aspect of Big Data analytics, relating to data aggregation and scale effects. The direct and indirect outcomes and consequences of Data Science include decision support and policy making, with both qualitative and quantitative outcomes. For these reasons, we note the importance of how Data Science builds collaboratively on other domains, potentially with innovative methodologies and practice. Further sections point towards some of the most significant current research issues.
Keywords: Big Data training and learning; company and business requirements; ethics; impact; decision support; data engineering; open data; smart homes; smart cities; IoT
Data Science as the Convergence and Bridging of Disciplines
At issue in [23] are parallels between astronomy and Earth science data, methodology transfer, and metadata and ontologies, which are characterized as crucial. The convergence or bridging of disciplines must address "non-homogeneous observables, and varied spatial, temporal coverage at different resolutions". Then, given computational support, "it is the complexity more than the data volume that proves to be a bigger challenge". Further benefits of this Data Science convergence are here termed tractability and reproducibility. Section 2 of [23] discusses the complexity relating to resolution and distributions. In [26], this is also characterized in terms of data encoding. Much work now emphasizes the importance of p-adic data encoding (binary or ternary when p = 2 or 3), compared with real-valued encoding (m-adic, especially when m = 10).
The convergence and bridging of disciplines are emphasized in [23], as follows. "Methodology transfer can almost never be unidirectional. Diverse fields grow by learning tricks employed by other disciplines. The important thing is to abstract data - described by meaningful metadata - and the metadata in turn connected by a good ontology." Data Science is further described as follows: "We have described here a few techniques from astroinformatics that are finding use in geoinformatics. There would be many from earth science that space science would do well to emulate. Even other disciplines like bioinformatics provide ample opportunities for methodology transfer and collaboration. With growing data volum...