Purpose – The purpose of this paper is to identify and describe the most prominent research areas connected with “Big Data” and propose a thorough definition of the term. Design/methodology/approach – The authors have analysed a conspicuous corpus of industry and academia articles linked with Big Data to find commonalities among the topics they treated. The authors have also compiled a survey of existing definitions with a view of generating a more solid one that encompasses most of the work happening in the field. Findings – The main themes of Big Data are: information, technology, methods and impact. The authors propose a new definition for the term that reads as follows: “Big Data is the Information asset characterized by such a High Volume, Velocity and Variety to require specific Technology and Analytical Methods for its transformation into Value.” Practical implications – The formal definition that is proposed can enable a more coherent development of the concept of Big Data, as it solely relies on the essential strands of current state-of-the-art and is coherent with the most popular definitions currently used. Originality/value – This is among the first structured attempts of building a convincing definition of Big Data. It also contains an original exploration of the topic in connection with library management.
The rapid expansion of Big Data Analytics is forcing companies to rethink their Human Resource (HR) needs. However, at the same time, it is unclear which types of job roles and skills constitute this area. To this end, this study pursues to drive clarity across the heterogeneous nature of skills required in Big Data professions, by analyzing a large amount of real-world job posts published online. More precisely we: 1) identify four Big Data 'job families'; 2) recognize nine homogeneous groups of Big Data skills (skill sets) that are being demanded by companies; 3) characterize each job family with the appropriate level of competence required within each Big Data skill set. We propose a novel, semi-automated, fully replicable, analytical methodology based on a combination of machine learning algorithms and expert judgement. Our analysis leverages a significant amount of online job posts, obtained through web scraping, to generate an intelligible classification of job roles and skill sets. The results can support business leaders and HR managers in establishing clear strategies for the acquisition and the development of the right skills needed to leverage Big Data at best. Moreover, the structured classification of job families and skill sets will help establish a common dictionary to be used by HR recruiters and education providers, so that supply and demand can more effectively meet in the job marketplace.
As the Industrial Internet of Things (IIoT a ) grows, systems are increasingly being monitored by arrays of sensors returning time-series data at ever-increasing 'volume, velocity and variety' b (i.e. Industrial Big Data c ). An obvious use for these data is real-time systems condition monitoring and prognostic time to failure analysis (remaining useful life, RUL). (e.g. See white papers by Senseye.io, Prognostics -The Future of Condition Monitoring d , and output of the NASA Prognostics Center of Excellence (PCoE e ).) However, as noted by Agrawal and Choudhary f 'Our ability to collect "big data" has greatly surpassed our capability to analyze it, underscoring the emergence of the fourth paradigm of science, which is data-driven discovery.' In order to fully utilize the potential of Industrial Big Data we need data-driven techniques that operate at scales that process models cannot. Here we present a prototype technique for data-driven anomaly detection to operate at industrial scale. The method generalizes to application with almost any multivariate dataset based on independent ordinations of repeated (bootstrapped) partitions of the dataset and inspection of the joint distribution of ordinal distances. a
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.