IntroductionInformation extraction (IE) process extracts useful structured information from the unstructured data in the form of entities, relations, objects, events and many other types. The extracted information from unstructured data is used to prepare data for analysis. Therefore, the efficient and accurate transformation of unstructured data in the IE process improves the data analysis. Numerous techniques have been introduced for different data types i.e. text, image, audio, and video.The advancement in technology promoted the rapid growth of data volume in recent years. The volume, variety (structured, unstructured, and semi-structured data) and velocity of big data have also changed the paradigm of computational capabilities of the systems. IBM estimated that more than 2.5 quintillion bytes of data are generated every Abstract Process of information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data arise new challenges for IE techniques with the rapid growth of multifaceted also called as multidimensional unstructured data. Traditional IE systems are inefficient to deal with this huge deluge of unstructured big data. The volume and variety of big data demand to improve the computational capabilities of these IE systems. It is necessary to understand the competency and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representations for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very limited consolidated research work have been conducted to investigate the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work address this limitation and present a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and recommendations will help to improve the big data analytics by making it more productive. which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Adnan and Akbar J Big Data (2019) 6:91 Malaysia Adnan and Akbar J Big Data (2019) 6:91 RESEARCHday. Among these statistics, it was also predicted that unstructured data from diverse sources will grow up to 90% in few years. IDC estimated that unstructured data will be 95% of the global data in 2020 with estimated 65% annual growth rate [1]. The common characteristics of unstructured data are, (i) it comes in multiple formats...
During the recent era of big data, a huge volume of unstructured data are being produced in various forms of audio, video, images, text, and animation. Effective use of these unstructured big data is a laborious and tedious task. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, numerous studies conducted on IE from a variety of unstructured data are limited to single data types such as text, image, audio, or video. This article reviews the existing IE techniques along with its subtasks, limitations, and challenges for the variety of unstructured data highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, there is no comprehensive study conducted to investigate the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents the overview of IE techniques from a variety of unstructured data such as text, image, audio, and video at one platform. Second, it investigates the limitations of these existing IE techniques due to the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that advanced techniques for IE, particularly for multifaceted unstructured big data sets, are the utmost requirement of the organizations to manage big data and derive strategic information. Further, potential solutions are also presented to improve the unstructured big data IE systems for future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.
Masalah dalam penelitian ini adalah kurangnya variasi mengajar guru sehingga membuat motivasi belajar siswa menurun.Penelitian ini merupakan penelitian kuantitatif jenis korelasional yang bertujuan untuk mengetahui gambaran variasi mengajar guru Sekolah Dasar, mengetahui gambaran motivasi belajar siswa Sekolah Dasar, dan hubungan antara variasi mengajar guru dengan motivasi belajar siswa Sekolah Dasar. Hasil analisis statistik deskriptif diperoleh gambaran Variasi Mengajar guru Sekolah Dasarberada pada kategori sedang dengan persentase 85% dan gambaran motivasi belajar siswa Sekolah Dasarberada pada kategori sedang dengan persentase 69%. Dari hasil analisis statistik inferensial diperoleh rhitung0,463 ≥ rtabel 0,2133dengan signifikansi 5%. Dapat disimpulkan bahwa terdapat hubungan yang signifikan antara variasi mengajar guru dengan motivasi belajar siswa Sekolah Dasar.
Unstructured text contains valuable information for a range of enterprise applications and informed decision making. Text analytics is used to extract valuable insights from unstructured big data. Among the most significant challenges of text analytics, quality and usability are critical affecting the outcome of the analytical process. The enhancement in usability is important for the exploitation of unstructured data. Most of the existing literature focuses on the usability of structured data as compared to unstructured data whereas big data usability has been discussed merely in context of its assessment. The existing approaches do not provide proper guidelines on usability enhancement of unstructured data. In this study, a rigorous systematic literature review, using PRISMA framework, has been conducted to develop a model enhancing the usability of unstructured data and bridging the research gap. The recent approaches and solutions for text analytics have been investigated thoroughly. Furthermore, it identifies the usability issues of unstructured text data and their consequences on data preparation for analytics. Defining the usability dimensions for unstructured big data, identification of the usability determinants, and developing a relationship between usability dimension and determinants to derive usability rules are the significant contribution of this research, and are integrated to formulate the model. The proposed usability enhancement model is the major outcome of the study. It would contribute to making unstructured data usable and facilitating the data preparation activities with more valuable data that eventually improves the analytical process.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.