The broad context of this literature review is the connected manufacturing enterprise, characterized by a data environment in which the size, structure, and variety of information strain the capability of traditional software and database tools to capture, store, manage, and analyze it effectively. This paper surveys and discusses representative examples of existing research into approaches for feature set reduction in the big data environment, focusing on three contexts: general industrial applications; specific industrial applications such as fault detection or fault prediction; and data reduction. The review concludes that there is room for research into frameworks or approaches to feature filtration and prioritization, specifically those that provide quantitative or qualitative information about the individual features in a dataset that can be used to rank features against each other. A byproduct of this gap is a tendency for analysts not to generalize results beyond the specific problem of interest and, relatedly, for manufacturers to possess only limited knowledge of the relative value of the smart manufacturing data they collect.
Emotion classification can be a powerful tool to derive narratives from social media data. Traditional machine learning models that perform emotion classification on Indonesian Twitter data exist but rely on closed-source features. Recurrent neural networks can meet or exceed the performance of state-of-the-art traditional machine learning techniques using exclusively open-source data and models. Specifically, the results show that, when using FastText embeddings, recurrent neural network variants can produce more than an 8% gain in accuracy over logistic regression and SVM techniques and a 15% gain over random forest. This research found that a single-layer bidirectional long short-term memory model performed statistically significantly better than a two-layer stacked bidirectional long short-term memory model. This research also found that, when classifying the emotion of Indonesian tweets, a single-layer bidirectional long short-term memory recurrent neural network matched the performance of a state-of-the-art logistic regression model with supplemental closed-source features from a study by Saputri et al. [8].
Rockwell Automation, a global manufacturing and consultation corporation headquartered in Milwaukee, WI, uses the term The Connected Enterprise (CE) to describe its strategy of a shared corporate vision for the future of industrial automation [1]. CE strategies address the problem of disconnect by linking people, equipment, and processes for real-time learning of enterprise status in order to enable informed, adaptive, and proactive decisions [2]. The term "big data" may be loosely defined as information whose size, structure, or variety strains the capability of traditional software or database tools to capture, store, manage, and analyze it [3, 4]. Big data poses a challenge not only to software systems and tools; its volume also challenges the ability of human operators, analysts, and leaders to grasp, consume, and understand the critical pieces. Research in fields as diverse as psychology [5], economics [6], and literature [7] has identified limitations in human ability to process, visualize, and synthesize meaning from data. George
Effective employment of social media for any social influence outcome requires a detailed understanding of the target audience. Social media provides a rich repository of self-reported information that provides insight regarding the sentiments and implied priorities of an online population. Using Social Network Analysis, this research models user interactions on Twitter as a weighted, directed network. Topic modeling through Latent Dirichlet Allocation identifies the topics of discussion in Tweets, which this study uses to induce a directed multilayer network wherein users (in one layer) are connected to the conversations and topics (in a second layer) in which they have participated, with inter-layer connections representing user participation in conversations. Analysis of the resulting network identifies both influential users and highly connected groups of individuals, informing an understanding of group dynamics and individual connectivity. The results demonstrate that the generation of a topically-focused social network to represent conversations yields more robust findings regarding influential users, particularly when analysts collect Tweets from a variety of discussions through more general search queries. Within the analysis, PageRank performed best among four measures used to rank individual influence within this problem context. In contrast, the results of applying both the Greedy Modular Algorithm and the Leiden Algorithm to identify communities were mixed; each method yielded valuable insights, but neither technique was uniformly superior. The demonstrated four-step process is readily replicable, and an interested user can automate the process with relatively low effort or expense.
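As a rough illustration of the influence-ranking step described above, the following is a minimal sketch of PageRank computed by power iteration on a weighted, directed interaction network. It is not the study's implementation: the user names, edge weights, the damping factor of 0.85, and the `pagerank` helper itself are all assumptions made for this example. An edge `(source, target, weight)` is taken here to mean that `source` interacted with `target` (for example, by mentioning or replying) `weight` times.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Weighted PageRank by power iteration.

    edges: iterable of (source, target, weight) tuples (hypothetical format).
    Returns a dict mapping each node to its PageRank score.
    """
    edges = list(edges)
    nodes = set()
    out_weight = defaultdict(float)  # total outgoing weight per node
    for src, dst, w in edges:
        nodes.update((src, dst))
        out_weight[src] += w
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}
        # Each node passes its rank along its out-edges in proportion to weight.
        for src, dst, w in edges:
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical interaction data: (who, interacted with whom, how many times).
interactions = [
    ("alice", "carol", 3.0),
    ("bob", "carol", 2.0),
    ("carol", "alice", 1.0),
    ("dave", "carol", 1.0),
]

ranks = pagerank(interactions)
for user, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{user}: {score:.4f}")
```

In this toy network, the user who receives the most weighted attention from others ends up ranked highest. In practice a graph library would likely be used instead of a hand-written loop.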