Big data systems are often composed of components for information extraction, preprocessing, processing, ingestion and integration, data analysis, and interface and visualization. Different big data systems have different requirements and therefore apply different architectural design configurations. Hence, a proper architecture is important for a big data system to meet its stated requirements. Yet although many different concerns in big data systems have been addressed, the notion of architecture tends to remain implicit. In this paper, we discuss software architectures for big data systems, considering the architectural concerns of stakeholders aligned with quality attributes. We follow a systematic literature review method, implementing a multi-phase study selection process that screens the literature in significant journals and conference proceedings.
Big data has become a very important driver of innovation and growth in various industries such as health, administration, agriculture, defence, and education. Storing and analysing large amounts of data is becoming increasingly common in many of these application areas. In general, different application domains may require different types of big data systems. Although a lot has been written on big data, it is not easy to identify the features required for developing big data systems that meet the application requirements and the stakeholder concerns. In this paper, we provide a survey of big data systems based on feature modelling, a technique for defining the common and variable features of a domain. The feature model was derived from an extensive literature study on big data systems. We present the feature model and discuss its features to support the understanding of big data systems.
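To make the idea of common and variable features concrete, the following is a minimal sketch of a feature tree in which mandatory features are common to every system and optional features are variable. The feature names (`BigDataSystem`, `DataStorage`, `DataProcessing`, `Visualization`) and the `violations` helper are hypothetical illustrations, not the feature model derived in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A node in a feature tree: mandatory = common, optional = variable."""
    name: str
    mandatory: bool = True
    children: list = field(default_factory=list)


def violations(feature, selected, parent_selected=True):
    """Return the rule violations for a set of selected feature names."""
    problems = []
    chosen = feature.name in selected
    # A feature may only be selected when its parent is selected.
    if chosen and not parent_selected:
        problems.append(f"{feature.name} selected without its parent")
    # A mandatory feature must be present whenever its parent is.
    if feature.mandatory and parent_selected and not chosen:
        problems.append(f"mandatory feature {feature.name} missing")
    for child in feature.children:
        problems.extend(violations(child, selected, chosen))
    return problems


# Hypothetical model: two common features and one variable feature.
model = Feature("BigDataSystem", children=[
    Feature("DataStorage"),
    Feature("DataProcessing"),
    Feature("Visualization", mandatory=False),
])

if __name__ == "__main__":
    print(violations(model, {"BigDataSystem", "DataStorage"}))
```

A configuration is valid under this sketch when `violations` returns an empty list. Real feature models typically also carry alternative (xor) and or-groups as well as cross-tree constraints, which are omitted here for brevity.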
Summary
Malicious software poses a threat to many software-intensive systems, and as such several malware detection approaches have been introduced, often based on sequential data analysis. Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture that is effective for sequential data analysis; however, no study has yet analyzed the performance of different LSTM architectures for malware detection. In this article, we evaluate and benchmark the performance of LSTM-based malware detection approaches on specific LSTM architectures to provide insight into malware detection. Our method builds LSTM-based malware prediction models and performs experiments using different LSTM architectures, including vanilla LSTM, stacked LSTM, bidirectional LSTM, and CNN-LSTM. We evaluated the performance of each of these architectures across different configurations. Our study shows that bidirectional LSTM with hyperparameter optimization outperforms the other selected LSTM architectures. It also shows that different LSTM approaches and architectures are applicable to the malware detection problem. Quality attributes such as efficiency and accuracy, and the software system architecture adopted for the implementation, affect the selection of the LSTM approach.