Accurate stock market prediction is of great interest to investors; however, stock markets are driven by volatile factors such as microblogs and news that make it hard to predict stock market index based on merely the historical data. The enormous stock market volatility emphasizes the need to effectively assess the role of external factors in stock prediction. Stock markets can be predicted using machine learning algorithms on information contained in social media and financial news, as this data can change investors' behavior. In this paper, we use algorithms on social media and financial news data to discover the impact of this data on stock market prediction accuracy for ten subsequent days. For improving performance and quality of predictions, feature selection and spam tweets reduction are performed on the data sets. Moreover, we perform experiments to find such stock markets that are difficult to predict and those that are more influenced by social media and financial news. We compare results of different algorithms to find a consistent classifier. Finally, for achieving maximum prediction accuracy, deep learning is used and some classifiers are ensembled. Our experimental results show that highest prediction accuracies of 80.53% and 75.16% are achieved using social media and financial news, respectively. We also show that New York and Red Hat stock markets are hard to predict, New York and IBM stocks are more influenced by social media, while London and Microsoft stocks by financial news. Random forest (RF) classifier is found to be consistent and highest accuracy of 83.22% is achieved by its ensemble.
In Content-Centric Networks (CCNs) as a possible future Internet, new kinds of attacks and security challenges -from Denial of Service (DoS) to privacy attacks-will arise. An efficient and effective security mechanism is required to secure content and defense against unknown and new forms of attacks and anomalies. Usually, clustering algorithms would fit the requirements for building a good anomaly detection system. K-means is a popular anomaly detection method to classify data into different categories. However, it suffers from the local convergence and sensitivity to selection of the cluster centroids. In this paper, we present a novel fuzzy anomaly detection system that works in two phases. In the first phase -the training phase-we propose an hybridization of Particle Swarm Optimization (PSO) and K-means algorithm with two simultaneous cost functions as well-separated clusters and local optimization to determine the optimal number of clusters. When the optimal placement of clusters centroids and objects are defined, it starts the second phase. In this phase -the detection phase-we employ a fuzzy approach by the combination of two distance-based methods as classification and outlier to detect anomalies in new monitoring data. Experimental results demonstrate that the proposed algorithm can achieve to the optimal number of clusters, well-separated clusters, as well as increase the high detection rate and decrease the false positive rate at the same time when compared to some other well-known clustering algorithms.
Over the last several years, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) has been widely applied in many areas of science due to its simplicity, robustness against noise (outlier) and ability to discover clusters of arbitrary shapes. However, DBSCAN algorithm requires two initial input parameters, namely Eps (the radius of the cluster) and MinPts (the minimum data objects required inside the cluster) which both have a significant influence on the clustering results. Hence, DB-SCAN is sensitive to its input parameters and it is hard to determine them a priori. This paper presents an efficient and effective hybrid clustering method, named BDE-DBSCAN, that combines Binary Differential Evolution and DBSCAN algorithm to simultaneously quickly and automatically specify appropriate parameter values for Eps and MinPts. Since the Eps parameter can largely degrades the efficiency of the DBSCAN algorithm, the combination of an analytical way for estimating Eps and Tournament Selection (TS) method is also employed. Experimental results indicate the proposed method is precise in determining appropriate input parameters of DBSCAN algorithm.
Named Data Networking (NDN) is a candidate next-generation Internet architecture designed to overcome the fundamental limitations of the current IP-based Internet, in particular strong security. The ubiquitous in-network caching is a key NDN feature. However, pervasive caching strengthens security problems namely cache pollution attacks including cache poisoning (i.e., introducing malicious content into caches as false-locality) and cache pollution (i.e., ruining the cache locality with new unpopular content as locality-disruption). In this paper, a new cache replacement method based on Adaptive Neuro-Fuzzy Inference System (ANFIS) is presented to mitigate the cache pollution attacks in NDN. The ANFIS structure is built using the input data related to the inherent characteristics of the cached content and the output related to the content type (i.e., healthy, localitydisruption, and false-locality). The proposed method detects both false-locality and locality-disruption attacks as well as a combination of the two on different topologies with high accuracy, and mitigates them efficiently without very much computational cost as compared to the most common policies.
Abnormal network traffic analysis through Intrusion Detection Systems (IDSs) and visualization techniques has considerably become an important research topic to protect computer networks from intruders. It has been still challenging to design an accurate and a robust IDS with visualization capabilities to discover security threats due to the high volume of network traffic. This research work introduces and describes a novel anomaly-based intrusion detection system in presence of long-range independence data called benign outliers, using a neural projection architecture by a modified Self-Organizing Map (SOM) to not only detect attacks and anomalies accurately, but also provide visualized information and insights to end users. The proposed approach enables better analysis by merging the large amount of network traffic into an easy-to-understand 2D format and a simple user interaction. To show the performance and validate the proposed visualization-based IDS, it has been trained and tested over synthetic and real benchmarking datasets (NSL-KDD, UNSW-NB15, AAGM and VPN-nonVPN) that are widely applied in this domain. The results of the conducted experimental study confirm the advantages and effectiveness of the proposed approach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.