We present a processing pipeline for flow-based traffic classification using a machine learning component leveraging Deep Neural Networks (DNNs). The system is trained to predict likely characteristics of real-world traffic flows from a campus network ahead of time, e.g., a flow's throughput or duration. Training and evaluation of DNN models are continuously performed on a flow data stream collected from a university data center. Instead of the common binary classification into "mice" and "elephant" (throughput) or "short-term" and "long-term" (duration) flows, predicted flow characteristics are quantized into three classes. Various communication contexts (subsets of network traffic, e.g., only TCP) and flow feature groups (subsets of flow features, e.g., only a flow's 5-tuple), which are supported through an enrichment strategy, are considered and investigated. An in-depth description of the data acquisition process, including preprocessing steps and anonymization used to protect sensitive information, is given. Additionally, we employ an accelerated variant of t-distributed Stochastic Neighbor Embedding (t-SNE) to visualize network traffic data. This enables the understanding of traffic characteristics and relations between communication flows at a glance. Furthermore, possible use cases and a high-level architecture for flow-based routing scenarios utilizing the developed pipeline are proposed.
We present a study of deep learning applied to network traffic data forecasting. Such forecasting is an important ingredient for network traffic engineering, e.g., intelligent routing, which can optimize network performance, especially in large networks. In a nutshell, we wish to predict, in advance, the bit rate for a transmission, based on low-dimensional connection metadata ("flows") that is available whenever a communication is initiated. Our study has several genuinely new points: First, it is performed on a large dataset (≈50 million flows), which requires a new training scheme that operates on successive blocks of data since the whole dataset is too large for in-memory processing. Additionally, we are the first to propose and perform a more fine-grained prediction that distinguishes between low, medium and high bit rates instead of just "mice" and "elephant" flows. Lastly, we apply state-of-the-art visualization and clustering techniques to flow data and show that visualizations are insightful despite the heterogeneous and non-metric nature of the data. We developed a processing pipeline to handle the highly non-trivial acquisition process and to allow for the data preprocessing needed to apply DNNs to network traffic data. We conduct DNN hyper-parameter optimization as well as feature selection experiments, which clearly show that fine-grained network traffic forecasting is feasible, and that domain-dependent data enrichment and augmentation strategies can improve results. An outlook on the fundamental challenges presented by network traffic analysis (high data throughput, unbalanced and dynamic classes, changing statistics, outlier detection) concludes the article.
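The three-class target described above (low, medium and high bit rates) can be illustrated with a minimal sketch. The abstract does not state how the class boundaries are chosen, so everything here is an assumption for illustration: the function names, the fixed thresholds in bits per second, and the tertile-based alternative are hypothetical and not taken from the paper.

```python
from bisect import bisect_right

# Hypothetical class labels and cut points (bits per second).
# Neither the labels nor the values come from the source.
LABELS = ("low", "medium", "high")
THRESHOLDS_BPS = (1e5, 1e7)


def quantize_bit_rate(bit_rate_bps, thresholds=THRESHOLDS_BPS):
    """Map a bit rate to a class index: 0 = low, 1 = medium, 2 = high.

    A rate equal to a threshold falls into the higher class.
    """
    return bisect_right(thresholds, bit_rate_bps)


def tertile_thresholds(bit_rates):
    """Derive two cut points that split the observed rates into
    roughly equal thirds (one assumed way to balance the classes)."""
    ordered = sorted(bit_rates)
    n = len(ordered)
    return (ordered[n // 3], ordered[(2 * n) // 3])
```

For example, `LABELS[quantize_bit_rate(5e4)]` yields `"low"` under the assumed thresholds. Deriving the cut points from the data, as in `tertile_thresholds`, is one possible way to reduce class imbalance; whether the authors do this is not stated.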
Biominerals are organic-inorganic nanocomposites exhibiting remarkable properties due to their unique configuration. Using optical spectroscopy and theoretical modeling, it is shown that the optical properties of a model bioinspired system, an inorganic semiconductor host (Cu₂O) grown in the presence of amino acids (AAs), are strongly influenced by the latter. The absorption and photoluminescence excitation spectra of Cu₂O-AAs blue-shift with growing AA content, indicating band gap widening. This is attributed to the void-induced quantum confinement effects. Surprisingly, no such shift occurs in the emission spectra. The theoretical model, assuming an inhomogeneous AA distribution within Cu₂O-AAs due to compositional disorder, explains the deviating behavior of the photoluminescence. The model predicts that the potential causing the confinement effects becomes a function of the local AA density. It results in a Gaussian band gap distribution that shapes the optical properties of Cu₂O-AAs. Imitating and harnessing the process of biomineralization can pave the way toward new functional materials.