Monitoring is an issue of primary concern in current and next-generation networked systems. For example, the objective of sensor networks is to monitor their surroundings for a variety of applications, such as atmospheric conditions, wildlife behavior, and troop movements. Similarly, monitoring in data networks is critical not only for accounting and management, but also for detecting anomalies and attacks. Such monitoring applications are inherently continuous and distributed, and must be designed to minimize the communication overhead that they introduce. In this context we introduce and study a fundamental class of problems called "thresholded counts", where we must return the aggregate frequency count of an event that is continuously monitored by distributed nodes, with a user-specified accuracy, whenever the actual count exceeds a given threshold value. In this paper we propose to address the problem of thresholded counts by setting local thresholds at each monitoring node and initiating communication only when the locally observed data exceeds these local thresholds. We explore algorithms in two categories: static thresholds and adaptive thresholds. In the static case, we consider thresholds based on a linear combination of two alternate strategies, and show that there exists an optimal blend of the two strategies that results in minimum communication overhead. We further show that this optimal blend can be found using a steepest descent search. In the adaptive case, we propose algorithms that adjust the local thresholds based on the observed distributions of updated information in the distributed monitoring system. We use extensive simulations not only to verify the accuracy of our algorithms and validate our theoretical results, but also to evaluate the performance of the two approaches. We find that both approaches yield significant savings over the naive approach of performing processing at a centralized location.
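The local-threshold idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a single uniform static threshold for all nodes, and the names `Node`, `simulate`, and `local_threshold` are hypothetical. Each node contacts the coordinator only when its unreported count reaches the local threshold, so the coordinator's estimate lags the true total by less than the threshold per node.

```python
class Node:
    """A monitoring node that reports only when its unreported count
    reaches a local threshold (hypothetical, uniform static threshold)."""

    def __init__(self, local_threshold):
        self.local_threshold = local_threshold
        self.count = 0      # true local count
        self.reported = 0   # last count sent to the coordinator

    def observe(self, events):
        """Record events; return the new count if a report is sent, else None."""
        self.count += events
        if self.count - self.reported >= self.local_threshold:
            self.reported = self.count
            return self.count
        return None


def simulate(streams, local_threshold):
    """Run all nodes in lockstep over equal-length per-node event streams.

    Returns (true total, coordinator estimate, number of messages sent).
    """
    nodes = [Node(local_threshold) for _ in streams]
    estimate = [0] * len(nodes)
    messages = 0
    for step in zip(*streams):          # assumes streams have equal length
        for i, events in enumerate(step):
            update = nodes[i].observe(events)
            if update is not None:
                estimate[i] = update
                messages += 1
    true_total = sum(node.count for node in nodes)
    return true_total, sum(estimate), messages


# Three nodes, ten time steps each, made-up event rates.
streams = [[1] * 10, [2] * 10, [0] * 10]
true_total, estimate, messages = simulate(streams, local_threshold=4)
print(true_total, estimate, messages)
```

In this toy setting the estimate never overshoots, and its error is bounded by the number of nodes times `local_threshold - 1`. Raising the threshold trades accuracy for fewer messages, which is the tension the static and adaptive schemes in the abstract set out to balance.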
With the widespread popularity of smartphones, more and more users are accessing the Internet on the go. Understanding mobile user browsing behavior is of great significance for several reasons. For example, it can help cellular (data) service providers (CSPs) to improve service performance, thus increasing user satisfaction. It can also provide valuable insights about how to enhance mobile user experience by providing dynamic content personalization and recommendation, or location-aware services. In this paper, we try to understand mobile user browsing behavior by investigating whether there exist distinct "behavior patterns" among mobile users. Our study is based on real mobile network data collected from a large 3G CSP in North America. We formulate this user behavior profiling problem as a co-clustering problem, i.e., we group both users (who share similar browsing behavior) and browsing profiles (of like-minded users) simultaneously. We propose and develop a scalable co-clustering methodology, Phantom, using a novel hourglass model. The proposed hourglass model first reduces the dimensions of the input data and performs divisive hierarchical co-clustering on the lower-dimensional data; it then carries out an expansion step that restores the original dimensions. Applying Phantom to the mobile network data, we find that there exist a number of prevalent and distinct behavior patterns that persist over time, suggesting that user browsing behavior in 3G cellular networks can be captured using a small number of co-clusters. For instance, the behavior of most users can be classified as either homogeneous (users with a very limited set of browsing interests) or heterogeneous (users with very diverse browsing interests), and such behavior profiles do not change significantly at either short (30-minute) or long (6-hour) time scales.
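The reduce, cluster, expand shape of the hourglass model can be sketched on a toy user-by-URL matrix. This is not Phantom itself: the column grouping, the range-based split rule, and the function names `reduce_columns`, `divisive`, and `expand` are all assumptions chosen to keep the example short, and the pairing of a user cluster with its dominant column group is only a crude stand-in for real co-clustering.

```python
def reduce_columns(matrix, groups):
    """Narrowing step: sum original columns into coarser column groups."""
    return [[sum(row[j] for j in cols) for cols in groups] for row in matrix]


def split(rows, reduced):
    """One divisive step: cut at the midpoint of the reduced column
    with the largest value range. Returns None if nothing separates the rows."""
    def spread(c):
        vals = [reduced[r][c] for r in rows]
        return max(vals) - min(vals)

    c = max(range(len(reduced[0])), key=spread)
    if spread(c) == 0:
        return None
    vals = [reduced[r][c] for r in rows]
    cut = (max(vals) + min(vals)) / 2
    left = [r for r in rows if reduced[r][c] <= cut]
    right = [r for r in rows if reduced[r][c] > cut]
    return left, right


def divisive(rows, reduced, max_depth):
    """Recursively split user indices, top-down, up to max_depth levels."""
    if max_depth == 0 or len(rows) < 2:
        return [rows]
    parts = split(rows, reduced)
    if parts is None:
        return [rows]
    left, right = parts
    return (divisive(left, reduced, max_depth - 1)
            + divisive(right, reduced, max_depth - 1))


def expand(user_cluster, reduced, groups):
    """Widening step: pair a user cluster with its dominant column group
    and map that group back to the original column indices."""
    totals = [sum(reduced[r][g] for r in user_cluster)
              for g in range(len(groups))]
    best = max(range(len(groups)), key=totals.__getitem__)
    return groups[best]


# Toy data: 4 users x 4 URLs; columns 0-1 and 2-3 form two made-up categories.
matrix = [
    [5, 4, 0, 0],
    [6, 3, 0, 1],
    [0, 1, 7, 6],
    [0, 0, 5, 8],
]
groups = [[0, 1], [2, 3]]

reduced = reduce_columns(matrix, groups)
user_clusters = divisive(list(range(len(matrix))), reduced, max_depth=1)
co_clusters = [(users, expand(users, reduced, groups)) for users in user_clusters]
print(co_clusters)
```

Clustering on the reduced matrix keeps each split cheap, and the expansion step is what lets the result be read in terms of the original columns rather than the coarse groups.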
Many research efforts propose the use of flow-level features (e.g., packet sizes and inter-arrival times) and machine learning algorithms to solve the traffic classification problem. However, these statistical methods have not made the anticipated impact in the real world. We attribute this to two main reasons: (a) training the classifiers and bootstrapping the system is cumbersome, and (b) the resulting classifiers have limited ability to adapt gracefully as the traffic behavior changes. In this paper, we propose an approach that is easy to bootstrap and deploy, as well as robust to changes in the traffic, such as the emergence of new applications. The key novelty of our classifier is that it learns to identify the traffic of each application in isolation, instead of trying to distinguish one application from another. This is a very challenging task that hides many caveats and subtleties. To make this possible, we adapt and use subspace clustering, a powerful technique that has not been used before in this context. Subspace clustering allows the profiling of applications to be more precise by automatically eliminating irrelevant features. We show that our approach exhibits very high accuracy in classifying each application on five traces from different ISPs captured between 2005 and 2011. This new way of looking at application classification could generate powerful and practical solutions in the space of traffic monitoring and network management.
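The idea of profiling each application in isolation, while discarding features that are irrelevant to it, can be sketched as follows. This is a deliberately simplified stand-in and not subspace clustering as used in the paper: it assumes a per-feature relative-spread filter and a fixed tolerance band, and the names `build_profile`, `matches`, `classify`, and the sample flows are hypothetical.

```python
from statistics import mean, pstdev


def build_profile(flows, max_rel_spread=0.25):
    """Learn one application's profile from its own flows only.

    flows: equal-length feature vectors, here (packet size, inter-arrival time).
    A feature whose relative spread exceeds max_rel_spread is treated as
    irrelevant for this application and left out of the profile.
    """
    profile = {}
    for j in range(len(flows[0])):
        column = [flow[j] for flow in flows]
        mu, sigma = mean(column), pstdev(column)
        if mu != 0 and sigma / abs(mu) <= max_rel_spread:
            profile[j] = (mu, sigma)
    return profile


def matches(profile, flow, k=3.0, min_tol=1e-9):
    """True if the flow lies within k standard deviations on every kept feature."""
    return bool(profile) and all(
        abs(flow[j] - mu) <= max(k * sigma, min_tol)
        for j, (mu, sigma) in profile.items()
    )


def classify(profiles, flow):
    """Label a flow only if exactly one application profile accepts it."""
    hits = [app for app, profile in profiles.items() if matches(profile, flow)]
    return hits[0] if len(hits) == 1 else "unknown"


# Made-up training flows: (packet size in bytes, inter-arrival time in seconds).
training = {
    "dns": [[80, 0.01], [82, 0.5], [78, 2.0]],
    "video": [[1400, 0.02], [1380, 0.021], [1420, 0.019]],
}
profiles = {app: build_profile(flows) for app, flows in training.items()}

labels = [classify(profiles, f) for f in ([81, 5.0], [1390, 0.02], [600, 0.3])]
print(labels)
```

Because each profile is built without reference to any other application, adding a new application means building one more profile, and traffic that fits no profile falls out as "unknown" instead of being forced into an existing class.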