We propose a hybrid approach to temporal anomaly detection in data on users' access to databases or, more generally, in any kind of subject-object co-occurrence data. We consider a high-dimensional setting that also requires fast computation at test time. Our methodology identifies anomalies based on a single stationary model instead of a full temporal one, which would be prohibitive in this setting. We learn a low-rank stationary model from the training data and then fit a regression model to predict the expected likelihood score of normal access patterns in the future. The disparity between the predicted likelihood score and the observed one is used to assess the "surprise" at test time. This approach enables calibration of the anomaly score, so that time-varying normal behavior patterns are not considered anomalous. We provide a detailed description of the algorithm, including a convergence analysis, and report encouraging empirical results. One of the data sets that we tested, TDA, is new to the public domain. It consists of two months' worth of database access records from a live system. Our code is publicly available at https://github.com/eyalgut/TLR_anomaly_detection.git. The TDA data set is available at https://www.kaggle.com/eyalgut/binary-traffic-matrices.
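The two-stage idea in this abstract (a stationary low-rank model, followed by a regression on its likelihood score) can be illustrated with a small sketch. This is not the authors' implementation, which is in their repository. It assumes a truncated SVD of the mean binary access matrix as the low-rank model, a Bernoulli log-likelihood as the score, and a least-squares fit on periodic time features as the regression. All function names and the `period` parameter are hypothetical.

```python
import numpy as np

def fit_low_rank(train, rank, eps=1e-3):
    """Assumed stationary model: rank-`rank` SVD approximation of the mean
    access matrix, clipped so it can be read as Bernoulli probabilities.
    `train` has shape (T, n_users, n_objects) with 0/1 entries."""
    mean = train.mean(axis=0)
    u, s, vt = np.linalg.svd(mean, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return np.clip(approx, eps, 1 - eps)

def log_likelihood(x, p):
    """Mean Bernoulli log-likelihood of one binary matrix x under p."""
    return float(np.mean(x * np.log(p) + (1 - x) * np.log(1 - p)))

def time_features(t, period):
    """Hypothetical periodic features (intercept, sine, cosine) of time t."""
    angle = 2 * np.pi * (np.asarray(t) % period) / period
    return np.column_stack([np.ones_like(angle), np.sin(angle), np.cos(angle)])

def fit_score_regression(scores, period):
    """Least-squares fit of the training likelihood scores on time features."""
    t = np.arange(len(scores))
    coef, *_ = np.linalg.lstsq(time_features(t, period), scores, rcond=None)
    return coef

def surprise(x, t, p, coef, period):
    """Predicted score minus observed score: larger means more anomalous."""
    predicted = time_features([t], period)[0] @ coef
    return float(predicted - log_likelihood(x, p))
```

Because the regression predicts the score expected at a given time, a matrix whose likelihood is low only because of a recurring, time-dependent pattern would receive a small surprise under this sketch, which is the calibration effect the abstract describes.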
Database activity monitoring systems aim to protect organizational data by logging users' activity to identify and document malicious activity. High-velocity streams and operating costs restrict these systems to examining only a sample of the activity. Current solutions use manual policies to decide which transactions to monitor. This limits the diversity of the data collected, creating a "filter bubble" that overrepresents specific subsets of the data, such as high-risk users, and underrepresents the rest of the population, which may never be sampled. In recommendation systems, bandit algorithms have recently been used to address this problem. We propose treating the sampling problem in database activity monitoring as a recommender system problem. In this work, we redefine the data sampling problem as a special case of the multi-armed bandit problem and present a novel algorithm, C--Greedy, which combines expert knowledge with random exploration. We analyze the effect of diversity on coverage and downstream event detection using simulated data. We find that adding diversity to the sampling with the bandit-based approach works well for this task: it maximizes population coverage without decreasing the quality of alerts issued about events, and it outperforms policies manually crafted by experts as well as other sampling methods.
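The trade-off between expert knowledge and random exploration can be illustrated with a generic epsilon-greedy sampler. This is a sketch of the general technique and should not be read as the algorithm named in the abstract. The `risk_scores` mapping (user to expert-assigned risk), the per-round budget `k`, and the `coverage` helper are all assumptions made for illustration.

```python
import random

def choose_users(risk_scores, k, epsilon, rng):
    """Pick k distinct users to monitor in one round. Each slot explores
    (uniform random user) with probability epsilon; otherwise it exploits
    the expert policy by taking the highest-risk user still unpicked."""
    remaining = dict(risk_scores)
    chosen = []
    for _ in range(min(k, len(remaining))):
        candidates = sorted(remaining)  # fixed order keeps runs reproducible
        if rng.random() < epsilon:
            user = rng.choice(candidates)
        else:
            user = max(candidates, key=remaining.get)
        chosen.append(user)
        del remaining[user]
    return chosen

def coverage(risk_scores, k, epsilon, rounds, seed=0):
    """Fraction of the population sampled at least once over all rounds."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(rounds):
        seen.update(choose_users(risk_scores, k, epsilon, rng))
    return len(seen) / len(risk_scores)
```

With `epsilon=0` the sampler reduces to a fixed expert policy that monitors the same top-`k` users every round, which is the "filter bubble" the abstract describes. Any positive `epsilon` lets lower-risk users enter the sample over time.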
Visualization tools are critical components of cyber security systems, allowing analysts to better understand, detect, and prevent security breaches. Security administrators need to understand which users accessed the database and what operations were performed in order to detect irregularities. The current work compares the Sankey diagram, as an alternative visualization technique for cyber security tasks, with the more commonly used node-link diagram in a controlled experiment. The results indicate that the Sankey tool showed a consistent advantage in task completion time and was more effective (measured by the percentage of correct answers) in synoptic tasks, while the node-link diagram was more effective in basic, elementary tasks. Further results revealed that performance had only a small effect on user satisfaction and preferences. Our results suggest that the Sankey tool may be a viable option for cyber security visualization and strengthen the case for providing personalized visualization tools based on user preferences.