Collaborative information systems (CISs) are deployed within a diverse array of environments that manage sensitive information. Current security mechanisms detect insider threats, but they are ill-suited to monitor systems in which users function in dynamic teams. In this paper, we introduce the community anomaly detection system (CADS), an unsupervised learning framework to detect insider threats based on the access logs of collaborative environments. The framework is based on the observation that typical CIS users tend to form community structures based on the subjects accessed (e.g., patients’ records viewed by healthcare providers). CADS consists of two components: 1) relational pattern extraction, which derives community structures and 2) anomaly prediction, which leverages a statistical model to determine when users have sufficiently deviated from communities. We further extend CADS into MetaCADS to account for the semantics of subjects (e.g., patients’ diagnoses). To empirically evaluate the framework, we perform an assessment with three months of access logs from a real electronic health record (EHR) system in a large medical center. The results illustrate our models exhibit significant performance gains over state-of-the-art competitors. When the number of illicit users is low, MetaCADS is the best model, but as the number grows, commonly accessed semantics lead to hiding in a crowd, such that CADS is more prudent.
Modern healthcare organizations (HCOs) are composed of complex dynamic teams to ensure clinical operations are executed in a quick and competent manner. At the same time, the fluid nature of such environments hinders administrators' efforts to define access control policies that appropriately balance patient privacy and healthcare functions. Manual efforts to define these policies are labor-intensive and error-prone, often resulting in systems that endow certain care providers with overly broad access to patients' medical records while restricting other providers from legitimate and timely use. In this work, we propose an alternative method to generate these policies by automatically mining usage patterns from electronic health record (EHR) systems. EHR systems are increasingly being integrated into clinical environments and our approach is designed to be generalizable across HCOs, thus assisting in the design and evaluation of local access control policies. Our technique, which is grounded in data mining and social network analysis theory, extracts a statistical model of the organization from the access logs of its EHRs. In doing so, our approach enables the review of predefined policies, as well as the discovery of unknown behaviors. We evaluate our approach with five months of access logs from the Vanderbilt University Medical Center and confirm the existence of stable social structures and intuitive business operations. Additionally, we demonstrate that there is significant turnover in the interactions between users in the HCO and that policies learned at the department level afford greater stability over time.
Objective Models of healthcare organizations (HCOs) are often defined up front by a select few administrative officials and managers. However, given the size and complexity of modern healthcare systems, this practice does not scale easily. The goal of this work is to investigate the extent to which organizational relationships can be automatically learned from utilization patterns of electronic health record (EHR) systems.
Method We designed an online survey to solicit the perspectives of employees of a large academic medical center. We surveyed employees from two administrative areas: 1) Coding & Charge Entry and 2) Medical Information Services, and two clinical areas: 3) Anesthesiology and 4) Psychiatry. To test our hypotheses, we selected two administrative units that have work-related responsibilities with electronic records; for the clinical areas, we selected two disciplines with very different patient responsibilities and whose accesses, and the people making those accesses, were similar. We provided each group of employees with questions regarding the chance of interaction between areas in the medical center in the form of association rules (e.g., Given that someone from Coding & Charge Entry accessed a patient’s record, what is the chance that someone from Medical Information Services accesses the same record?). We compared the respondent predictions with the rules learned from actual EHR utilization using linear mixed-effects regression models.
Results The findings from our survey confirm that medical center employees can distinguish between association rules of high and non-high likelihood when their own area is involved. Moreover, they can make such distinctions for any HCO area in this survey. It was further observed that, with respect to highly likely interactions, respondents from certain areas were significantly better than other respondents at making such distinctions, and certain areas’ associations were more distinguishable than others.
Conclusions These results illustrate that EHR utilization patterns may be consistent with the expectations of HCO employees. Our findings show that certain areas in the HCO are easier than others for employees to assess, which suggests that automated learning strategies may yield more accurate models of healthcare organizations than those based on the perspectives of a select few individuals.
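The association rules in the survey questions can be read as conditional probabilities over the access log. The following is a minimal sketch of that reading, assuming a hypothetical log of (record, area) pairs; the function name `rule_confidence`, the log layout, and the toy data are illustrative assumptions and are not taken from the study.

```python
from collections import defaultdict


def rule_confidence(accesses, antecedent, consequent):
    """Estimate P(consequent area accessed a record | antecedent area accessed it).

    `accesses` is an iterable of (record_id, area) pairs. This layout is an
    assumption made for illustration, not the study's actual schema.
    """
    # Collect the set of areas that touched each record.
    areas_by_record = defaultdict(set)
    for record_id, area in accesses:
        areas_by_record[record_id].add(area)

    # Keep only records accessed by the antecedent area.
    with_antecedent = [a for a in areas_by_record.values() if antecedent in a]
    if not with_antecedent:
        return None  # the rule is undefined when the antecedent never occurs

    # Fraction of those records that the consequent area also accessed.
    both = sum(1 for a in with_antecedent if consequent in a)
    return both / len(with_antecedent)


# Toy log, invented purely for illustration.
toy_log = [
    ("r1", "Coding & Charge Entry"), ("r1", "Medical Information Services"),
    ("r2", "Coding & Charge Entry"),
    ("r3", "Coding & Charge Entry"), ("r3", "Medical Information Services"),
    ("r4", "Anesthesiology"),
]
confidence = rule_confidence(
    toy_log, "Coding & Charge Entry", "Medical Information Services"
)
```

In this toy log, two of the three records accessed by Coding & Charge Entry were also accessed by Medical Information Services, so the estimated confidence is 2/3.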
Collaborative information systems (CIS) enable users to coordinate efficiently over shared tasks in complex distributed environments. For flexibility, they provide users with broad access privileges, which, as a side-effect, leave such systems vulnerable to various attacks. Some of the more damaging malicious activities stem from internal misuse, where users are authorized to access system resources. A promising class of insider threat detection models for CIS focuses on mining access patterns from audit logs. However, current models are limited in that they assume organizations have significant resources to generate labeled cases for training classifiers or assume the user has committed a large number of actions that deviate from “normal” behavior. In lieu of the previous assumptions, we introduce an approach that detects when specific actions of an insider deviate from expectation in the context of collaborative behavior. Specifically, in this paper, we introduce a specialized network anomaly detection model, or SNAD, to detect such events. This approach assesses the extent to which a user influences the similarity of the group of users that access a particular record in the CIS. From a theoretical perspective, we show that the proposed model is appropriate for detecting insider actions in dynamic collaborative systems. From an empirical perspective, we perform an extensive evaluation of SNAD with the access logs of two distinct environments: the patient record access logs of a large electronic health record system (6,015 users, 130,457 patients and 1,327,500 accesses) and the editing logs of Wikipedia (2,394,385 revisors, 55,200 articles and 6,482,780 revisions). We compare our model with several competing methods and demonstrate SNAD is significantly more effective: on average it achieves 20–30% greater area under an ROC curve.
Collaborative information systems (CIS) enable users to coordinate efficiently over shared tasks. They are often deployed in complex dynamic systems that provide users with broad access privileges, but also leave the system vulnerable to various attacks. Techniques to detect threats originating from beyond the system are relatively mature, but methods to detect insider threats are still evolving. A promising class of insider threat detection models for CIS focuses on the communities that manifest between users based on the usage of common subjects in the system. However, current methods detect only when a user’s aggregate behavior is intruding, not when specific actions have deviated from expectation. In this paper, we introduce a method called specialized network anomaly detection (SNAD) to detect such events. SNAD assembles the community of users that access a particular subject and assesses if similarities of the community with and without a certain user are sufficiently different. We present a theoretical basis and perform an extensive empirical evaluation with the access logs of two distinct environments: those of a large electronic health record system (6,015 users, 130,457 patients and 1,327,500 accesses) and the editing logs of Wikipedia (2,388,955 revisors, 55,200 articles and 6,482,780 revisions). We compare SNAD with several competing methods and demonstrate it is significantly more effective: on average it achieves 20–30% greater area under an ROC curve.
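The with-and-without comparison described in this abstract can be illustrated with a small sketch. The abstract does not specify the similarity measure or the decision rule, so the code below assumes Jaccard similarity over the sets of subjects each user accessed and reports the raw change in mean pairwise similarity; the names `similarity_shift` and `profiles`, and the toy data, are hypothetical and should not be read as the published SNAD model.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed measure, not the paper's)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_similarity(users, profiles):
    """Average Jaccard similarity over all pairs of users in the group."""
    pairs = list(combinations(users, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(profiles[u], profiles[v]) for u, v in pairs) / len(pairs)


def similarity_shift(subject, user, profiles):
    """Change in community similarity when `user` is removed.

    The community is every user whose profile contains `subject`. A larger
    positive value means the community is more cohesive without the user.
    """
    community = sorted(u for u, subjects in profiles.items() if subject in subjects)
    if user not in community:
        raise ValueError(f"{user!r} did not access {subject!r}")
    without_user = [u for u in community if u != user]
    return (mean_pairwise_similarity(without_user, profiles)
            - mean_pairwise_similarity(community, profiles))


# Toy profiles (user -> subjects accessed), invented purely for illustration.
profiles = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p1", "p2", "p3"},
    "carol": {"p1", "p2"},
    "mallory": {"p1", "p8", "p9"},
}
shift_mallory = similarity_shift("p1", "mallory", profiles)
shift_carol = similarity_shift("p1", "carol", profiles)
```

In this toy data, removing mallory raises the cohesion of the community around subject p1, while removing carol slightly lowers it. How large a shift counts as "sufficiently different" is left open here, since the abstract does not state a threshold.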