Data leakage prevention is defined as a process or solution that identifies confidential data and tracks how it moves into and out of the enterprise in order to prevent unauthorized disclosure, whether intentional or unintentional. Confidential data can reside on various computing devices and move through several network access points or different types of social networks, such as email. An email leak occurs when an email is sent, deliberately or accidentally, to an addressee who should not receive it. Data Leak Prevention (DLP) is a technique or product that tries to mitigate data leak threats. In this work, clustering is combined with term frequency-inverse document frequency (TF-IDF) to identify suitable centroids for analysing the emails exchanged among members of an organization. Every member belongs to several topic clusters, and a topic cluster can include members of the organization who have not previously communicated with each other. When a new email is composed, every addressee is classified as either a potential leak recipient or a legal one. This classification is based on the emails exchanged between the sender and the receiver and on their topic clusters. The work investigates K-Means clustering and proposes a Tabu K-Means (TABU-KM) clustering technique to identify optimal clustering points. The proposed TABU-KM optimizes K-Means clustering. Experimental results demonstrate that the proposed method achieves a higher True Positive Rate (TPR) and a lower False Positive Rate (FPR) for both known and unknown recipients.
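The pipeline described above (TF-IDF vectors, K-Means topic clusters, then a per-addressee check) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The toy emails, the names `fit_idf`, `vectorize`, `kmeans`, and `classify_recipients`, the smoothed IDF formula, the farthest-first centroid initialisation, and the membership-only decision rule are all assumptions made for illustration. The Tabu search step of TABU-KM is omitted because the abstract gives no details of it.

```python
import math
from collections import Counter

def fit_idf(token_lists):
    """Inverse document frequency per term (smoothed variant, an assumption)."""
    n = len(token_lists)
    df = Counter(t for toks in token_lists for t in set(toks))
    return {t: math.log(n / d) + 1.0 for t, d in df.items()}

def vectorize(tokens, idf, vocab):
    """L2-normalised TF-IDF vector; out-of-vocabulary tokens are ignored."""
    tf = Counter(tokens)
    vec = [tf[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))

def kmeans(vectors, k, iters=100):
    """Plain K-Means with deterministic farthest-first initialisation (assumption)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def classify_recipients(recipients, text, idf, vocab, centroids, members):
    """Hypothetical rule: an addressee is 'legal' only if they already belong
    to the topic cluster that the new email falls into."""
    cluster = nearest(vectorize(text.lower().split(), idf, vocab), centroids)
    return {r: "legal" if r in members[cluster] else "potential leak"
            for r in recipients}

# Toy history of (sender, recipient, body); purely illustrative data.
emails = [
    ("alice", "bob", "quarterly budget forecast revenue"),
    ("bob", "alice", "budget revenue audit report"),
    ("carol", "dave", "server deployment outage patch"),
    ("dave", "carol", "deployment patch rollback server"),
]
token_lists = [body.lower().split() for _, _, body in emails]
idf = fit_idf(token_lists)
vocab = sorted(idf)
vectors = [vectorize(toks, idf, vocab) for toks in token_lists]
labels, centroids = kmeans(vectors, k=2)

# Each cluster's members are the people who sent or received its emails.
members = {c: set() for c in range(len(centroids))}
for (sender, recipient, _), label in zip(emails, labels):
    members[label].update({sender, recipient})

verdict = classify_recipients(
    ["bob", "dave"], "updated budget and revenue forecast",
    idf, vocab, centroids, members)
print(verdict)  # {'bob': 'legal', 'dave': 'potential leak'}
```

In this toy run the new email falls into the finance cluster, so `bob` is accepted and `dave`, who has only appeared in the operations cluster, is flagged. A fuller version would also weigh the direct sender-receiver history, which the abstract names as a second input to the classification.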