With the Internet of Things (IoT) generating vast amounts of data,
privacy breaches have become increasingly prevalent, exposing
individuals to serious risks such as identity theft and life-threatening
situations. This research addresses the challenge of identifying the
cybersecurity threats and vulnerabilities that lead to privacy breaches,
as evidenced by recent cyber-attacks on Australia’s Medibank, Optus, and
hospital networks. We propose a machine learning (ML)-based approach to
distinguish between legitimate and rogue privacy policies, and we define
the fundamental concepts of privacy, security, and access control in the
context of breaches of personal, confidential, and sensitive information.
Our methodology introduces zero-privacy (ZP) and binary question-answer
(QA) models to discern legitimate versus illegitimate actions or
interests within privacy policies. Our experiments use natural
language processing (NLP)-based ML models to analyse the linguistic
features of privacy policies. On a dataset drawn from the top 100
Forbes-listed companies, including 67 rogue policies, our privacy
classification approach reliably distinguishes between legitimate and
rogue policies: with a dataset split of 90% for training and 10% for
testing, our model achieves accuracy above 94% and precision above 91%.
Additionally, we evaluate the probability of ZP occurrences in
organisations’ privacy and service-level agreements, revealing
significant privacy breach risks. Case studies using our proposed
binary QA model underscore the urgent need for enhanced
privacy measures across various organisations’ policies. Finally, we
introduce a novel approach to access control that specifies permissions
under legitimate and rogue privacy policies, and we demonstrate the
applicability of the proposed mechanism through security policy
modelling.
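
The evaluation protocol mentioned above (a 90%/10% train/test split, scored by accuracy and precision) can be illustrated with a minimal Python sketch. The function names, the label strings "legitimate" and "rogue", and the choice of "rogue" as the positive class are assumptions made for illustration; they are not taken from the study's implementation, and no classifier is shown here.

```python
import random


def split_90_10(items, seed=0):
    """Shuffle a copy of items and split it into 90% training / 10% testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 9 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def accuracy_and_precision(y_true, y_pred, positive="rogue"):
    """Return (accuracy, precision) for binary labels.

    accuracy  = correct predictions / all predictions
    precision = true positives / (true positives + false positives)
    The positive class ("rogue") is an assumption for this sketch.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Hypothetical usage with made-up labels (not data from the study).
train, test = split_90_10(range(100))
acc, prec = accuracy_and_precision(
    ["rogue", "rogue", "legitimate", "legitimate"],
    ["rogue", "legitimate", "rogue", "legitimate"],
)
```

Precision is reported alongside accuracy because it measures how often a policy flagged as rogue really is rogue, which accuracy alone does not show when the two classes are imbalanced.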