Trigger-action platforms are systems that let users define, as conditional rules, custom behaviors for Internet-of-Things (IoT) devices and web services. Although these tools stimulate users' creativity in building automations, they can also introduce serious risks. Trigger-action rules can lead users to harm themselves, for example by unintentionally disclosing non-public information or unwittingly exposing their smart environment to cyber-threats. In this paper, we propose to use Natural Language Processing (NLP) techniques to detect automation rules, defined within trigger-action IoT platforms, that potentially violate the security or privacy of users. The proposed NLP-based models capture the semantic and contextual information of trigger-action rules by applying classification techniques to different combinations of rule features. We evaluate the proposed solution on the mainstream trigger-action platform IFTTT, training the NLP models on a dataset of 76,741 rules labeled using an ensemble of three semi-supervised learning techniques. The experimental results show that the model based on BERT (Bidirectional Encoder Representations from Transformers) achieves the best performance when trained on all features, with average Precision and Recall values between 88% and 93%. We also compare this performance with that of a baseline system implementing information flow analysis.
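As a rough illustration of classifying rules over combinations of their textual features, the following minimal sketch joins selected rule fields into one text and feeds it to a classifier. It is not the models evaluated in the paper: a tiny Naive Bayes classifier stands in for BERT, and the field names, the "risky"/"safe" labels, and the example rules are all hypothetical and invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical rule fields; these names are assumptions, not the paper's schema.
FIELDS = ("title", "trigger_service", "trigger", "action_service", "action")


def rule_text(rule, fields=FIELDS):
    """Join the selected rule fields into a list of lowercase tokens."""
    return " ".join(rule[f] for f in fields).lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (a stand-in, not BERT)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for w in tokens:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def make_rule(title, trigger_service, trigger, action_service, action):
    return dict(zip(FIELDS, (title, trigger_service, trigger, action_service, action)))


# Invented toy data, labeled by hand purely for illustration.
TRAIN = [
    (make_rule("Post my location to Twitter", "Location", "You enter an area",
               "Twitter", "Post a tweet"), "risky"),
    (make_rule("Unlock door when I arrive", "Location", "You enter an area",
               "Smart Lock", "Unlock door"), "risky"),
    (make_rule("Turn on lights at sunset", "Weather", "Sunset",
               "Smart Lights", "Turn on lights"), "safe"),
    (make_rule("Log the weather to a spreadsheet", "Weather", "Daily forecast",
               "Sheets", "Add row"), "safe"),
]
TEST = [
    (make_rule("Tweet when I get home", "Location", "You enter an area",
               "Twitter", "Post a tweet"), "risky"),
    (make_rule("Turn on lights when the forecast says rain", "Weather",
               "Daily forecast", "Smart Lights", "Turn on lights"), "safe"),
]

# Train on all fields; passing a subset of FIELDS to rule_text selects
# a different feature combination.
model = NaiveBayes().fit([rule_text(r) for r, _ in TRAIN], [y for _, y in TRAIN])
y_true = [y for _, y in TEST]
y_pred = [model.predict(rule_text(r)) for r, _ in TEST]
precision, recall = precision_recall(y_true, y_pred, positive="risky")
```

Swapping the `fields` argument of `rule_text` (for example, `("title",)` versus all fields) is one simple way to compare feature combinations under the same classifier and metrics.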