The Internet-of-Things (IoT) has brought new challenges in device identification (what the device is) and authentication (whether the device is the one it claims to be). Traditionally, the authentication problem is solved by means of a cryptographic protocol. However, the computational complexity of cryptographic protocols and/or the scalability problems of key management render almost all cryptography-based authentication protocols impractical for IoT. The problem of device identification, on the other hand, is largely neglected. We believe that device fingerprinting can be used to solve both problems effectively. In this work, we present a methodology for device behavioral fingerprinting that can be employed for device type identification. A device's behavior is approximated using features extracted from its network traffic. These features are used to train a machine learning model that can detect similar device types. We validate our approach using five-fold cross-validation and report an identification rate of 86-99% and a mean accuracy of 99% across all our experiments. Our approach is successful even when a device uses encrypted communication. Furthermore, we show preliminary results for fingerprinting device categories, i.e., identifying different device types that have similar functionality.
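As a concrete illustration of the pipeline described above, the sketch below trains a classifier on traffic-derived features and evaluates it with five-fold cross-validation. The specific feature set, the choice of a random-forest model, and the synthetic data are assumptions made for illustration; the abstract does not fix these details.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for per-device traffic features (packet sizes,
# inter-arrival times, protocol counters, ...). Each device type is made
# to produce a distinct feature distribution; real data would come from
# a traffic-capture pipeline, which the abstract does not specify.
rng = np.random.default_rng(0)
y = rng.integers(0, 5, size=1000)                 # 5 device types
X = rng.normal(loc=y[:, None], size=(1000, 12))   # 12 traffic features

# Train a classifier on the behavioral features and validate it with
# five-fold cross-validation, mirroring the evaluation described above.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"per-fold accuracy: {scores.round(3)}")
print(f"mean accuracy:     {scores.mean():.3f}")
```

Because only header- and timing-level statistics are used as features, a workflow of this shape also applies when the device's payload is encrypted, consistent with the claim above.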
Phishing websites trick users into believing that they are interacting with a legitimate website and thereby capture sensitive information such as user names, passwords, credit card numbers, and other personal data. Machine learning appears to be a promising technique for distinguishing phishing websites from legitimate ones. However, machine learning approaches are susceptible to adversarial learning techniques, which attempt to degrade the accuracy of a trained classifier. In this work, we investigate the robustness of machine-learning-based phishing detection in the face of adversarial learning techniques. We propose a simple but effective approach to simulate attacks by generating adversarial samples through direct feature manipulation. We assume that the attacker has limited knowledge of the features, the learning models, and the datasets used for training. We conducted experiments on four publicly available datasets. Our experiments reveal that phishing detection mechanisms are vulnerable to adversarial learning techniques. Specifically, the identification rate for phishing websites dropped to 70% when a single feature was manipulated, and to zero percent when four features were manipulated. This means that any phishing sample that would have been detected correctly by a classifier can bypass it by changing at most four feature values; a small effort for an attacker in exchange for a large reward. We define a vulnerability level for each dataset, measuring the number of features that can be manipulated and the cost of each manipulation. Such a metric allows us to compare multiple defense models.
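The feature-manipulation attack can be sketched as a small search: given a phishing sample that a trained model flags correctly, flip up to four feature values until the model labels it legitimate. The binary feature encoding, the toy training rule, and the `evade` helper below are illustrative assumptions, not the paper's exact procedure or datasets.

```python
import numpy as np
from itertools import combinations
from sklearn.ensemble import RandomForestClassifier

# Stand-in phishing classifier trained on synthetic binary features
# (e.g. has_ip_in_url, abnormal_url_length); the four public datasets
# mentioned above would replace this toy setup in practice.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(2000, 10))
y = (X[:, :4].sum(axis=1) >= 3).astype(int)   # 1 = phishing (toy rule)
clf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

def evade(sample, model, budget=4):
    """Flip at most `budget` binary features of a phishing sample until
    the model labels it legitimate (0). Returns (adversarial sample,
    flipped indices) on success, or None if the budget is exhausted."""
    n = sample.size
    for k in range(1, budget + 1):
        for idxs in combinations(range(n), k):
            candidate = sample.copy()
            candidate[list(idxs)] ^= 1        # flip the chosen feature bits
            if model.predict(candidate.reshape(1, -1))[0] == 0:
                return candidate, idxs
    return None

# Take one correctly detected phishing sample and try to bypass the model.
phish = X[y == 1][0]
result = evade(phish, clf)
if result is not None:
    adv, flipped = result
    print(f"evasion succeeded by flipping features {flipped}")
```

The exhaustive search over feature subsets is deliberately naive; its point is that even a low-cost, low-knowledge attacker fits the threat model assumed above, where at most four manipulated features suffice to bypass detection.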