Mobile devices can be maliciously exploited to violate the privacy of people. In most attack scenarios, the adversary takes the local or remote control of the mobile device, by leveraging a vulnerability of the system, hence sending back the collected information to some remote web service. In this paper, we consider a different adversary, who does not interact actively with the mobile device, but he is able to eavesdrop the network traffic of the device from the network side (e.g., controlling a Wi-Fi access point). The fact that the network traffic is often encrypted makes the attack even more challenging. In this paper, we investigate to what extent such an external attacker can identify the specific actions that a user is performing on her mobile apps. We design a system that achieves this goal using advanced machine learning techniques. We built a complete implementation of this system, and we also run a thorough set of experiments, which show that our attack can achieve accuracy and precision higher than 95%, for most of the considered actions. We compared our solution with the three state-of-the-art algorithms, and confirming that our system outperforms all these direct competitors
Abstract-It is generally recognized that the traffic generated by an individual connected to a network acts as his biometric signature. Several tools exploit this fact to fingerprint and monitor users. Often, though, these tools assume to access the entire traffic, including IP addresses and payloads. This is not feasible on the grounds that both performance and privacy would be negatively affected. In reality, most ISPs convert user traffic into NetFlow records for a concise representation that does not include, for instance, any payloads. More importantly, large and distributed networks are usually NAT'd, thus a few IP addresses may be associated to thousands of users. We devised a new fingerprinting framework that overcomes these hurdles. Our system is able to analyze a huge amount of network traffic represented as NetFlows, with the intent to track people. It does so by accurately inferring when users are connected to the network and which IP addresses they are using, even though thousands of users are hidden behind NAT. Our prototype implementation was deployed and tested within an existing large metropolitan WiFi network serving about 200,000 users, with an average load of more than 1,000 users simultaneously connected behind 2 NAT'd IP addresses only. Our solution turned out to be very effective, with an accuracy greater than 90%. We also devised new tools and refined existing ones that may be applied to other contexts related to NetFlow analysis.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.