Tor traffic tracking is valuable for combating cybercrime as it provides insights into the traffic active on the Tor network. Tor-based application traffic classification is one of the tracking methods, which can effectively classify Tor application services. However, it is not effective in classifying specific applications due to more complicated traffic patterns in the spatial and temporal dimensions. As a solution, the authors propose FlowMFD, a novel Tor-based application traffic classification approach using amount-frequencydirection (MFD) chromatographic features and spatial-temporal modelling. Expressly, FlowMFD mines the interaction pattern between Tor applications and servers by analysing the time series features (TSFs) of different size packets. Then MFD chromatographic features (MFDCF) are designed to represent the pattern. Those features integrate multiple low-dimensional TSFs into a single plane and retain most pattern information. In addition, FlowMFD utilises a cascaded model with a two-dimensional convolutional neural network (2D-CNN) and a bidirectional gated recurrent unit to capture spatialtemporal dependencies between MFDCF. The authors evaluate FlowMFD under the public ISCXTor2016 dataset and the self-collected dataset, where we achieve an accuracy of 92.1% (4.2%↑) and 88.3% (4.5%↑), respectively, outperforming state-of-the-art comparison methods.
The anonymous system Tor uses an asymmetric algorithm to protect the content of communications, allowing criminals to conceal their identities and hide their tracks. This malicious usage brings serious security threats to public security and social stability. Statistical analysis of traffic flows can effectively identify and classify Tor flow. However, few features can be extracted from Tor traffic, which have a weak representational ability, making it challenging to combat cybercrime in real-time effectively. Extracting and utilizing more accurate features is the key point to improving the real-time detection performance of Tor traffic. In this paper, we design an efficient and real-time identification scheme for Tor traffic based on the time window method and bidirectional statistical characteristics. In this paper, we divide the network traffic by sliding the time window and then calculate the relative entropy of the flows in the time window to identify Tor traffic. We adopt a sequential pattern mining method to extract bidirectional statistical features and classify the application types in the Tor traffic. Finally, extensive experiments are carried out on the UNB public dataset (ISCXTor2016) to validate our proposal’s effectiveness and real-time property. The experiment results show that the proposed method can detect Tor flow and classify Tor flow types with an accuracy of 93.5% and 91%, respectively, and the speed of processing and classifying a single flow is 0.05 s, which is superior to the state-of-the-art methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.