Website fingerprinting (WF) lets a local eavesdropper identify the specific websites a user visits by analyzing the encrypted traffic between the user and the Tor entry node. The recent triplet fingerprinting (TF) technique demonstrated that WF attacks are feasible with small samples. However, previous methods concentrate on extracting global features of website traffic and ignore the importance of local fingerprinting characteristics for small-sample WF attacks. This paper therefore proposes a deep nearest neighbor website fingerprinting (DNNF) attack. A convolutional neural network (CNN) extracts deep local fingerprinting features of websites, and a k-nearest neighbor (k-NN) classifier then predicts the website. With only 20 samples per website, the accuracy reaches 96.2%. DNNF also performs well relative to traditional methods under transfer learning and concept drift. Compared with the TF method, its classification accuracy is 2%–5% higher, and it drops by only 3% when classifying data collected from the same websites two months later. These experiments show that DNNF is a more flexible, efficient, and robust website fingerprinting attack, and that local fingerprinting features are particularly important for small-sample WF attacks.
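The classification stage described above (k-NN over learned feature vectors) can be illustrated with a minimal sketch. It assumes the CNN has already mapped each traffic trace to a fixed-length embedding; the toy embeddings, the labels `site_a` and `site_b`, the Euclidean distance, and the function names are all hypothetical choices for illustration, not details taken from the DNNF paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(query, embeddings, labels, k=3):
    """Return the majority label among the k embeddings closest to `query`."""
    # Rank every labelled embedding by its distance to the query.
    ranked = sorted(zip(embeddings, labels),
                    key=lambda pair: euclidean(query, pair[0]))
    nearest_labels = [label for _, label in ranked[:k]]
    # Majority vote over the k nearest neighbours.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D embeddings standing in for CNN outputs (assumption).
train_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.2, 0.1],   # traces of site_a
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],   # traces of site_b
]
train_labels = ["site_a"] * 3 + ["site_b"] * 3

prediction_a = knn_predict([0.05, 0.05], train_embeddings, train_labels)
prediction_b = knn_predict([1.0, 1.0], train_embeddings, train_labels)
```

Because k-NN stores labelled embeddings rather than fitting class-specific weights, new sites can in principle be added by inserting a few embedded samples, which is one plausible reason such a classifier suits small-sample settings.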
Website fingerprinting attacks allow attackers to determine which websites users are visiting by examining the encrypted traffic between the users and the anonymous network portals. Recent research demonstrated the feasibility of website fingerprinting attacks on the Tor anonymity network with only a few samples. Building on this, this paper proposes a novel small-sample website fingerprinting attack method for SSH and Shadowsocks single-agent anonymity network systems, which focuses on analyzing homology relationships between website fingerprints. Based on this analysis, we design a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) attack classification model that achieves 94.8% and 98.1% accuracy in classifying SSH and Shadowsocks anonymous encrypted traffic, respectively, when only 20 samples per site are available. We also show that the CNN-BiLSTM model transfers to new sites significantly better than traditional methods, achieving over 90% accuracy when applied to a new set of monitored sites with only five samples per site. Overall, our experiments demonstrate that CNN-BiLSTM is an efficient, flexible, and robust model for website fingerprinting attack classification.
Deep learning achieves good results in traffic classification thanks to its ability to learn feature representations. However, captured malicious traffic often lacks sufficient data and labels, which makes it difficult to reach the data volume that deep learning requires. Classifying small-sample malicious traffic has therefore become a research hotspot. This paper proposes a small-sample malicious traffic classification method based on deep transfer learning. The proposed DA-Transfer method significantly improves the accuracy and efficiency of the small-sample malicious traffic classification model by integrating a data adaptation module and a model transfer adaptive module. The data adaptation module promotes consistency between the distributions of the source and target datasets, which improves classification performance through adaptive training of the prior model. The model transfer adaptive module recommends the structure parameters of the transfer network, which effectively improves training efficiency. Experiments show that the average classification accuracy of the DA-Transfer method reaches 93.01% on a small-sample dataset with fewer than 200 packets per class. The training efficiency of the DA-Transfer model is improved by 20.02% compared to traditional transfer methods.
Deep learning has achieved good results in traffic classification in recent years owing to its strong feature representation ability. However, existing traffic classification technology cannot meet the requirements of incremental task learning in online scenarios. In addition, because malicious traffic is highly concealed and updated quickly, few labeled samples can be captured, and such small samples cannot drive neural network training, resulting in poor performance of the classification model. Therefore, this paper proposes an incremental learning method for small-sample malicious traffic classification. The method uses a pruning strategy to find redundant network structure and, using the proposed measurement method, dynamically allocates redundant neurons for training according to the difficulty of the new class. This enables the network to learn incrementally without excessive storage and computing cost, and the reasonable allocation improves the classification accuracy of new classes. At the same time, knowledge transfer reduces catastrophic forgetting of old classes, relieves the pressure of training a large number of parameters with small-sample data, and improves classification performance. Experiments involving multiple datasets and settings show that our method is superior to the established baseline in terms of classification accuracy while consuming 50% less memory.
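The idea of finding redundant neurons by pruning and then assigning some of them to a new class can be sketched as follows. The abstract does not specify the pruning criterion or the difficulty measurement, so this sketch substitutes two assumptions: neurons are ranked by the L1 norm of their incoming weights (a common magnitude-pruning heuristic), and the number of neurons allocated scales linearly with a hypothetical `difficulty` score in [0, 1]. All names and values are illustrative only.

```python
import math


def find_redundant_neurons(weights, prune_ratio):
    """Return indices of the lowest-magnitude neurons.

    `weights` holds one list of incoming weights per neuron. Ranking by
    L1 norm is an assumed criterion, not the paper's stated one.
    """
    scores = [sum(abs(w) for w in row) for row in weights]
    n_prune = int(len(weights) * prune_ratio)
    order = sorted(range(len(weights)), key=lambda i: scores[i])
    return order[:n_prune]


def allocate_neurons(redundant, difficulty):
    """Pick a share of the redundant neurons for a new class.

    `difficulty` in [0, 1] is a hypothetical stand-in for the paper's
    measurement; harder classes receive more neurons.
    """
    if not redundant:
        return []
    count = min(len(redundant), max(1, math.ceil(len(redundant) * difficulty)))
    return redundant[:count]


# Toy layer with four neurons and two incoming weights each (assumption).
layer_weights = [
    [0.9, -0.8],    # strongly used
    [0.01, 0.02],   # nearly inactive
    [0.5, 0.4],     # moderately used
    [-0.05, 0.0],   # nearly inactive
]

redundant = find_redundant_neurons(layer_weights, prune_ratio=0.5)
easy_class = allocate_neurons(redundant, difficulty=0.5)
hard_class = allocate_neurons(redundant, difficulty=1.0)
```

Reusing neurons that pruning marks as redundant keeps the network size fixed as classes are added, which is consistent with the memory savings the abstract reports.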