In recent years, web servers have paid increasing attention to privacy and anonymity protection, and many rely on hidden services to avoid exposing their real geographic locations. Several studies have confirmed that hidden services are vulnerable to flow correlation attacks: after observing the flow for an extended period of time, the attacker is able to synchronize the behavior of both sides of the communication. However, because the hidden service descriptor publishing flow is transient behavioral traffic, automatically capturing and analyzing it is a challenge. In this paper, we focus on the intelligent identification of the descriptor publishing flow. We propose a model for the descriptor publishing flow correlation attack (DPFCA). The model resolves the complex relationship between the circuit establishment flow and the publishing flow, and is able to intelligently process the sequence identification and content classification of the descriptor correlation flow of the existing version and tags. It is worth mentioning that the DPFCA is based on the automated homology comparison of the profile‐hidden Markov model (PHMM). The descriptor publishing flow is converted to an amino symbol sequence and then compared with the known homologous sequence group in the Profile library. The experimental results show that our model achieves higher accuracy and reliability in transient flow identification than the traditional flow correlation attack model.
Website fingerprinting (WF) techniques enable local eavesdroppers to identify the specific website a user visits by inspecting the encrypted traffic between the user and the Tor network entry node. The current triplet fingerprinting (TF) technique proved the possibility of small-sample WF attacks. Previous research methods concentrate only on extracting the overall features of website traffic and ignore the importance of local website fingerprinting characteristics for small-sample WF attacks. Thus, in the present paper, a deep nearest neighbor website fingerprinting (DNNF) attack technique is proposed. The deep local fingerprinting features of websites are extracted by a convolutional neural network (CNN), and a k-nearest neighbor (k-NN) classifier then makes the prediction. When only 20 samples per website are available, the accuracy can reach 96.2%. We also found that the DNNF method performs well compared with traditional methods in coping with transfer learning and concept drift problems. Compared with the TF method, the classification accuracy of the proposed method is improved by 2%–5%, and it drops by only 3% when classifying data collected from the same websites two months later. These experiments show that DNNF is a more flexible, efficient, and robust website fingerprinting attack technique, and that the local fingerprinting features of websites are particularly important for small-sample WF attacks.
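The classification stage described above, a k-NN vote over CNN-derived feature vectors, can be sketched in a few lines. This is a minimal illustration, assuming the CNN has already produced fixed-length embeddings. The two-dimensional vectors, the site labels, and the function name `knn_predict` are made up for the example, and Euclidean distance is an assumption, since the abstract does not name the metric.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest embeddings.

    train: list of (embedding, label) pairs; embeddings are equal-length
           sequences of floats (assumed to come from a CNN feature extractor).
    query: embedding of the trace to classify.
    """
    # Sort training samples by Euclidean distance to the query, keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy embeddings for two hypothetical monitored sites.
train = [
    ([0.0, 0.1], "site_a"), ([0.2, 0.0], "site_a"), ([0.1, 0.2], "site_a"),
    ([1.0, 1.1], "site_b"), ([0.9, 1.0], "site_b"), ([1.1, 0.9], "site_b"),
]
print(knn_predict(train, [0.1, 0.0]))
```

Because k-NN has no training phase of its own, new monitored sites can be added by appending their embeddings to `train`, which is one plausible reason the approach suits the small-sample setting.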
Website fingerprinting attacks allow attackers to determine which websites users connect to by examining the encrypted traffic between the users and the anonymous network portals. Recent research demonstrated the feasibility of website fingerprinting attacks on Tor anonymous networks with only a few samples. Building on this, this paper proposes a novel small-sample website fingerprinting attack method for SSH and Shadowsocks single-agent anonymity network systems, which focuses on analyzing homology relationships between website fingerprints. Based on these relationships, we design a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) attack classification model that achieves 94.8% and 98.1% accuracy in classifying SSH and Shadowsocks anonymous encrypted traffic, respectively, when only 20 samples per site are available. We also highlight that the CNN-BiLSTM model has significantly better migration capabilities than traditional methods, achieving over 90% accuracy when applied to a new set of monitored sites with only five samples per site. Overall, our experiments demonstrate that CNN-BiLSTM is an efficient, flexible, and robust model for website fingerprinting attack classification.
It has been shown that website fingerprinting attacks are capable of destroying the anonymity of the communicator at the traffic level. This enables local attackers to infer the website contents of the encrypted traffic by using packet statistics. Previous research on hidden service attacks tends to focus on active attacks; therefore, the reliability of attack conditions and the validity of test results cannot be fully verified. Hence, it is necessary to reexamine hidden service attacks from the perspective of fingerprinting attacks. In this paper, we propose a novel Website Response Fingerprinting (WRFP) attack, based on the response time feature and the extremely randomized trees algorithm, to analyze the hidden information of the response fingerprint. The objective is to monitor hidden service website pages, service types, and mounted servers. WRFP relies on the hidden service response fingerprinting dataset. In addition to simulated website mirroring, two different mounting modes are taken into account: the same-source server and the multisource server. A total of 300,000 page instances within 30,000 domain sites are collected, and we comprehensively evaluate the classification performance of the proposed WRFP. Our results show that the true positive rate (TPR) of webpage and server classification remains greater than 93% in the small-scale closed-world performance test, and that WRFP is capable of tolerating up to 10% fluctuations in response time. WRFP also provides higher accuracy and computational efficiency than traditional website fingerprinting classifiers in the challenging open-world performance test, which indicates the importance of the response time feature. Our results also suggest that monitoring website types improves the classifier's judgment of subpages.