Detecting unauthorized users can be problematic for current techniques when the nefarious actors use identity-hiding tools such as anonymising proxies or virtual private networks (VPNs). This work presents computational models to address the limitations currently experienced in detecting VPN traffic. A model to detect VPN usage was developed using a multi-layered perceptron neural network trained on flow statistics data found in the transmission control protocol (TCP) header of captured network packets. Validation testing showed that the presented models can classify network traffic in a binary manner as direct (originating directly from a user's own device) or indirect (making use of the identity and location hiding features of VPNs) with a high degree of accuracy. The experiments conducted to classify OpenVPN usage found that the neural network correctly identified the VPN traffic with an overall accuracy of 93.71%. The further work done to classify Stunnel OpenVPN usage found that the neural network correctly identified VPN traffic with an overall accuracy of 97.82% when using 10-fold cross validation. This final experiment also compared three different validation techniques and the different accuracy results obtained. These results demonstrate a significant advancement in the detection of unauthorised user access, with evidence that further advances are possible in this field, particularly in business security applications where detecting VPN usage is important to an organization.
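The 10-fold cross validation mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`k_fold_indices`, `cross_validate`, `train_centroids`, `predict_centroid`) are hypothetical, a simple nearest-centroid classifier stands in for the multi-layered perceptron, and the two-feature data set is synthetic and generated only to make the example runnable. It does not reproduce the reported accuracies.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, train_fn, predict_fn, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average accuracy."""
    folds = k_fold_indices(len(features), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, features[i]) == labels[i]
                      for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# Stand-in classifier (assumption): nearest centroid, NOT the paper's MLP.
def train_centroids(features, labels):
    """Compute one mean feature vector per class label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(model,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


# Synthetic, illustrative data only: two well-separated clusters standing in
# for "direct" and "indirect" flow-statistics vectors.
rng = random.Random(1)
direct = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
indirect = [[rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)] for _ in range(50)]
features = direct + indirect
labels = ["direct"] * 50 + ["indirect"] * 50

mean_accuracy = cross_validate(features, labels,
                               train_centroids, predict_centroid)
print(f"mean 10-fold accuracy on synthetic data: {mean_accuracy:.3f}")
```

Because `cross_validate` takes the training and prediction functions as arguments, a neural network could be substituted for the stand-in classifier without changing the fold logic.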
There is an increasing need to classify whether an incoming packet comes from a legitimate originating IP address or has been modified through an intermediate proxy or node. Being able to verify the originating IP address allows a business (e.g. a bank) to use geolocation services to ascertain which geographical location the packet was sent from. This can then feed into the intrusion detection system or backend fraud alert mechanisms. The web, however, is going 'dark'. There is a noticeable uptake in the amount of encrypted data and in third-party anonymous traffic proxies that aim to mask the true location and IP address of a web request. We present here a system that identifies the characteristics or signatures of web proxy usage by developing a Detection System that records packets and analyses them, looking for identifying patterns of web proxies.
The emergence and growth of cloud computing has made a serious impact on the IT industry in recent years, with large companies starting to offer powerful, reliable and cost-efficient platforms for businesses to build and reshape their business models. Showing no sign of slowing down, cloud computing capabilities now include machine learning, with facilities for both designing and deploying models. Alongside this capability comes the increasing need to classify whether an incoming connection is from a legitimate originating IP address or is being sent through an intermediary such as a web proxy. Taking inspiration from Intrusion Detection Systems that use machine learning to improve anomaly detection accuracy, this paper proposes that cloud-based machine learning can be used to detect and classify web proxy usage by capturing packet data and feeding it into a cloud-based machine learning web service.
Many businesses and educational facilities employ some form of filtering to control what internet sites their users may browse. This is done to help protect network assets, to protect data from being stolen and to comply with company policies on internet usage. Anonymous proxies (or web proxies) can be used by the end users to bypass most filtering systems put in place by businesses and this can remove the protection that the filtering systems provide for the network. For instance, unless the web proxy being used is being hosted by the end user or someone they know, then the identity of whoever is hosting the proxy is unknown and they potentially cannot be trusted. The proxy website could also have been set up to eavesdrop on the data flow between the end user's machine and the internet. Sites set up to do this would normally log information for later inspection and data sent from a business user's machine could contain potentially confidential information about the company or the user themselves. This research aims to identify the characteristics or signatures whenever a user is using a web proxy by developing a Detection System that records packets and analyses them looking for identifying patterns of web proxies. One of the main focuses of the research will be in detecting the usage of proxy websites that make use of SSL to encrypt the contents of their packets.
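The idea of recording packets and analysing them for identifying patterns can be illustrated by grouping captured packet records into flows and computing simple per-flow statistics. The sketch below is only an assumption about what such a step might look like: the `Packet` record, the `flow_statistics` helper and the chosen statistics are hypothetical and are not taken from the research described above, and the addresses are reserved documentation addresses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical record for one captured packet."""
    timestamp: float  # capture time in seconds
    src: str
    dst: str
    src_port: int
    dst_port: int
    size: int         # packet length in bytes


def flow_statistics(packets):
    """Group packets by (src, src_port, dst, dst_port) and summarise each flow."""
    flows = defaultdict(list)
    for p in packets:
        flows[(p.src, p.src_port, p.dst, p.dst_port)].append(p)

    stats = {}
    for key, pkts in flows.items():
        pkts.sort(key=lambda p: p.timestamp)
        times = [p.timestamp for p in pkts]
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        total_bytes = sum(p.size for p in pkts)
        stats[key] = {
            "packets": len(pkts),
            "bytes": total_bytes,
            "mean_size": total_bytes / len(pkts),
            "duration": times[-1] - times[0],
            # Mean inter-arrival time; 0.0 when the flow has a single packet.
            "mean_gap": sum(gaps) / len(gaps) if gaps else 0.0,
        }
    return stats


# Illustrative capture: three packets on one flow, one packet on another.
capture = [
    Packet(0.0, "192.0.2.1", "198.51.100.7", 50000, 443, 100),
    Packet(0.5, "192.0.2.1", "198.51.100.7", 50000, 443, 300),
    Packet(0.2, "192.0.2.1", "198.51.100.7", 50001, 443, 60),
    Packet(1.5, "192.0.2.1", "198.51.100.7", 50000, 443, 200),
]

for flow, summary in flow_statistics(capture).items():
    print(flow, summary)
```

This treats each direction of a connection as a separate flow. Statistics of this kind depend only on packet headers and timing, so they remain available when the payload is encrypted with SSL.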