The Internet as the whole is a network of multiple computer networks and their massive infrastructure. The web is made up of accessible websites through search engines such as Google, Firefox, etc. and it is known as the Surface Web. The Internet is segmented further in the Deep Web-the content that it is not indexed and cannot access by traditional search engines. Dark Web considers a segment of the Deep Web. It accesses through TOR. Actors within Dark Web websites are anonymous and hidden. Anonymity, privacy and the possibility of non-detection are three factors that are provided by special browser such as TOR and I2P. In this paper, we are going to discuss and provide results about the influence of the Dark Web in different spheres of society. It is given the number of daily anonymous users of the Dark Web (using TOR) in Kosovo as well as in the whole world for a period of time. The influence of hidden services websites is shown and results are gathered from Ahimia and Onion City Dark Web's search engines. The anonymity is not completely verified on the Dark Web. TOR dedicates to it and has intended to provide anonymous activities. Here are given results about reporting the number of users and in which place(s) they are. The calculation is based on IP addresses according to country codes from where comes the access to them and report numbers in aggregate form. In this way, indirect are represented the Dark Web users. The number of users in anonymous networks on the Dark Web is another key element that is resulted. In such networks, users are calculated through the client requests of directories (by TOR metrics) and the relay list is updated. Indirectly, the number of users is calculated for the anonymous networks.
This paper analyses the impact of current trend in applying machine learning in detection of vandalism, with the specific aim of analyzing the impact of the class imbalance in Wikipedia articles. The class imbalance problem has the effect that almost all the examples are labelled as one class (legitimate editing); while far fewer examples are labelled as the other class, usually the more important class (vandalism). The obtained results show that resampling strategies: Random Under Sampling (RUS) and Synthetic Minority Oversampling TEchnique (SMOTE) have a partial effect on the improvement of the classification performance of all tested classifiers, excluding Random Forest, on both tested languages (simple English and Albanian) of the Wikipedia. The results from experimentation extended on two different languages show that they are comparable to the existing work.
In this paper, we evaluate a list of classifiers in order to use them in the detection of vandalism by focusing on metadata features. Our work is focused on two low resource data sets (Simple English and Albanian) from Wikipedia. The aim of this research is to prove that this form of vandalism detection applied in one data set (language) can be extended into another data set (language). Article views data sets in Wikipedia have been used rarely for the purpose of detecting vandalism. We will show the benefits of using article views data set with features from the article revisions data set with the aim of improving the detection of vandalism.The key advantage of using metadata features is that these metadata features are language independent and simple to extract because they require minimal processing. This paper shows that application of vandalism models across low resource languages is possible, and vandalism can be detected through view patterns of articles.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.