In this paper, we present how we combined visualization and machine learning techniques to provide an analytic tool for web log data. We designed a visualization in which advertisers can observe the visits to their different pages on a site, common web analytics measures, and individual user navigation on the site. In this visualization, users can gain insights into the data by looking at key elements of the graph. Additionally, we applied pattern mining techniques to observe common trends in user segments of interest.
Reconstruction‐based one‐class classification has been shown to be very effective in a number of domains. This approach works by attempting to capture the underlying structure of the normal class, typically by means of clusters of objects. Its main disadvantage, however, is that the number of clusters has to be indicated in advance, since fixing it yields an efficient way of computing a clustering. In this paper, we introduce a new algorithm, OCKRA++, which achieves better performance by enhancing a clustering‐based one‐class ensemble classifier (OCKRA) with a cluster validity index that is used to set the best number of clusters during the classifier's training process. We have thoroughly tested OCKRA++ in a particular domain, namely masquerade detection. For this purpose, we have used the Windows‐Users and ‐Intruder simulation Logs data set repository, which contains 70 different masquerade data sets. We have found that OCKRA++ is currently the algorithm that achieves the best area under the curve, with a significant difference, in masquerade detection using the file system navigation approach.
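The idea of letting a cluster validity index choose the number of clusters can be illustrated with a minimal sketch. This is not the OCKRA++ implementation: the abstract does not name the validity index, so the mean silhouette width is assumed here, and a plain k-means with a deterministic farthest-point seeding stands in for the randomized seeding an actual classifier would likely use. All function names (`farthest_point_init`, `kmeans`, `silhouette`, `choose_k`) and the toy data are hypothetical.

```python
from math import dist

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at points[0],
    then repeatedly add the point farthest from its nearest chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c])) for p in points]

def kmeans(points, k, iters=20):
    """Plain Lloyd iterations; returns the final cluster label of each point."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centroids)

def silhouette(points, labels):
    """Mean silhouette width: higher means tighter, better-separated clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_range=range(2, 7)):
    """Pick the number of clusters that maximizes the validity index."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_range}
    return max(scores, key=scores.get)

# Toy "normal class" data: three tight, well-separated groups of five points.
offsets = [(0, 0), (0.5, 0), (0, 0.5), (-0.5, 0), (0, -0.5)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (5, 9)] for dx, dy in offsets]
best_k = choose_k(points)
```

On this toy data the index favors three clusters, matching how the data were generated. In a one-class setting, a selection step of this kind would run on the normal-class training data only, so no prior guess of the cluster count is needed.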
Latent fingerprint identification is one of the leading forensic activities to clarify criminal acts. However, its computational cost hinders rapid decision making in the identification of an individual when large databases are involved. To reduce the time needed to generate the ordered list of candidate fingerprints to be compared, fingerprint indexing algorithms are developed that reduce the search space while minimizing the increase in the error rate (compared to identification). In the present research, we propose an algorithm for indexing latent fingerprints based on minutia cylinder codes (MCC). This type of minutiae descriptor presents a fixed structure, which brings advantages in terms of efficiency. Besides, in recent studies, this descriptor has shown an identification error rate, at the local level, lower than the other descriptors reported in the literature. Our indexing proposal requires an initial step to construct the indices, in which it uses the k-means++ clustering algorithm to create groups of similar minutia cylinder codes corresponding to the impressions of a set of databases. K-means++ allows for a better outcome than other clustering algorithms because of its selection of proper initial centroids. The buckets associated with each index are populated with the background databases. Then, given a latent fingerprint, the algorithm extracts the minutia cylinder codes associated with the clusters' indices with the lowest distance with respect to each descriptor of this latent fingerprint. Finally, it integrates the votes represented by the fingerprints obtained to select the candidate impressions. We conduct a set of experiments in which our proposal outperforms current rival algorithms in the presence of different databases and descriptors.
Also, the primary experiment reduces the search space by four orders of magnitude when the background database contains more than one million impressions.
INDEX TERMS Fingerprint indexing, Latent fingerprint, K-means Clustering, Minutia Cylinder Code.
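The bucket-and-vote scheme described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm: short numeric tuples with Euclidean distance stand in for real minutia cylinder codes, the centroids are assumed to have been computed beforehand (for example by k-means++), and the names `nearest_centroids`, `build_index`, and `query` as well as the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import dist

def nearest_centroids(descriptor, centroids, n=1):
    """Indices of the n centroids closest to a descriptor."""
    order = sorted(range(len(centroids)), key=lambda c: dist(descriptor, centroids[c]))
    return order[:n]

def build_index(background, centroids):
    """Populate one bucket per centroid with the ids of the background
    impressions that have at least one descriptor falling in that cluster."""
    index = defaultdict(set)
    for fp_id, descriptors in background.items():
        for d in descriptors:
            index[nearest_centroids(d, centroids)[0]].add(fp_id)
    return index

def query(latent_descriptors, index, centroids, n=1, top=5):
    """Each latent descriptor votes for every impression stored in its n
    nearest buckets; candidates are returned ordered by vote count."""
    votes = Counter()
    for d in latent_descriptors:
        for c in nearest_centroids(d, centroids, n):
            votes.update(index.get(c, ()))
    return [fp_id for fp_id, _ in votes.most_common(top)]

# Toy data: four assumed cluster centroids and three background impressions.
centroids = [(0, 0), (10, 0), (0, 10), (10, 10)]
background = {
    "A": [(0.1, 0.2), (9.8, 0.1)],
    "B": [(0.2, 9.9), (10.1, 9.7)],
    "C": [(0.3, 0.1), (9.9, 9.9)],
}
index = build_index(background, centroids)
candidates = query([(0.0, 0.3), (10.2, 0.2)], index, centroids)
```

Here impression "A" shares both buckets with the latent query, "C" shares one, and "B" shares none, so only "A" and "C" are returned as candidates, with "A" first. The search space shrinks because impressions that never collide with the query in any bucket are not compared at all.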