In recent years, Online Social Networks (OSNs) have become an integral part of our daily lives. There are hundreds of OSNs, each with its own focus and its own set of services and functionalities. To take advantage of the full range of services that OSNs offer, users often create accounts on several OSNs using the same or different personal information. Retrieving all available data about an individual from several OSNs and merging it into one profile can be useful for many purposes. In this paper, we present a method for solving the Entity Resolution (ER) problem of matching user profiles across multiple OSNs. Our algorithm matches two user profiles from two different OSNs using machine learning techniques applied to features extracted from each profile. Using supervised learning techniques and the extracted features, we constructed several classifiers, which were trained and then used to rank the probability that two user profiles from two different OSNs belong to the same individual. These classifiers used 27 features of three main types: name-based features (e.g., the Soundex values of the two names), general user-information-based features (e.g., the cosine similarity between two user profiles), and social network topology-based features (e.g., the number of mutual friends between the two users' friends lists). This experimental study uses real-life data collected from two popular OSNs, Facebook and Xing. The proposed algorithm was evaluated, and its classification performance, measured by AUC, was 0.982 in identifying user profiles across the two OSNs.
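The three feature types named above can be illustrated with a minimal Python sketch. The abstract does not specify how the 27 features are computed, so everything here is an assumption made for illustration: the profile layout (`name`, `about`, `friends`), the function names, the use of standard American Soundex, and the whitespace tokenization used for cosine similarity.

```python
import math
from collections import Counter

# Standard American Soundex digit groups; vowels, H, W and Y have no code.
_SOUNDEX_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character Soundex code of a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (encoded + "000")[:4]


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts over whitespace-token counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mutual_friends(friends_a, friends_b):
    """Number of friend names that appear in both friends lists."""
    return len(set(friends_a) & set(friends_b))


def pair_features(profile_a, profile_b):
    """Build one feature per type for a candidate profile pair (hypothetical layout)."""
    return {
        "soundex_equal": soundex(profile_a["name"]) == soundex(profile_b["name"]),
        "about_cosine": cosine_similarity(profile_a["about"], profile_b["about"]),
        "mutual_friends": mutual_friends(profile_a["friends"], profile_b["friends"]),
    }
```

A feature dictionary like the one returned by `pair_features` would be the input row for a supervised classifier; the classifier itself is outside the scope of this sketch.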
Online Social Networks (OSNs), such as Facebook and Twitter, have become an integral part of our daily lives. There are hundreds of OSNs, each with its own focus and its own set of services and functionalities. Recent studies show that many OSN users create accounts on multiple OSNs using the same or different personal information. Collecting all the available data about an individual from several OSNs and fusing it into a single profile can be useful for many purposes. In this paper, we introduce novel machine-learning-based methods for solving the Entity Resolution (ER) problem of matching user profiles across multiple OSNs. The presented methods match two user profiles from two different OSNs using supervised learning techniques applied to features extracted from each profile. Using the extracted features and supervised learning techniques, we developed classifiers that perform entity matching between two profiles in the following scenarios: (a) matching entities across two OSNs; (b) searching for a user by similar name; and (c) de-anonymizing a user's identity. The constructed classifiers were tested using data collected from two popular OSNs, Facebook and Xing. We then evaluated the classifiers' performance using several evaluation measures, including true and false positive rates, accuracy, and the Area Under the Receiver Operating Characteristic Curve (AUC). The classifiers achieved an AUC of up to 0.982 and an accuracy of up to 95.9% in identifying user profiles across the two OSNs.
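As a rough illustration of the evaluation measures listed above, the following Python sketch computes the true positive rate, false positive rate, and accuracy at a fixed threshold, and the AUC as the fraction of (matching, non-matching) pairs that the classifier ranks correctly. The function names, the 0/1 label encoding, and the sample scores are assumptions for illustration only; they are not data or code from the paper.

```python
def rates_at_threshold(labels, scores, threshold):
    """Return (TPR, FPR, accuracy) when score >= threshold predicts a match."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= threshold
        if label == 1 and predicted:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    accuracy = (tp + tn) / len(labels) if labels else 0.0
    return tpr, fpr, accuracy


def auc(labels, scores):
    """AUC as the probability that a random match outscores a random non-match."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one match and one non-match")
    # Ties count as half a win, the usual convention for the rank statistic.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for four candidate profile pairs (1 = same individual).
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = auc(example_labels, example_scores)
example_rates = rates_at_threshold(example_labels, example_scores, 0.45)
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for a small illustration; a sort-based computation would be the usual choice for larger evaluation sets.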