The precision of conventional user identification algorithms is unsatisfactory because they ignore the role of user-generated data in identity matching. In this paper, we propose a frequent pattern mining-based cross-social network user identification algorithm that analyzes user-generated data in a personalized manner. We adopt a posterior probability-based information entropy weight allocation method, which improves the precision rate and recall rate compared with the empirical weight allocation method. Extensive simulations demonstrate that the proposed algorithm enhances the precision rate, the recall rate, and the F-Measure (F1). Index terms: user identification, frequent pattern, cross-social network, information entropy.
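As a rough illustration of entropy-based weight allocation, the sketch below implements the generic entropy weight method: features whose values vary more across samples receive larger weights. It is not the posterior probability-based variant the abstract refers to, whose details are not given here; the function name `entropy_weights` and the input layout (one row of non-negative feature similarities per candidate pair) are assumptions made for this example.

```python
import math


def entropy_weights(rows):
    """Generic entropy weight method (illustrative, not the paper's variant).

    rows: list of equal-length lists of non-negative feature values,
          one row per sample (e.g. per candidate account pair).
    Returns one weight per feature; the weights sum to 1.
    """
    n = len(rows)
    m = len(rows[0])
    diversities = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total == 0 or n < 2:
            # A constant-zero column (or a single sample) carries no information.
            diversities.append(0.0)
            continue
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # normalize to [0, 1]
        # Clamp to guard against tiny negative values from rounding.
        diversities.append(max(0.0, 1.0 - entropy))
    norm = sum(diversities)
    if norm == 0:
        # No feature is informative: fall back to equal weights.
        return [1.0 / m] * m
    return [d / norm for d in diversities]


# Hypothetical data: feature 0 is identical for every pair, feature 1 varies.
weights = entropy_weights([[0.5, 0.9], [0.5, 0.1], [0.5, 0.5]])
```

In this toy input the first feature is the same for every sample, so nearly all of the weight goes to the second feature.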
Social networking is an interactive Internet of Things. The symmetry of the network can reflect the similar friendships of users on different social networks. A user’s behavior habits are not easy to change, and users usually have the same or similar display names and published content across multiple social networks. Therefore, the symmetry concept can be used to analyze user-generated information for user identification. User identification plays a key role in building more complete social network user profiles. As a consequence, it has practical significance in many network applications and has attracted a great deal of attention from researchers. However, existing works focus primarily on rich network data and ignore the difficulty of data acquisition. Display names and user-published content are much easier to obtain than other types of user data across different social networks. Therefore, this paper proposes an across social networks user identification method based on user behavior habits (ANIUBH). We analyzed users’ personalized naming habits in terms of display names, then used different similarity calculation methods to measure the similarity of the features contained in the display names. A variant entropy value was adopted to assign weights to these features. In addition, we measured and analyzed the user’s interest graph to further improve user identification performance. Finally, we combined a one-to-one constraint with the Gale–Shapley algorithm to eliminate the one-to-many and many-to-many account-matching problems that often occur during the results-matching process. Experimental results demonstrated that the proposed method makes user identification possible using only a small amount of online data.
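The one-to-one matching step can be sketched with a textbook Gale–Shapley procedure run over a similarity matrix. This is a minimal illustration under assumptions, not the ANIUBH implementation: the function name `gale_shapley`, the matrix layout, and the choice to rank both sides by the same similarity score are all hypothetical.

```python
def gale_shapley(similarity):
    """Stable one-to-one matching between accounts on two networks.

    similarity[i][j]: assumed similarity score between account i on
    network A and account j on network B (higher means more alike).
    Returns a dict mapping A indices to B indices.
    """
    n_a = len(similarity)
    n_b = len(similarity[0]) if n_a else 0
    # Each A account ranks B accounts by descending similarity.
    prefs = [
        sorted(range(n_b), key=lambda j, i=i: -similarity[i][j])
        for i in range(n_a)
    ]
    next_choice = [0] * n_a  # index of the next B account each A account proposes to
    holder = {}              # B index -> A index it currently holds
    free = list(range(n_a))
    while free:
        i = free.pop()
        if next_choice[i] >= n_b:
            continue  # i has proposed to everyone and stays unmatched
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = holder.get(j)
        if current is None:
            holder[j] = i
        elif similarity[i][j] > similarity[current][j]:
            holder[j] = i          # j prefers the new proposer
            free.append(current)   # the displaced account proposes again
        else:
            free.append(i)         # rejected; i tries its next choice
    return {i: j for j, i in holder.items()}


# Hypothetical scores for three accounts on each network.
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
    [0.20, 0.70, 0.60],
]
matching = gale_shapley(scores)
```

Because every account ends up in at most one pair, one-to-many and many-to-many results cannot appear in the output. In practice a minimum-similarity threshold would likely be applied as well, so that weak pairs are left unmatched.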
With the rapid development of the Internet of Things (IoT) in 4G/5G deployments, the amount of network data generated by users has exploded, which has not only revolutionized daily life but also allowed malicious actors to use these data to attack the privacy of ordinary users. Therefore, it is crucial to identify the real-world users behind multiple virtual accounts. To address the low precision of user identification under the many-to-many mechanism, this study proposes a random forest confirmation algorithm based on stable marriage matching (RFCA-SMM). It consists of three key steps: first, we employ the stable marriage matching model to calculate the similarity between multiple users and use a scoring model to calculate the overall similarity of the users, after which candidate matching pairs are selected; second, we construct a random forest model trained on a set of user similarity vectors; third, the candidate matching pairs undergo secondary confirmation by the random forest model, which both improves the precision of many-to-many user identification and protects private user data in the IoT. Extensive experiments demonstrate that the proposed algorithm improves the precision rate, recall rate, F-Measure (F1), and Area Under Curve (AUC).
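For readers unfamiliar with the evaluation metrics named above, the following sketch computes precision, recall, and F1 from sets of account pairs. The function name `precision_recall_f1` and the pair representation are assumptions for illustration; the papers' own evaluation code is not shown in the source.

```python
def precision_recall_f1(predicted_pairs, true_pairs):
    """Precision, recall, and F1 for predicted account pairs.

    Each pair is a hashable (account_on_A, account_on_B) tuple.
    """
    predicted = set(predicted_pairs)
    truth = set(true_pairs)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: one of three predictions is correct,
# and four true pairs exist in total.
p, r, f1 = precision_recall_f1(
    [("a1", "b1"), ("a2", "b3"), ("a3", "b2")],
    [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")],
)
```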
With the popularization of the Internet and the arrival of the big data era, numerous social networks (SNs) have emerged to satisfy users' social needs and offer them rich content and convenient services. Under these circumstances, identifying multiple social accounts belonging to the same user across different SNs is of great importance for many applications. Across social networks user identification (ASNUI) can help complete user information, support personalized service recommendation and data mining, and provide support for scientific research. This paper first systematically introduces the application of ASNUI in the field of social computing, then states its applications and challenges, and reviews the adopted models, frameworks, and performance comparisons of state-of-the-art techniques used in ASNUI. Finally, we identify a few future research directions in ASNUI, such as weight allocation of user attribute information, the fusion of multi-dimensional information, and large-scale user identification. Index terms: big data, across social networks, user identification, entity user.
Identifying the offline entities corresponding to multiple virtual accounts of users across social networks is crucial for the development of related fields, such as user recommendation systems, network security, and user behavior pattern analysis. The data generated by users on multiple social networks has similarities. Thus, the concept of symmetry can be used to analyze user-generated information for user identification. In this paper, we propose a friendship networks-based user identification across social networks algorithm (FNUI), which computes the similarity of a user's multi-hop neighbor nodes to fully characterize the information redundancy in friend networks. Subsequently, a gradient descent algorithm is used to optimize the contribution of the user’s multi-hop nodes in the user identification process. Ultimately, user identification is achieved in conjunction with the Gale–Shapley matching algorithm. Experimental results in comparison with baselines such as friend relationship-based user identification (FRUI) and friendship learning-based user identification (FBI) show that: (1) the contribution of single-hop neighbor nodes in the user identification process is higher than that of other multi-hop neighbor nodes; (2) the redundancy of information contained in multi-hop neighbor nodes has a more significant impact on user identification; (3) the precision rate, recall rate, comprehensive evaluation index (F1), and area under curve (AUC) of user identification are improved.
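The idea of scoring a candidate pair by the overlap of its multi-hop neighborhoods can be sketched as follows. This is only an illustration under stated assumptions, not the FNUI algorithm: the Jaccard overlap, the `seeds` mapping of already-identified accounts, and the fixed `hop_weights` are all hypothetical (the abstract says the hop contributions are optimized by gradient descent, which is not reproduced here).

```python
def k_hop_neighbors(graph, node, k):
    """Nodes exactly k hops from `node`, found layer by layer with BFS.

    graph: dict mapping each node to an iterable of its friends.
    """
    seen = {node}
    frontier = {node}
    for _ in range(k):
        next_frontier = set()
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in seen:
                    next_frontier.add(v)
        seen |= next_frontier
        frontier = next_frontier
    return frontier


def multi_hop_similarity(graph_a, graph_b, u, v, seeds, hop_weights):
    """Weighted sum of per-hop neighborhood overlaps for the pair (u, v).

    seeds: assumed dict of known matches, network-A node -> network-B node.
    hop_weights: assumed weight for hop 1, hop 2, ... (fixed here, not learned).
    """
    score = 0.0
    for k, weight in enumerate(hop_weights, start=1):
        # Translate u's k-hop neighbors into network B through known matches.
        mapped = {seeds[x] for x in k_hop_neighbors(graph_a, u, k) if x in seeds}
        theirs = k_hop_neighbors(graph_b, v, k)
        union = mapped | theirs
        if union:
            score += weight * len(mapped & theirs) / len(union)
    return score


# Hypothetical friend networks with three already-identified users.
graph_a = {"a": ["x", "y"], "x": ["a", "z"], "y": ["a"], "z": ["x"]}
graph_b = {"b": ["p", "q"], "p": ["b", "r"], "q": ["b"], "r": ["p", "c"], "c": ["r"]}
seeds = {"x": "p", "y": "q", "z": "r"}
same_user = multi_hop_similarity(graph_a, graph_b, "a", "b", seeds, [0.7, 0.3])
other_user = multi_hop_similarity(graph_a, graph_b, "a", "c", seeds, [0.7, 0.3])
```

Here the pair ("a", "b") shares both its one-hop and two-hop neighborhoods through the seed matches and scores highest, while ("a", "c") shares none. The resulting scores could then feed a one-to-one matching step such as Gale–Shapley.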