Meta-path-based link prediction in schema-rich heterogeneous information network

Cao, Xiaohuan; Zheng, Yuyan; Shi, Chuan; Li, Jingzhi; Wu, Bin

doi:10.1007/s41060-017-0046-1

Cited by 21 publications

(6 citation statements)

References 34 publications


“…One of the drawbacks of these algorithms is that they require manual predefinition and enumeration of meta-paths. This may be not feasible for schema-rich HMLN or the relations that involve multiple hopping paths (Cao et al, 2017), e.g. relations inferred through thousands of similar chemicals.…”

Section: Meta-path-based Algorithms (mentioning)

confidence: 99%

Heterogeneous Multi-Layered Network Model for Omics Data Integration and Analysis

et al. 2020


Advances in next-generation sequencing and high-throughput techniques have enabled the generation of vast amounts of diverse omics data. These big data provide an unprecedented opportunity in biology, but impose great challenges in data integration, data mining, and knowledge discovery due to the complexity, heterogeneity, dynamics, uncertainty, and high dimensionality inherent in the omics data. Networks have been widely used to represent relations between entities in biological systems, such as protein-protein interaction, gene regulation, and brain connectivity (i.e. network construction), as well as to infer novel relations given a reconstructed network (i.e. link prediction). In particular, the heterogeneous multi-layered network (HMLN) has proven successful in integrating diverse biological data to represent the hierarchy of a biological system. The HMLN provides unparalleled opportunities but imposes new computational challenges on establishing causal genotype-phenotype associations and understanding environmental impact on organisms. In this review, we focus on recent advances in developing novel computational methods for the inference of novel biological relations from the HMLN. We first discuss the properties of biological HMLNs. Second, we survey four categories of state-of-the-art methods (matrix factorization, random walk, knowledge graph, and deep learning). Third, we demonstrate their applications to omics data integration and analysis. Finally, we outline strategies for future directions in the development of new HMLN models.



“…Some researchers study how to extract properties from HINs and then feed them to a simple binary classifier [27]. For example, Cao et al [7] designed a framework to automatically extract meta-paths from schema-rich HINs. The work in [6,20,34] aim to predict multityped links in HINs, which is different from our method.…”

Section: Related Work (mentioning)

confidence: 99%

On relationship formation in heterogeneous information networks: An inferring method based on multilabel learning

Chen et al. 2019

Statistical Analysis


This paper studies how relationships form in heterogeneous information networks (HINs). The objective is not only to predict relationships in a given HIN more accurately but also to discover the interdependency between different types of relationships. A new relationship prediction method, MULRP, based on multilabel learning (MLL), is proposed. In MULRP, the types of relationship between two nodes are represented by the meta-paths between the nodes, and each type of relationship is given a label. Under the MLL framework, any potential relationship, including the target relationship, can be predicted. Moreover, the method can output reasonable dependency scores between relationships, thus providing more viable paths to facilitate the formation of new relationships. The proposed method is evaluated on two real datasets: the DBLP Computer Science Bibliography (DBLP) network and a Twitter network. The experimental results show that, by using heterogeneous information in a supervised MLL setting, MULRP achieves better performance than several baseline binary classification methods and a state-of-the-art relationship prediction method.

KEYWORDS: heterogeneous information networks, meta-path, multilabel learning, relationship prediction

INTRODUCTION

Many complex systems in the real world can be formalized as networks, where nodes represent objects and links represent interactions between objects [15]. Most of these networks are heterogeneous, containing various types of objects and relations. For example, in the online social network (OSN) Twitter, there are different types of nodes such as users, locations and tweets, and different types of links such as write/written, follow/followed, check-in/checked-in, etc. As a key subtask in link mining and social network analysis, link prediction aims to predict the future formation of links based on the current or historical network [14]. It has wide application in bibliographic networks, biological networks, OSNs, recommendation systems and so on. Link prediction can be regarded as a simple binary classification problem: for any two unconnected objects, predict whether the link exists (with a positive label) or not (with a negative label). The prediction methods can be based on structural properties of the network [22] or the attributes of nodes.

Many of the previous link prediction methods are designed for homogeneous information networks, where all nodes or links are of the same type. These networks are usually simplifications of real interacting systems that ignore their heterogeneity. For example, the co-authorship network only contains the author object and the co-author relationship. It is actually derived from a bibliographic network like DBLP,


“…It is pointed out that, the output of TPathMine not only contains the classification result but also the different weight for the selected meta-path, can be used in many data mining tasks [17,18].…”

Section: The Tpathmine Model (mentioning)

confidence: 99%

Mobile APP User Attribute Prediction by Heterogeneous Information Network Modeling

Zhang, Gong, Teng et al. 2019

Communications in Computer and Information Science


User-based attribute information, such as age and gender, is usually considered as user privacy information. It is difficult for enterprises to obtain user-based privacy attribute information. However, user-based privacy attribute information has a wide range of applications in personalized services, user behavior analysis and other aspects. Although many scholars have made achievements in user attribute prediction and other related fields, there are still two main problems that impede further improvement on the accuracy of classification: (1) Traditional machine learning classification merely takes each object as a single individual, ignoring the relationship between them; (2) At present, the popular Heterogeneous Path-Mine Information Network only considers whether the user has a relationship with the attributes of other nodes, rather than the degree of correlation of the attributes. It employs a linear regression model to fit the weight of meta-path, which is highly sensitive to outliers. To solve the above two problems, this paper advances the HetPathMine model and puts forward TPathMine model. With applying the number of clicks of attributes under each node to express the user's emotional preference information, optimizations of the solution of meta-path weight are also presented. Based on meta-path in heterogeneous information networks, the new model integrates all relationships among objects into isomorphic relationships of classified objects. Matrix is used to realize the knowledge dissemination of category knowledge among isomorphic objects. The experimental results show that: (1) the prediction of user attributes based on heterogeneous information networks can achieve higher accuracy than traditional machine learning classification methods; (2) TPath-Mine model based on the number of clicks is more accurate in classifying users of different age groups, and the weight of each meta-path is consistent with human intuition or the real world situation.

