Relational clustering has attracted more and more attention due to its phenomenal impact in various important applications which involve multi-type interrelated data objects, such as Web mining , search marketing, bioinformatics, citation analysis, and epidemiology. In this paper, we propose a probabilistic model for relational clustering, which also provides a principal framework to unify various important clustering tasks including traditional attributes-based clustering, semi-supervised clustering, co-clustering and graph clustering. The proposed model seeks to identify cluster structures for each type of data objects and interaction patterns between different types of objects. Under this model, we propose parametric hard and soft relational clustering algorithms under a large number of exponential family distributions. The algorithms are applicable to relational data of various structures and at the same time unifies a number of stat-of-the-art clustering algorithms: co-clustering algorithms, the k-partite graph clustering, and semi-supervised clustering based on hidden Markov random fields.
Spectral clustering is a powerful tool for unsupervised data analysis. In this paper, we propose a context-aware hypergraph similarity measure (CAHSM), which leads to robust spectral clustering in the case of noisy data. We construct three types of hypergraph---the pairwise hypergraph, the k-nearest-neighbor (kNN) hypergraph, and the high-order over-clustering hypergraph. The pairwise hypergraph captures the pairwise similarity of data points; the kNN hypergraph captures the neighborhood of each point; and the clustering hypergraph encodes high-order contexts within the dataset. By combining the affinity information from these three hypergraphs, the CAHSM algorithm is able to explore the intrinsic topological information of the dataset. Therefore, data clustering using CAHSM tends to be more robust. Considering the intra-cluster compactness and the inter-cluster separability of vertices, we further design a discriminative hypergraph partitioning criterion (DHPC). Using both CAHSM and DHPC, a robust spectral clustering algorithm is developed. Theoretical analysis and experimental evaluation demonstrate the effectiveness and robustness of the proposed algorithm.Comment: 10 pages. Appearing in IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING: http://doi.ieeecomputersociety.org/10.1109/TKDE.2013.12
Nonnegative Matrix Factorization (NMF) has been applied with great success to many applications. As NMF is applied to massive datasets such as web-scale dyadic data, it is desirable to leverage a cluster of machines to speed up the factorization. However, it is challenging to efficiently implement NMF in a distributed environment. In this paper, we show that by leveraging a new form of update functions, we can perform local aggregation and fully explore parallelism. Moreover, under the new form of update functions, we can perform frequent updates, which aim to use the most recently updated data whenever possible. As a result, frequent updates are more efficient than their traditional concurrent counterparts. Through a series of experiments on a local cluster as well as the Amazon EC2 cloud, we demonstrate that our implementation with frequent updates is up to two orders of magnitude faster than the existing implementation with the traditional form of update functions.
Drug-drug interactions are one of the main concerns in drug discovery. Accurate prediction of drug-drug interactions plays a key role in increasing the efficiency of drug research and safety when multiple drugs are c o-prescribed. With various data sources that describe the relationships and properties between drugs, the comprehensive approach that integrates multiple data sources would be considerably effective in making high-accuracy prediction. In this paper, we propose a Deep Attention Neural Network based Drug-Drug Interaction prediction framework, abbreviated as DANN-DDI, to predict unobserved drug-drug interactions. First, we construct multiple drug feature networks and learn drug representations from these networks using the graph embedding method; then, we concatenate the learned drug embeddings and design an attention neural network to learn representations of drug-drug pairs; finally, we adopt a deep neural network to accurately predict drug-drug interactions. The experimental results demonstrate that our model DANN-DDI has improved prediction performance compared with state-of-the-art methods. Moreover, the proposed model can predict novel drug-drug interactions and drug-drug interaction-associated events.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.