Many large-scale knowledge bases simultaneously represent two views of knowledge graphs (KGs): an ontology view for abstract and commonsense concepts, and an instance view for specific entities that are instantiated from ontological concepts. Existing KG embedding models, however, focus on representing only one of the two views. In this paper, we propose a novel two-view KG embedding model, JOIE, with the goal of producing better knowledge embeddings and enabling new applications that rely on multi-view knowledge. JOIE employs both cross-view and intra-view modeling that learn on multiple facets of the knowledge base. The cross-view association model is learned to bridge the embeddings of ontological concepts and their corresponding instance-view entities. The intra-view models are trained to capture the structured knowledge of instance and ontology views in separate embedding spaces, with a hierarchy-aware encoding technique enabled for ontologies with hierarchies. We explore multiple representation techniques for the two model components and investigate nine variants of JOIE. Our model is trained on large-scale knowledge bases that consist of massive instances and their corresponding ontological concepts connected via a (small) set of cross-view links. Experimental results on public datasets show that the best variant of JOIE significantly outperforms previous models on the instance-view triple prediction task as well as on ontology population on the ontology-view KG. In addition, our model successfully extends the use of KG embeddings to entity typing with promising performance. CCS CONCEPTS • Computing methodologies → Knowledge representation and reasoning; Semantic networks; Ontology engineering.
As part of the PhysioNet / Computing in Cardiology Challenge 2016, this work focuses on automatic classification of normal / abnormal phonocardiogram (PCG) recordings, with the aim of quickly identifying subjects that need further expert diagnosis. To improve the robustness of the classifiers by increasing the number of training samples, the recordings were windowed into 5-second segments and our classifiers were trained to classify these segments. The overall classification of a recording was then generated using a voting scheme over the classification results of its segments. Our features include spectrograms and Mel-frequency cepstrum coefficients. Our best submission result during the official phase (evaluated on a random 20% of the hidden test set) has a score of 0.813, with 0.735 sensitivity and 0.892 specificity. Two more submissions are still being evaluated.
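The segment-then-vote pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which voting rule was used, so a simple majority vote is assumed here, with ties resolved toward "abnormal". The function names `window_segments` and `vote_recording_label`, the non-overlapping windows, and the dropped trailing remainder are all hypothetical choices.

```python
from collections import Counter


def window_segments(signal, sample_rate, window_seconds=5.0):
    """Split a 1-D signal into non-overlapping fixed-length windows.

    Assumption: windows do not overlap and a trailing remainder shorter
    than one window is dropped. The abstract does not specify either.
    """
    size = int(sample_rate * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def vote_recording_label(segment_labels, tie_label="abnormal"):
    """Combine per-segment labels into one recording-level label.

    Assumption: plain majority vote; a tie returns `tie_label`, on the
    reasoning that a screening tool should err toward referral.
    """
    if not segment_labels:
        raise ValueError("no segment labels to vote on")
    counts = Counter(segment_labels)
    normal, abnormal = counts.get("normal", 0), counts.get("abnormal", 0)
    if normal == abnormal:
        return tie_label
    return "normal" if normal > abnormal else "abnormal"


# Usage: a 12-sample toy signal at 1 Hz gives two 5-sample segments.
segments = window_segments(list(range(12)), sample_rate=1)
labels = ["normal", "abnormal", "abnormal"]  # stand-in classifier outputs
recording_label = vote_recording_label(labels)
```

In practice each segment would pass through a trained classifier; the hard-coded `labels` list only stands in for those outputs.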
Complementary product recommendation (CPR), which aims to suggest products that are often bought together to serve a joint demand, forms a pivotal component of e-commerce services; however, existing methods are far from optimal. Given one product, how to recommend its complementary products of different types is the key problem we tackle in this work. We first conduct an analysis to correct the inaccurate assumptions adopted by existing work, showing that co-purchased products are not always complementary, and further propose a new strategy to generate clean distant supervision labels for CPR modeling. Moreover, to address a gap in existing work, namely that CPR needs not only relevance modeling but also diversity to fulfill the whole purchase demand, we develop a deep learning framework, P-Companion, to explicitly model both relevance and diversity. More specifically, given one product with its product type, P-Companion first uses an encoder-decoder network to predict multiple complementary product types, and then a transfer metric learning network is developed to project the embedding of the query product to each predicted complementary product type subspace and further learn the complementary relationship based on the distant supervision labels. The whole framework can be trained end-to-end and is robust to cold-start products thanks to a novel pretrained product embedding module named Product2vec, based on graph attention networks. Extensive offline experiments show that P-Companion outperforms state-of-the-art baselines with a 7.1% increase in the Hit@10 score and well-controlled diversity. Production-wise, we deploy P-Companion to provide online recommendations for over 200M products at Amazon and observe significant gains in product sales and profit. CCS CONCEPTS • Information systems → Recommender systems; Online advertising; Online shopping; • Computing methodologies → Knowledge representation and reasoning.
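For readers unfamiliar with the Hit@10 metric cited above, the sketch below shows one common way a Hit@k score is computed. It assumes the usual definition (a query counts as a hit if at least one relevant item appears among the top k recommendations); the abstract does not state the exact variant the authors used, and the function names and product identifiers are hypothetical.

```python
def hit_at_k(ranked_items, relevant_items, k=10):
    """Return 1.0 if any relevant item appears in the top-k ranking, else 0.0."""
    return 1.0 if any(item in relevant_items for item in ranked_items[:k]) else 0.0


def mean_hit_at_k(queries, k=10):
    """Average Hit@k over (ranked_items, relevant_items) pairs."""
    if not queries:
        raise ValueError("no queries to evaluate")
    return sum(hit_at_k(ranked, relevant, k) for ranked, relevant in queries) / len(queries)


# Usage with made-up product IDs: one query hits, one misses.
queries = [
    (["case", "charger", "stylus"], {"charger"}),
    (["case", "charger", "stylus"], {"tripod"}),
]
score = mean_hit_at_k(queries, k=2)
```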
Motivation: Circular RNA (circRNA) is a novel class of long non-coding RNAs that have been broadly discovered in the eukaryotic transcriptome. The circular structure arises from a non-canonical splicing process, where the donor site is backspliced to an upstream acceptor site. These circRNA sequences are conserved across species. More importantly, rising evidence suggests their vital roles in gene regulation and association with diseases. As a fundamental effort toward elucidating their functions and mechanisms, several computational methods have been proposed to predict the circular structure from the primary sequence. Recently, advanced computational methods have leveraged deep learning to capture the relevant patterns from RNA sequences and model their interactions to facilitate the prediction. However, these methods fail to fully explore positional information of splice junctions and their deep interaction. Results: We present a robust end-to-end framework, Junction Encoder with Deep Interaction (JEDI), for circRNA prediction using only nucleotide sequences. JEDI first leverages the attention mechanism to encode each junction site based on deep bidirectional recurrent neural networks and then presents a novel cross-attention layer to model deep interaction among these sites for backsplicing. Finally, JEDI can not only predict circRNAs but also interpret relationships among splice sites to discover backsplicing hotspots within a gene region. Experiments demonstrate that JEDI significantly outperforms state-of-the-art approaches in circRNA prediction at both the isoform level and the gene level. Moreover, JEDI also shows promising results on zero-shot backsplicing discovery, which none of the existing approaches can achieve. Availability and implementation: The implementation of our framework is available at https://github.com/hallogameboy/JEDI. Supplementary information: Supplementary data are available at Bioinformatics online.