Improving temporal knowledge graph embedding using tensor factorization

He, Peng; Zhou, Gang; Zhang, Mengli; Wei, Jianghong; Chen, Jing

doi:10.1007/s10489-021-03149-w

Cited by 17 publications

(21 citation statements)

References 21 publications

“…The PubChem corpus is used to fine-tune a LLM. 17,24 (2) The Barlow Twins neural network provides a learned representation of molecules in the context of bioassays. 15 It first independently encodes both molecular and textual information and then passes them through a unified projector.…”

Section: Results and Discussion (mentioning)

confidence: 99%

Synergizing Chemical Structures and Bioassay Descriptions for Enhanced Molecular Property Prediction in Drug Discovery

Schuh, Boldini, Sieber

2024

J. Chem. Inf. Model.

The precise prediction of molecular properties can greatly accelerate the development of new drugs. However, in silico molecular property prediction approaches have been limited so far to assays for which large amounts of data are available. In this study, we develop a new computational approach leveraging both the textual description of the assay of interest and the chemical structure of target compounds. By combining these two sources of information via self-supervised learning, our tool can provide accurate predictions for assays where no measurements are available. Remarkably, our approach achieves state-of-the-art performance on the FS-Mol benchmark for zero-shot prediction, outperforming a wide variety of deep learning approaches. Additionally, we demonstrate how our tool can be used for tailoring screening libraries for the assay of interest, showing promising performance in a retrospective case study on a high-throughput screening campaign. By accelerating the early identification of active molecules in drug discovery and development, this method has the potential to streamline the identification of novel therapeutics.

“…From the intuition of RESCAL, the researcher proposed various models such as SimplE (Kazemi and Poole, 2018) extends DistMult by separate embedding for associated entity pair (h, t) of a triplet with two separate diagonal matrices, dig(M_r) and dig(M_r′), to express the complex relation type. TNTSimplE (He et al., 2023) extends the work of SimplE to target temporal KG and to capture symmetry, asymmetry and inverse relation type.…”

Section: Related Work (mentioning)

confidence: 99%
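
The citation statement above describes SimplE as extending DistMult with separate head and tail embeddings per entity and two diagonal relation matrices. A minimal sketch of that scoring idea follows, assuming the standard formulation from Kazemi and Poole (2018): the score averages a forward trilinear product and an inverse one. The function names, dictionary keys, and toy vectors are illustrative assumptions and do not come from the cited papers; the temporal extension in TNTSimplE is not modeled here.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def trilinear(a: Vector, b: Vector, c: Vector) -> float:
    """Sum of element-wise products <a, b, c>; b plays the role of a diagonal matrix."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


def distmult_score(h: Vector, r: Vector, t: Vector) -> float:
    """DistMult: one embedding per entity, so swapping h and t gives the same score."""
    return trilinear(h, r, t)


def simple_score(
    e_i: Dict[str, List[float]],
    rel: Dict[str, List[float]],
    e_j: Dict[str, List[float]],
) -> float:
    """SimplE-style score for the triple (e_i, rel, e_j).

    Each entity carries a "head" and a "tail" vector; each relation carries a
    forward ("fwd") and an inverse ("inv") diagonal. Key names are hypothetical.
    """
    forward = trilinear(e_i["head"], rel["fwd"], e_j["tail"])
    inverse = trilinear(e_j["head"], rel["inv"], e_i["tail"])
    return 0.5 * (forward + inverse)


# Toy embeddings (made up for illustration only).
a = {"head": [1.0, 2.0], "tail": [0.5, -1.0]}
b = {"head": [3.0, 0.0], "tail": [2.0, 1.0]}
r = {"fwd": [1.0, 1.0], "inv": [0.0, 1.0]}

score_ab = simple_score(a, r, b)  # score of (a, r, b)
score_ba = simple_score(b, r, a)  # score of (b, r, a); differs from score_ab
```

With these toy values the two directions receive different scores, which is how separate head/tail embeddings allow asymmetric relations, whereas `distmult_score` is symmetric in its entity arguments by construction.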

“…From the intuition of RESCAL, the researcher proposed various models such as SimplE (Kazemi and Poole, 2018) extends DistMult by separate embedding for associated entity pair (h, t) of a triplet with two separate diagonal matrices, dig(M_r) and dig(M_r′), to express the complex relation type. TNTSimplE (He et al, 2023) 1. The use of KG enables a machine to learn and automate inferring over the cybersecurity domain, such as attacks prediction (Sun et al, 2022), threat prediction (Zhao et al, 2022) and threats analysis (Li et al, 2023).…”

Section: Ijwis 193/4 (mentioning)

confidence: 99%

Infer the missing facts of D3FEND using knowledge graph representation learning

Khobragade, Ghumbre, Pachghare

2023

IJWIS

Purpose: MITRE and the National Security Agency cooperatively developed and maintained a D3FEND knowledge graph (KG). It provides concepts as an entity from the cybersecurity countermeasure domain, such as dynamic, emulated and file analysis. Those entities are linked by applying relationships such as analyze, may_contains and encrypt. A fundamental challenge for collaborative designers is to encode knowledge and efficiently interrelate the cyber-domain facts generated daily. However, the designers manually update the graph contents with new or missing facts to enrich the knowledge. This paper aims to propose an automated approach to predict the missing facts using the link prediction task, leveraging embedding as representation learning.

Design/methodology/approach: D3FEND is available in the resource description framework (RDF) format. In the preprocessing step, the facts in RDF format converted to subject–predicate–object triplet format contain 5,967 entities and 98 relationship types. Progressive distance-based, bilinear and convolutional embedding models are applied to learn the embeddings of entities and relations. This study presents a link prediction task to infer missing facts using learned embeddings.

Findings: Experimental results show that the translational model performs well on high-rank results, whereas the bilinear model is superior in capturing the latent semantics of complex relationship types. However, the convolutional model outperforms 44% of the true facts and achieves a 3% improvement in results compared to other models.

Research limitations/implications: Despite the success of embedding models to enrich D3FEND using link prediction under the supervised learning setup, it has some limitations, such as not capturing diversity and hierarchies of relations. The average node degree of D3FEND KG is 16.85, with 12% of entities having a node degree less than 2, especially there are many entities or relations with few or no observed links. This results in sparsity and data imbalance, which affect the model performance even after increasing the embedding vector size. Moreover, KG embedding models consider existing entities and relations and may not incorporate external or contextual information such as textual descriptions, temporal dynamics or domain knowledge, which can enhance the link prediction performance.

Practical implications: Link prediction in the D3FEND KG can benefit cybersecurity countermeasure strategies in several ways, such as it can help to identify gaps or weaknesses in the existing defensive methods and suggest possible ways to improve or augment them; it can help to compare and contrast different defensive methods and understand their trade-offs and synergies; it can help to discover novel or emerging defensive methods by inferring new relations from existing data or external sources; and it can help to generate recommendations or guidance for selecting or deploying appropriate defensive methods based on the characteristics and objectives of the system or network.

Originality/value: The representation learning approach helps to reduce incompleteness using a link prediction that infers possible missing facts by using the existing entities and relations of D3FEND.

“…• microsoft/deberta-v3-base (He et al 2022): Deep learning language model that uses the Transformer architecture and has been pre-trained on a large amount of text data. It focuses on understanding the syntactic and semantic structure of natural language and is designed for natural language processing tasks such as sentiment analysis, text classification, and text generation.…”

Section: Models Selection (mentioning)

confidence: 99%

I2C-Huelva at SemEval-2023 Task 10: Ensembling Transformers Models for the Detection of Online Sexism

Felicia Fudulu, Rodriguez Tenorio, Pachón Álvarez et al.

2023

Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This work details our approach for addressing Tasks A and B of the SemEval 2023 Task 10: Explainable Detection of Online Sexism (EDOS). For Task A, a simple ensemble based on a majority-vote system was presented. To build our proposal, first a review of transformers was carried out and the 3 best-performing models were selected to be part of the ensemble. Next, for these models, the best hyperparameters were searched using a reduced data set. Finally, we trained these models using more data. During the development phase, our ensemble system achieved an F1-score of 0.8403. For Task B, we developed a model based on the DeBERTa transformer, utilizing the hyperparameters identified for Task A. During the development phase, our proposed model attained an F1-score of 0.6467. Overall, our methodology demonstrates an effective approach to the tasks, leveraging advanced machine learning techniques and hyperparameter searches to achieve high performance in detecting and classifying instances of sexism in online text.
