Research on knowledge representation learning method of diet knowledge graph

Liu, Zhicong; Su, Long; Xu, Zifeng; Wang, Longjuan

doi:10.1109/aie57029.2022.00128

Cited by 5 publications

(9 citation statements)

References 13 publications


“…Table 1 shows the statistics of these datasets. Furthermore, we compare our Mix-Key method with a vanilla neural network, three topology-based augmentation methods (PermE [37], MaskN [38] and NodeSam [29]), two mixup-based augmentation methods (MixupGraph [27] and Graph Transplant [30]) and four graph contrastive learning methods (AutoGCL [39], GraphMVP [40], MolCLR [9] and KANO [41]).…”

Section: Methods (mentioning)

confidence: 99%

Mix-Key: graph mixup with key structures for molecular property prediction

Jiang, Wang, et al. 2024

Briefings in Bioinformatics


Molecular property prediction faces the challenge of limited labeled data as it necessitates a series of specialized experiments to annotate target molecules. Data augmentation techniques can effectively address the issue of data scarcity. In recent years, Mixup has achieved significant success in traditional domains such as image processing. However, its application in molecular property prediction is relatively limited due to the irregular, non-Euclidean nature of graphs and the fact that minor variations in molecular structures can lead to alterations in their properties. To address these challenges, we propose a novel data augmentation method called Mix-Key tailored for molecular property prediction. Mix-Key aims to capture crucial features of molecular graphs, focusing separately on the molecular scaffolds and functional groups. By generating isomers that are relatively invariant to the scaffolds or functional groups, we effectively preserve the core information of molecules. Additionally, to capture interactive information between the scaffolds and functional groups while ensuring correlation between the original and augmented graphs, we introduce molecular fingerprint similarity and node similarity. Through these steps, Mix-Key determines the mixup ratio between the original graph and two isomers, thus generating more informative augmented molecular graphs. We extensively validate our approach on molecular datasets of different scales with several Graph Neural Network architectures. The results demonstrate that Mix-Key consistently outperforms other data augmentation methods in enhancing molecular property prediction on several datasets.
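For context, the vanilla Mixup that the abstract refers to can be sketched as below. This is not Mix-Key: it is the generic feature-vector form, in which two samples and their labels are blended with a ratio drawn from a Beta distribution. The function name `mixup`, the `alpha` default, and the list-based inputs are assumptions made for illustration; Mix-Key instead derives its ratio from fingerprint and node similarity on graphs.

```python
import random


def mixup(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Blend two (features, label) pairs with one shared mixing ratio.

    Hypothetical helper for illustration; inputs are equal-length lists.
    """
    # Vanilla Mixup samples the ratio from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # The same ratio is applied to the features and to the labels.
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam


# Usage: mix two one-hot labelled samples.
x, y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0],
                  rng=random.Random(0))
```

Graphs have no fixed node ordering or size, which is why this element-wise blend does not carry over directly to molecular graphs.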


“…Lack of canonical node features Traditional network biology studies solely rely on biological network structures to gain insights [7,6,49]. Meanwhile, the presence of rich node features in many existing graph benchmarks is crucial for the success of GNNs [55], posing a challenge for GNNs in learning without meaningful node features. An exciting and promising future direction for obtaining meaningful node features is by leveraging the sequential or structural information of the gene product (e.g., protein) using large-scale biological pre-trained language models like ESM-2 [53].…”

Section: Challenges For the Obnb Benchmarks And Potential Future Dire... (mentioning)

confidence: 99%

“…OneHotLogDeg (short for LogDeg) first computes the log degree of each node in the graph and then uniformly bins the nodes into one of 32 bins based on their log degree. The one-hot encoded node degrees approach has recently been shown to be a great structure encoder, whose utilization can sometimes result in performance superior to using the original node features associated with the graph [17,55]. Meanwhile, the design choice of using log-uniform grids stems from the scale-free nature of biological networks [2].…”

Section: A21 Node Feature Design (mentioning)

confidence: 99%
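The OneHotLogDeg encoding described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the toolkit's implementation: it assumes "uniformly bins" means equal-width bins over the observed range of log degree, that every node has degree at least 1, and that degrees arrive as a plain list. The function name `one_hot_log_degree` is hypothetical.

```python
import math


def one_hot_log_degree(degrees, num_bins=32):
    """One-hot encode each node by the bin its log degree falls into.

    Assumption: all degrees are >= 1 so the logarithm is defined.
    """
    log_deg = [math.log(d) for d in degrees]
    lo, hi = min(log_deg), max(log_deg)
    # Equal-width bins on the log scale; fall back to width 1.0 when
    # every node has the same degree (zero range).
    width = (hi - lo) / num_bins or 1.0
    features = []
    for value in log_deg:
        # Clamp so the maximum log degree lands in the last bin.
        idx = min(int((value - lo) / width), num_bins - 1)
        row = [0] * num_bins
        row[idx] = 1
        features.append(row)
    return features


# Usage: four nodes whose degrees span three orders of magnitude.
features = one_hot_log_degree([1, 10, 100, 1000])
```

Binning on the log scale spreads the many low-degree nodes of a scale-free network across several bins instead of crowding them into the first one.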

Open Biomedical Network Benchmark A Python Toolkit for Benchmarking Datasets with Biomedical Networks

Liu, Krishnan 2023

Preprint


Recent breakthroughs in graph representation learning (GRL) methods have vastly advanced many real-world applications where graph structures naturally arise, such as drug discoveries, molecular property predictions, and social recommendation systems. The rapid developments of GRL with applications in specific scientific domains are largely enabled by graph learning libraries such as PyTorch Geometric, which provide infrastructures that allow domain scientists to share domain-specific benchmarking datasets, and GRL researchers to adapt and design specialized GRL methods for these particular tasks. Meanwhile, over the past two decades, network biology has demonstrated superior value in harvesting biological insights from network data, such as identifying genes' associated functions, diseases, and traits. Nevertheless, the GRL community faces a significant barrier in working with biological networks because of the tedious and specialized (pre-)processing steps required to set up machine learning (ML)-ready benchmarking datasets. Here, we present nleval, a Python package containing reusable modules that enable researchers to effortlessly set up PyG-compatible ML-ready datasets using data downloaded from public databases. We expect nleval to help network biologists set up custom benchmarking datasets for answering specific biological questions of their interests and help GRL researchers adapt these datasets for designing new specialized GRL architectures.


“…The activity and property of a drug are closely related to the structure of the drug molecule. Nevertheless, most current self-supervised models do not use 3D information or use it partially (Liu et al 2022a, Stärk et al 2022). We introduce a novel 3D–3D view contrastive learning method to learn molecular structural-semantic.…”

Section: Introduction (mentioning)

confidence: 99%

3D graph contrastive learning for molecular property prediction

Moon, Kwon 2023

Bioinformatics


Motivation: Self-supervised learning (SSL) is a method that learns the data representation by utilizing supervision inherent in the data. This learning method is in the spotlight in the drug field, lacking annotated data due to time-consuming and expensive experiments. SSL using enormous unlabeled data has shown excellent performance for molecular property prediction, but a few issues exist. (1) Existing SSL models are large-scale; there is a limitation to implementing SSL where the computing resource is insufficient. (2) In most cases, they do not utilize 3D structural information for molecular representation learning. The activity of a drug is closely related to the structure of the drug molecule. Nevertheless, most current models do not use 3D information or use it partially. (3) Previous models that apply contrastive learning to molecules use the augmentation of permuting atoms and bonds. Therefore, molecules having different characteristics can be in the same positive samples. We propose a novel contrastive learning framework, small-scale 3D Graph Contrastive Learning (3DGCL) for molecular property prediction, to solve the above problems. Results: 3DGCL learns the molecular representation by reflecting the molecule’s structure through the pre-training process that does not change the semantics of the drug. Using only 1,128 samples for pre-train data and 0.5 million model parameters, we achieved state-of-the-art or comparable performance in six benchmark datasets. Extensive experiments demonstrate that 3D structural information based on chemical knowledge is essential to molecular representation learning for property prediction. Availability: Data and codes are available in https://github.com/moonkisung/3DGCL.
