Graph machine learning has recently gained great attention in both academia and industry. Most graph machine learning models, such as Graph Neural Networks (GNNs), are trained over massive graph data. However, in many real-world scenarios, such as hospitalization prediction in healthcare systems, the graph data is held by multiple data owners and cannot be directly accessed by any other party due to privacy concerns and regulatory restrictions. Federated Graph Machine Learning (FGML) is a promising solution to this challenge: it trains graph machine learning models in a federated manner. In this survey, we conduct a comprehensive review of the literature in FGML. Specifically, we first provide a new taxonomy that divides the existing problems in FGML into two settings, namely, federated learning (FL) with structured data and structured FL. Then, we review the mainstream techniques in each setting and elaborate on how they address the challenges under FGML. In addition, we summarize the real-world applications of FGML from different domains and introduce open graph datasets and platforms adopted in FGML. Finally, we present several limitations of the existing studies along with promising research directions in this field.
Graph convolutional networks (GCNs) have been widely adopted for graph representation learning and have achieved impressive performance. For larger graphs stored separately on different clients, distributed GCN training algorithms were proposed to improve efficiency and scalability. However, existing methods directly exchange node features between clients, which leaks private data. Federated learning has been incorporated into graph learning to protect data privacy, but the resulting methods suffer from a severe performance drop due to non-iid data distribution. Moreover, these approaches generally incur heavy communication and memory overhead during training. In light of these problems, we propose a Privacy-Preserving Subgraph sampling based distributed GCN training method (PPSGCN), which preserves data privacy and significantly reduces communication and memory overhead. Specifically, PPSGCN employs a star-topology client-server system. We first sample a local node subset in each client to form a global subgraph, which greatly reduces communication and memory costs. We then conduct local computation on each client with the features or gradients of the sampled nodes. Finally, all clients securely communicate with the central server using homomorphic encryption to combine local results while preserving data privacy. In contrast to federated graph learning methods, our PPSGCN model is trained on a global graph, which avoids the negative impact of local data distribution. We prove that the PPSGCN algorithm converges to a local optimum with probability 1. Experimental results on three prevalent benchmarks demonstrate that our algorithm significantly reduces communication and memory overhead while maintaining desirable performance. Further studies demonstrate the fast convergence of PPSGCN and examine the trade-off between communication and local computation cost.
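The three steps described for PPSGCN (local node sampling, local computation on the sampled nodes, and secure combination at a central server) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names and the data layout are hypothetical, the local computation is reduced to an unnormalized neighbor-feature sum with integer features, and the homomorphic encryption step is swapped for a simpler stand-in, pairwise additive masking, in which the masks cancel when the server sums the uploads.

```python
import random


def sample_local_nodes(local_nodes, k, rng):
    """Uniformly sample up to k node ids from one client's local node set."""
    pool = sorted(local_nodes)
    return sorted(rng.sample(pool, min(k, len(pool))))


def local_partial_sums(subgraph_nodes, owned_nodes, adj, feats):
    """For each node u in the global subgraph, sum the features of u's
    sampled neighbors that this client owns. Returns a flat list."""
    dim = len(next(iter(feats.values())))
    out = []
    for u in subgraph_nodes:
        row = [0] * dim
        for v in subgraph_nodes:
            if v in owned_nodes and v in adj[u]:
                for d in range(dim):
                    row[d] += feats[v][d]
        out.extend(row)
    return out


def pairwise_masks(num_clients, length, rng, bound=10**6):
    """Stand-in for homomorphic encryption (an assumption of this sketch):
    every pair of clients shares a random mask that one adds and the other
    subtracts, so all masks cancel in the server-side sum."""
    masks = [[0] * length for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.randrange(-bound, bound) for _ in range(length)]
            for t in range(length):
                masks[i][t] += shared[t]
                masks[j][t] -= shared[t]
    return masks


def run_round(clients, adj, k, seed=0):
    """One aggregation round: sample, compute locally, mask, sum at server."""
    rng = random.Random(seed)
    # Step 1: each client samples local nodes; together they form the subgraph.
    subgraph = sorted(
        v for c in clients for v in sample_local_nodes(c["feats"], k, rng)
    )
    # Step 2: each client computes partial sums from its own features only.
    partials = [
        local_partial_sums(subgraph, set(c["feats"]), adj, c["feats"])
        for c in clients
    ]
    # Step 3: clients upload masked partials; the server only sees the sum.
    masks = pairwise_masks(len(clients), len(partials[0]), rng)
    uploads = [
        [p + m for p, m in zip(part, mask)]
        for part, mask in zip(partials, masks)
    ]
    aggregated = [sum(col) for col in zip(*uploads)]
    return subgraph, aggregated
```

In this sketch each client passes a dictionary with a `feats` mapping from node id to an integer feature list, and `adj` maps each node id to its neighbor set. The aggregated vector equals the plaintext neighbor sum over the sampled subgraph, while no individual upload reveals a client's partial result. A real system would add normalization, model weights, gradient exchange, and an actual homomorphic encryption scheme.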
The exploration of unconventional oil and gas, especially tight oil, is closely related to the evolution of tight reservoirs and the accumulation process. To investigate the densification and accumulation process of the Fuyu tight oil reservoir in the Sanzhao depression, Songliao Basin, a new understanding of reservoir petrological characteristics, diagenesis, and diagenetic sequence is combined with temperature measurement and spectral energy measurement of a large number of inclusions and with single-well burial history analysis, and the results are then compared with current reservoir conditions. The results show that diagenesis is dominated by compaction and cementation. The restoration of paleoporosity shows that the porosity reduction rate reached 67%, and that the densification process started in the early Nenjiang Formation and was finalized at the end of the Nenjiang Formation. The accumulation of the Fuyu oil layer generally has the characteristics of two stages and multiple episodes, and the main accumulation period is the end of the Mingshui Formation. The end of the Nenjiang Formation, when the main body of the reservoir was densified, is just a prelude to the massive expulsion of hydrocarbons in the Songliao Basin, so the Fuyu oil layer was first compacted and then charged. This analysis indicates that the accumulation of oil and gas in the Fuyu oil layer, Sanzhao depression, depends largely on the fault-dominated transport system. In addition, tight oil accumulation is believed to be characterized by short-distance oil enrichment around faults, and the development area of fracture deserts near the fault sand body should be the key area for further exploration.