The extent to which genetic interactions affect observed phenotypes is generally unknown because current interaction detection approaches only consider simple interactions between top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond only the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and the interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs. Additionally, we design a permutation procedure tailored for neural networks to assess the significance of interactions. The approach outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions that replicated on an independent FINRISK dataset.
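As an illustration of what a Shapley score between two gene representations measures, the following minimal sketch computes the classical Shapley interaction index for a pair of inputs of a toy function. It is not the framework described above: `toy_model`, the all-zero baseline, and the three "gene" inputs are hypothetical stand-ins for a trained network and its hidden nodes, and inputs outside a coalition are simply set to their baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_interaction(f, x, baseline, i, j):
    """Shapley interaction index of inputs i and j of f at point x.

    Inputs outside a coalition take their baseline value (an assumption
    of this sketch; other choices of value function are possible).
    """
    d = len(x)
    others = [k for k in range(d) if k not in (i, j)]

    def value(coalition):
        z = [x[k] if k in coalition else baseline[k] for k in range(d)]
        return f(z)

    total = 0.0
    for size in range(len(others) + 1):
        # Weight of every coalition S of this size: |S|! (d-|S|-2)! / (d-1)!
        weight = factorial(size) * factorial(d - size - 2) / factorial(d - 1)
        for subset in combinations(others, size):
            s = set(subset)
            # Discrete second difference: joint effect minus individual effects.
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total


def toy_model(g):
    # Hypothetical phenotype model: genes 0 and 1 interact, gene 2 is additive.
    return g[0] + 2.0 * g[1] + 3.0 * g[0] * g[1] + g[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi_01 = shapley_interaction(toy_model, x, baseline, 0, 1)  # interacting pair
phi_02 = shapley_interaction(toy_model, x, baseline, 0, 2)  # additive pair
print(phi_01, phi_02)
```

In this toy setting the score for the pair (0, 1) recovers the coefficient of the product term, while the purely additive pair (0, 2) scores zero. The enumeration is exponential in the number of inputs, so it is only practical for a small set of gene-level nodes.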
We do not know the extent to which genetic interactions affect the observed phenotype in diseases, because current interaction detection approaches are limited: they consider only interactions between the top SNPs of each gene, and only simple interactions. We introduce methods for increasing the statistical power of interaction detection by taking into account all SNPs and complex interactions between them, beyond only the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a gene interaction neural network (NN), and the interactions are quantified by the Shapley score between hidden nodes, which are gene representations that optimally combine information from all SNPs in the gene. Additionally, we design a new permutation procedure tailored for NNs to assess the significance of interactions. The new approach outperforms existing alternatives on simulated datasets, and in cholesterol studies on the UK Biobank it detected six interactions that replicated on an independent FINRISK dataset, four of them novel findings.
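The idea of assessing an interaction by permutation can be sketched with a generic residual-permutation test. This is not the NN-tailored procedure proposed above, whose details are not given here: the sketch assumes a linear main-effects model as the null, uses the absolute coefficient of a product term as a stand-in for the interaction score, and runs on simulated data. All function and variable names are hypothetical.

```python
import numpy as np


def interaction_stat(y, a, b):
    """Absolute least-squares coefficient of the product term a*b."""
    X = np.column_stack([np.ones_like(a), a, b, a * b])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return abs(coef[3])


def permutation_pvalue(y, a, b, n_perm=199, seed=0):
    """Permutation p-value for the interaction between a and b.

    The null keeps the fitted main effects and permutes the residuals,
    so any interaction signal is broken while main effects are preserved.
    """
    rng = np.random.default_rng(seed)
    X_main = np.column_stack([np.ones_like(a), a, b])
    coef, *_ = np.linalg.lstsq(X_main, y, rcond=None)
    fitted = X_main @ coef
    residuals = y - fitted

    observed = interaction_stat(y, a, b)
    exceed = 0
    for _ in range(n_perm):
        y_null = fitted + rng.permutation(residuals)
        if interaction_stat(y_null, a, b) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Simulated example: two gene-level features with a true interaction.
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = rng.normal(size=500)
y = 0.5 * a + 0.5 * b + 1.0 * a * b + rng.normal(size=500)

p_value = permutation_pvalue(y, a, b)
print(p_value)
```

Permuting residuals rather than the raw phenotype is a deliberate choice in this sketch: it leaves the main effects in place, so the null distribution reflects the absence of interaction only.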
Abstract. Satellite-based aerosol retrievals provide global spatially distributed estimates of atmospheric aerosol parameters that are commonly needed in applications such as estimation of atmospherically corrected satellite data products, climate modelling and air quality monitoring. However, a common feature of the conventional satellite aerosol retrievals is that they have relatively low spatial resolution and poor accuracy caused by uncertainty in auxiliary model parameters, such as fixed aerosol model parameters, and the approximate forward radiative transfer models utilized to keep the computational complexity feasible. As a result, the improvement and reprocessing of the operational satellite data retrieval algorithms would become a tedious and computationally excessive problem. To overcome these problems, we have developed a machine-learning-based post-process correction approach to correct the existing operational satellite aerosol data products. Our approach combines the existing satellite retrieval data and a post-processing step where a machine learning algorithm is utilized to predict the approximation error in the conventional retrieval. By approximation error, we mean the discrepancy between the true aerosol parameters and the ones retrieved using the satellite data. Our hypothesis is that the prediction of the approximation error with a finite training dataset is a less complex and easier task than the direct, fully learned machine-learning-based prediction in which the aerosol parameters are directly predicted given the satellite observations and measurement geometry. Our approach does not require reprocessing of the satellite retrieval products; it requires only a computationally fast machine-learning-based post-processing step of the existing retrieval product. Our approach uses neural networks trained on collocated satellite data and accurate ground-based Aerosol Robotic Network (AERONET) aerosol data. 
Based on our post-processing approach, we propose a post-process-corrected high-resolution Sentinel-3 Synergy aerosol product, which gives a spectral estimate of the aerosol optical depth at five different wavelengths with a high spatial resolution equivalent to the native resolution of the Sentinel-3 Level-1 data (300 m at nadir). With aerosol data from Sentinel-3A and 3B satellites, we demonstrate that our approach produces high-resolution aerosol data with clearly better accuracy than the operational Sentinel-3 Level-2 Synergy aerosol product, and it also results in slightly better accuracy than the conventional fully learned machine learning approach. We also demonstrate better generalization capabilities of the post-process correction approach over the fully learned approach.
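The post-process correction idea, learning the retrieval's approximation error and adding the prediction back to the retrieval, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, a linear least-squares model stands in for the neural network, and the names `true_aod`, `retrieved_aod`, and `geometry` are hypothetical placeholders for ground-based reference values, the operational retrieval, and a measurement-geometry feature.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins: reference AOD, one geometry feature, and a
# retrieval with a systematic bias plus noise (all assumed for this sketch).
true_aod = rng.uniform(0.05, 1.0, size=n)
geometry = rng.uniform(0.0, 1.0, size=n)
retrieved_aod = 0.8 * true_aod + 0.1 * geometry + rng.normal(0.0, 0.02, size=n)

train, test = slice(0, 1500), slice(1500, n)


def design(retrieved, geom):
    """Model inputs: intercept, the existing retrieval, and geometry."""
    return np.column_stack([np.ones_like(retrieved), retrieved, geom])


# Training target is the approximation error of the retrieval,
# not the aerosol parameter itself.
error = true_aod - retrieved_aod
coef, *_ = np.linalg.lstsq(
    design(retrieved_aod[train], geometry[train]), error[train], rcond=None
)

# Post-processing step: predict the error and add it to the retrieval.
predicted_error = design(retrieved_aod[test], geometry[test]) @ coef
corrected_aod = retrieved_aod[test] + predicted_error


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_before = rmse(retrieved_aod[test], true_aod[test])
rmse_after = rmse(corrected_aod, true_aod[test])
print(rmse_before, rmse_after)
```

Because the model only has to explain the residual of an existing retrieval, the existing product is reused as an input and never recomputed, which mirrors the motivation given in the abstract.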
Abstract. Satellite-based aerosol retrievals provide global spatially distributed estimates of atmospheric aerosol parameters that are commonly needed in applications such as estimation of atmospherically corrected satellite data products, climate modeling and air quality monitoring. However, a common feature of the conventional satellite aerosol retrievals is that they have relatively low spatial resolution and poor accuracy caused by uncertainty in auxiliary model parameters, such as fixed aerosol model parameters, and the approximate forward radiative transfer models utilized to keep the computational complexity feasible. As a result, the improvement and re-processing of the operational satellite data retrieval algorithms would become a tedious and computationally excessive problem. To overcome these problems, we have developed a machine-learning-based post-process correction approach to correct the existing operational satellite aerosol data products. Our approach combines the existing satellite retrieval data and a post-processing step where a machine learning algorithm is utilized to predict the approximation error in the conventional retrieval. By approximation error, we mean the discrepancy between the true aerosol parameters and the ones retrieved using the satellite data. Our hypothesis is that the prediction of the approximation error with a finite training data set is a less complex and easier task than the direct, fully learned machine-learning-based prediction in which the aerosol parameters are directly predicted given the satellite observations and measurement geometry. With our approach, there is no need to re-run the existing retrieval algorithms, and only a computationally feasible post-processing step is needed. Our approach uses neural networks trained on collocated satellite data and accurate ground-based AERONET aerosol data. 
Based on our post-processing approach, we propose a post-process-corrected high-resolution Sentinel-3 Synergy aerosol product, which gives a spectral estimate of the aerosol optical depth at five different wavelengths with a high spatial resolution equivalent to the native resolution of the Sentinel-3 level-1 data (300 meters at nadir). With aerosol data from Sentinel-3A and 3B satellites, we demonstrate that our approach produces high-resolution aerosol data with better accuracy than either the operational Sentinel-3 level-2 Synergy aerosol product or a conventional fully learned machine learning approach.