Background: Molecular multi-omics data provide an in-depth view of biological systems, and their integration is crucial to gain insights into complex regulatory processes. These data can be used to explain disease-related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor, and the use of biological prior information has been proposed to facilitate network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information.
Results: We devised a strategy to integrate QTL with human population-scale multi-omics data and comprehensively curated prior information from large-scale biological databases. State-of-the-art network inference methods applied to these data and priors were used to recover the regulatory networks underlying trans-QTL hotspots. We benchmarked inference methods and showed that Bayesian strategies using biologically informed priors outperform methods without prior data in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass, for which we generated novel functional hypotheses.

Conclusion: We demonstrate that existing biological knowledge can be leveraged for the integrative analysis of networks underlying trans associations to deduce novel hypotheses on cell regulatory mechanisms.

machine learning, personalized medicine

2 Background

Genome-wide association studies (GWAS) have been tremendously successful in discovering disease-associated genetic loci. However, establishing causality or obtaining functional explanations for GWAS SNPs is still challenging. In recent years, the focus has shifted from discovery of disease loci to mechanism and explanation, and large efforts have been put into unravelling the functional consequences of GWAS SNPs [1, 2]. These efforts have been made possible through technological advances in measuring genome-wide molecular data in large population cohorts, which have further led to a steady increase in biological resources providing simultaneous measurements of different molecular layers (often termed multi-omics data).
To elucidate disease mechanisms, systems genetics approaches seek to link GWAS SNPs to intermediate molecular traits by identifying quantitative trait loci (QTL) [3, 4], for example for gene expression levels (eQTL) [5][6][7] or DNA methylation at CpG dinucleotides (meQTL) [8][9][10].

Genetic variants that are QTL for quantitative molecular phenotypes that reside on a different chromosome are called t...