Schizophrenia genome-wide association studies highlight the substantial contribution of risk attributed to the non-coding genome, where human endogenous retroviruses (HERVs) are encoded.
These ancient viral elements have previously been overlooked in genetic and transcriptomic studies due to their poor annotation and repetitive nature. Using a new, comprehensive HERV annotation, we found that the fraction of the genome where HERVs are located (the 'retrogenome') is enriched for schizophrenia risk variants, and that there are 148 disparate HERVs involved in susceptibility.
Analysis of RNA-sequencing data from the dorsolateral prefrontal cortex of 259 schizophrenia cases and 279 controls from the CommonMind Consortium showed that HERVs are actively expressed in the brain (n = 3,979), regulated in cis by common genetic variants (n = 1,759), and differentially expressed in patients (n = 81). Convergent analyses implicate LTR25_6q21 and ERVLE_8q24.3h as HERVs of etiological relevance to schizophrenia, which are co-regulated with genes involved in neuronal and mitochondrial function, respectively. Our findings provide a strong rationale for exploring the retrogenome and the expression of these locus-specific HERVs as novel risk factors for schizophrenia and potential diagnostic biomarkers and treatment targets.
These viruses multiplied through a copy-and-paste mechanism and were eventually endogenized (i.e., vertically transmitted), and now constitute approximately 8% of the genome [1,2]. These repetitive sequences were generally assumed to be transcriptionally inactive in the modern genome, having a purely regulatory function due to the retention of the viral promoters (long-terminal repeats, LTRs).
However, certain HERVs were co-opted to serve novel specialized roles, including in the regulation of embryonic development [3,4] and neural progenitor cells [5,6]. They have also been implicated in neuropsychiatric conditions such as amyotrophic lateral sclerosis [7-9], major depressive disorder, bipolar disorder, and schizophrenia [10-12]. Despite their abundance in the genome and relevance to disease and fundamental aspects of human biology, the location and function of most HERVs remain elusive to date.
Recent large genome-wide association studies (GWAS) comparing schizophrenia cases and non-affected individuals enabled the identification of polymorphisms mediating risk for this disorder [13-15].
These studies highlight the substantial contribution of risk attributed to the non-coding genome,
where HERVs are encoded; these elements are often overlooked in both genomic and transcriptomic studies [16]. Until recently, there was no comprehensively annotated map of HERVs in the genome, and there were no computationally efficient tools to analyze the expression of these repetitive sequences with single-locus resolution. Consequently, previous studies were unable to test whether locus-specific HERVs were genetically associated with traits of int...