Although multiple studies have addressed the effects of codon usage on gene expression, such studies were typically performed in unspliced model genes. In the human genome, most genes undergo splicing and patterns of codon usage are splicing-dependent: guanine and cytosine (GC) content is highest within single-exon genes and within first exons of multi-exon genes. Intrigued by this observation, we measured the effects of splicing on expression in a panel of synonymous variants of GFP and mKate2 reporter genes that varied in nucleotide composition. We found that splicing promotes the expression of adenine and thymine (AT)-rich variants by increasing their steady-state protein and mRNA levels, in part through promoting cytoplasmic localization of mRNA. Splicing had little or no effect on the expression of GC-rich variants. In the absence of splicing, high GC content at the 5' end, but not at the 3' end of the coding sequence positively correlated with expression. Among endogenous human protein-coding transcripts, GC content has a more positive effect on various expression measures of unspliced, relative to spliced mRNAs. We propose that splicing promotes the expression of AT-rich genes, leading to selective pressure for the retention of introns in the human genome.

Kudla et al., 2006; Zolotukhin et al., 1996). As a result, increasing the GC content of transgenes has become a common strategy in coding sequence optimization for heterologous expression in human cells (Fath et al., 2011). On the other hand, genome-wide analyses of endogenous genes typically show little or no correlation of GC content with expression (Duan et al., 2013; Lercher et al., 2003; Rudolph et al., 2016; Semon et al., 2005).

We hypothesized that the conflicting results in heterologous and endogenous gene expression studies can be partially explained by RNA splicing.
Most transgenes used in heterologous expression systems have no introns, whereas 97% of genes in the human genome contain one or more introns. Splicing is known to influence gene expression at multiple stages, including nuclear RNP assembly, RNA export, and translation. If splicing selectively increased the expression of AT-rich genes, it could account for the lack of correlation of GC content and gene expression in previous genome-wide studies. We therefore compared spliced and unspliced genes with respect to their (1) genomic codon usage, (2) expression levels of reporter genes in transient and stable transfection experiments and (3) global expression patterns in human transcriptome studies. We show that splicing increases the expression of AT-rich genes, but not GC-rich genes, in part through effects on cytoplasmic RNA enrichment.

Results

Codon usage of human protein-coding genes depends on RNA splicing

We first analysed the relationship between the nucleotide composition of human genes and splicing. GC4 content (GC content at 4-fold degenerate site...