Indonesia is the world's fourth most populous country, host to striking levels of human diversity, regional patterns of admixture, and varying degrees of introgression from both Neanderthals and Denisovans. However, it has been largely excluded from the human genomics sequencing boom of the last decade. To serve as a benchmark dataset of molecular phenotypes across the region, we generated genome-wide CpG methylation and gene expression measurements in over 100 individuals from three locations that capture the major genomic and geographical axes of diversity across the Indonesian archipelago.
Investigating between- and within-island differences, we find up to 10% of tested genes are differentially expressed between the islands of Mentawai (Sumatra) and New Guinea. Variation in gene expression is closely associated with DNA methylation, with expression levels of 9.7% of genes strongly correlating with nearby CpG methylation, and many of these genes being differentially expressed between islands.
Genes identified in our differential expression and methylation analyses are enriched in pathways involved in immunity, highlighting Indonesia's tropical role as a source of infectious disease diversity and the strong selective pressures these diseases have exerted on humans. Finally, we identify robust within-island variation in DNA methylation and gene expression, likely driven by very local environmental differences across sampling sites. Together, these results strongly suggest complex relationships between DNA methylation, transcription, archaic hominin introgression and immunity, all jointly shaped by the environment. This has implications for the application of genomic medicine, both in critically understudied Indonesia and globally, and will allow a better understanding of the interacting roles of genomic and environmental factors shaping molecular and complex phenotypes.
Modern human genomics does not equitably represent the full breadth of humanity. While genome sequences for people of European descent now number a million or more, most of the world is deeply understudied [1]. This is particularly true of Indonesia [2], a country geographically as large as continental Europe and the world's fourth largest by population. Genomic diversity in Indonesia is strikingly different to other well-characterized East Asian populations, such as Han Chinese and Japanese, but this diversity is not captured in large global datasets like the 1000 Genomes Project [3] or the Simons Genome Diversity Project [4]. The first Indonesian genome sequences were only reported in 2016 [5], with the first representative survey of diversity across the archipelago only appearing in 2019 [6]. This extreme lack of representation extends to molecular phenotypes. To our knowledge, only one genome-wide gene expression study has been published from the region [7], focused exclusively on host-pathogen interactions with P. falciparum. There are...