The prevalence of type 2 diabetes mellitus (T2DM) is expected to increase rapidly in the next decades, posing a major challenge to societies worldwide. The emerging era of precision medicine calls for the discovery of biomarkers of clinical value for prediction of disease onset, where causal biomarkers can furthermore provide actionable targets. Blood-based factors like serum proteins are in contact with every organ in the body to mediate global homeostasis and may thus directly regulate complex processes such as aging and the development of common chronic diseases. We applied a data-driven proteomics approach measuring serum levels of 4,137 proteins in 5,438 Icelanders to discover novel biomarkers for incident T2DM and describe the serum protein profile of prevalent T2DM. We identified 536 proteins associated with incident or prevalent T2DM. Through LASSO penalized logistic regression analysis combined with bootstrap resampling, a panel of 20 protein biomarkers that accurately predicted incident T2DM was identified with a significant incremental improvement over traditional risk factors. Finally, a Mendelian randomization analysis provided support for a causal role of 48 proteins in the development of T2DM, which could be of particular interest as novel therapeutic targets.

... for age and sex, we identified 520 unique proteins that were significantly associated with prevalent T2DM after Bonferroni correction for multiple hypothesis testing (adjusted P < 0.05), with the strongest associations observed for ARFIP2, MXRA8 and CPM (Fig. 1a, Table S2). In a second model including adjustment for body mass index (BMI), 322 proteins remained statistically significant (Table S2). Many of the proteins were inter-correlated, with pairwise Pearson's r ranging from -0.60 to 0.97 (Fig. S2a).
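As an illustration of the multiple-testing step mentioned above, the following is a minimal sketch of a Bonferroni adjustment, in which each raw P value is multiplied by the number of tests and capped at 1. The function names, the protein labels and the P values are hypothetical and chosen for illustration only; they are not taken from the study, and the sketch does not reproduce the regression models used to obtain the raw P values.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(p * m, 1) for m tests."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag tests whose Bonferroni-adjusted p-value is below alpha."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]


# Hypothetical raw p-values for four proteins (illustrative, not study data).
raw_p = {
    "protein_A": 1e-6,
    "protein_B": 0.01,
    "protein_C": 0.02,
    "protein_D": 0.5,
}

names = list(raw_p)
values = [raw_p[name] for name in names]

# Map each protein to its adjusted p-value and its significance flag.
adjusted = dict(zip(names, bonferroni_adjust(values)))
flags = dict(zip(names, significant_after_bonferroni(values)))

for name in names:
    print(f"{name}: adjusted P = {adjusted[name]:.3g}, significant = {flags[name]}")
```

With thousands of proteins tested, the same rule applies with m equal to the number of proteins, which is why only strong associations survive the correction.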
A pathway and gene ontology (GO) enrichment analysis of all 520 proteins associated with prevalent T2DM revealed an enrichment of proteins involved in extracellular matrix (ECM)-receptor interaction, complement and coagulation cascades, metabolic processes and extracellular region (Fig. S3a, Table S3). We furthermore found the genes encoding the 520 prevalent T2DM-associated proteins to be enriched for high expression in liver, followed by other tissues that included kidney, gastrointestinal tract and pancreas (Fig. S4a). Thus, the diabetic state is reflected in a major shift in the serum proteome that is involved in metabolic, inflammatory and ECM processes.

Serum protein profile of incident T2DM

The serum protein profiles of T2DM patients observed in the cross-sectional analysis described above may represent shifts that occurred either before or after the onset of the disease. To identify serum protein signatures that preceded the onset of T2DM, we next focused our analysis on the 2,940 non-diabetic AGES participants who participated in a second study visit (AGESII) 5 years after the baseline visit, of whom 112 develop...