Background
Profiling of mRNA expression is an important method for identifying biomarkers, but it is complicated by the limited correlation between mRNA expression and protein abundance. We hypothesised that these correlations could be improved by mathematical models based on measurements of splice variants and on the time delay in protein translation.
Methods
We characterised time series of primary human naïve CD4+ T cells during early T-helper type 1 differentiation using RNA-sequencing and mass-spectrometry proteomics. We then performed computational time-series analysis in this system and in two other key human and murine immune cell types. Linear mathematical mixed time-delayed splice-variant models were used to predict protein abundances, and the models were validated using out-of-sample predictions. Lastly, we re-analysed RNA-Seq datasets to evaluate biomarker discovery in five T-cell-associated diseases, validating the findings for multiple sclerosis (MS) and asthma.
Results
The new models achieved median mRNA-to-protein abundance correlations of 0.79-0.94, significantly outperforming models that did not include multiple splice variants and time delays, as shown in cross-validation tests. Our mathematical models yielded more differentially expressed proteins between patients and controls in all five diseases. Moreover, analysis of these proteins in asthma and MS supported their relevance. One marker, sCD27, was clinically validated in MS in two independent cohorts, for both treatment response and prognosis.
Conclusion
Our splice-variant and time-delay models substantially improved the prediction of protein abundance from mRNA data in three immune cell types. The models provided valuable biomarker candidates, which were validated in clinical studies of MS and asthma. We propose that our strategy is generally applicable for biomarker discovery.

MG initiated and supervised the study. RM and OR performed the bioinformatics analyses, and RM performed the modelling. These analyses were led by MG, CA, JT, and DGC. OR performed the experimental work on T-cell differentiation, which was supervised by CEN, MCJ, JE and MB. MJK and CHN performed the proteomics analysis, which was supervised by MSK. FP and JM recruited patients and collected clinical material, and SH performed and analysed the biomarker validation assays, which were led by IK, MCJ, and JE. All authors contributed to and approved the final draft for publication.