Genome-wide association studies have revealed numerous risk loci associated with diverse diseases. However, identification of disease-causing variants within association loci remains a major challenge. Divergence in gene expression due to cis-regulatory variants in noncoding regions is central to disease susceptibility. We show that integrative computational analysis of phylogenetic conservation with a complexity assessment of co-occurring transcription factor binding sites (TFBS) can identify cis-regulatory variants and elucidate their mechanistic role in disease. Analysis of established type 2 diabetes risk loci revealed a striking clustering of distinct homeobox TFBS. We identified the PRRX1 homeobox factor as a repressor of PPARG2 expression in adipose cells and demonstrated its adverse effect on lipid metabolism and systemic insulin sensitivity, dependent on the rs4684847 risk allele that triggers PRRX1 binding. Thus, cross-species conservation analysis at the level of co-occurring TFBS provides a valuable contribution to the translation of genetic association signals to disease-related molecular mechanisms.
Genome-wide association studies have identified numerous disease risk loci. Delineating the molecular mechanisms influenced by cis-regulatory variants is essential to understanding gene regulation and, ultimately, disease pathophysiology. Combining bioinformatics and public-domain chromatin information with quantitative proteomics supports the prediction of cis-regulatory variants and enabled identification of allele-dependent binding of both transcription factors and coregulators at the type 2 diabetes-associated PPARG locus. We found that Yin Yang 1 (YY1) binds the rs7647481A nonrisk allele, as confirmed by allele-specific chromatin immunoprecipitation in primary adipocytes. Quantitative proteomics also identified the coregulator RING1 and YY1 binding protein (RYBP), whose mRNA levels correlate with improved insulin sensitivity in primary adipose cells carrying the rs7647481A nonrisk allele. Our findings support a concept in which diverse cis-regulatory variants at one locus contribute to disease pathophysiology. Proteome-wide identification of both transcription factors and coregulators can profoundly improve understanding of the mechanisms underlying genetic associations.