Introduction: Genome‐wide association studies (GWAS) have identified loci associated with risk for non‐syndromic cleft lip with or without cleft palate (NSCL/P). At every locus, multiple single nucleotide polymorphisms (SNPs) are associated with disease, and it is a challenge to distinguish functional SNPs from those that are merely in linkage disequilibrium with functional SNPs.
Objective: At many of the associated loci, the presumed risk‐relevant gene is expressed in oral epithelium. We hypothesize that at such loci functional SNPs have allele‐specific effects on the activity of oral epithelium enhancers. Our objective is to identify the candidate functional SNPs for NSCL/P at each of the loci identified by GWAS.
Materials and Methods: To test SNPs for their effects on enhancer activity, we carry out massively parallel reporter assays (MPRAs) on 890 SNPs from 8 loci in the GMSM‐K human fetal oral epithelium cell line. To test the SNP‐target gene association, we use the Activity‐by‐Contact method, which incorporates open chromatin, H3K27Ac, and Hi‐C data. To confirm MPRA results on top‐scoring SNPs, we perform luciferase reporter assays. We use CRISPR‐mediated homology‐directed repair to engineer GMSM‐K cells to be homozygous for the risk or non‐risk allele of promising SNPs and assess the expression level of the target gene in engineered cells of each genotype. SNPs that pass these assays are tested in allele‐specific reporter assays in zebrafish and mouse embryos.
Results: Using the described methods, we tested 608 NSCL/P‐associated SNPs in the IRF6 locus and 11 such SNPs in the FOXE1 locus. Two SNPs in the first group and one in the second had consistent allele‐specific effects in the MPRA and in luciferase assays. Importantly, for one SNP in the IRF6 locus, we engineered cells to be homozygous for the non‐risk‐associated or the risk‐associated allele and found that expression of IRF6 is lower in the latter. In vivo enhancer reporter assays in mouse and zebrafish embryos are ongoing.
Conclusion: Out of more than 600 NSCL/P‐risk‐associated SNPs in the IRF6 locus, we have evidence that a particular one directly contributes to pathogenesis of this disorder.
Significance: This study illustrates a method for screening the multiple non‐coding SNPs identified by GWAS to find candidates that are functional. Moreover, identifying the functional SNP near IRF6 is a step towards illuminating how a common variant contributes to risk for this disorder.
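The Activity‐by‐Contact step named in the Materials and Methods can be illustrated with a minimal sketch. It assumes the commonly published form of the score, in which an element's activity is the geometric mean of its open‐chromatin and H3K27Ac signals, and its score for a gene is activity times Hi‐C contact, normalized over all candidate elements near that gene. The abstract does not state the exact formulation or software used, so the function names, element names, and input values below are hypothetical.

```python
from math import sqrt


def activity(accessibility: float, h3k27ac: float) -> float:
    """Enhancer activity as the geometric mean of two signals (assumed form)."""
    return sqrt(accessibility * h3k27ac)


def abc_scores(elements: list[dict]) -> dict[str, float]:
    """Score each candidate element for a single target gene.

    Each element is a dict with hypothetical keys:
    'name', 'accessibility', 'h3k27ac', and 'contact' (Hi-C contact
    frequency between the element and the gene's promoter).
    """
    # Numerator for each element: activity multiplied by contact.
    weighted = {
        e["name"]: activity(e["accessibility"], e["h3k27ac"]) * e["contact"]
        for e in elements
    }
    total = sum(weighted.values())
    if total == 0:
        return {name: 0.0 for name in weighted}
    # Normalize so the scores of all elements for this gene sum to 1.
    return {name: w / total for name, w in weighted.items()}


# Toy input values, not data from the study.
toy_elements = [
    {"name": "enhancer_A", "accessibility": 4.0, "h3k27ac": 9.0, "contact": 0.5},
    {"name": "enhancer_B", "accessibility": 1.0, "h3k27ac": 1.0, "contact": 1.0},
]
scores = abc_scores(toy_elements)
print(scores)
```

In a sketch like this, a SNP would inherit the score of the element that contains it, and a threshold on that score would decide whether the element is called as regulating the gene; the threshold is a tuning choice that the abstract does not specify.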
Early embryonic epidermis and oral epithelium in mammals stratify to generate an outer protective layer, called the periderm, which prevents water loss and interepithelial adhesions among limbs and oral structures. Orofacial cleft is a structural birth defect that can result from mutations in transcription factors regulating oral periderm differentiation. In zebrafish, the enveloping layer (EVL), also called periderm, is a simple squamous epithelial monolayer that arises from blastomeres. EVL differentiation shares many regulatory transcription factors (TFs) with mammalian periderm differentiation, including Irf6 and Grhl3. Therefore, we use zebrafish EVL as a tractable model for the difficult‐to‐access mammalian oral periderm. We previously carried out ATAC‐seq in isolated periderm and flow‐through cells and found that binding sites (TFBS) for Grhl3, Klf17, Tfap2a, Cebpb, and Gata3 were enriched in periderm‐specific ATAC‐seq peaks, implicating these TFs in the transcriptional regulatory network (TRN) driving periderm differentiation. In this study, we applied a computational method to infer the structure of the zebrafish periderm TRN and, for use in tuning the parameters of the algorithm, generated a reduced‐representation network model based on biologically‐validated edges. We applied a network‐modeling algorithm called “modified least absolute shrinkage and selection operator with stability approach to regularization selection” (mLASSO‐StARS) to publicly‐available zebrafish single‐cell RNA‐seq datasets and to “prior” evidence of TF‐gene interactions based on the presence of TFBS in enhancers associated with their proximal genes. To generate a reduced‐representation network composed of biologically‐validated edges, we performed RNA‐seq on zebrafish wild‐type and irf6 maternal mutant embryos at 6 hours post fertilization (hpf).
Next, in an addback paradigm, we injected irf6 maternal mutant embryos at the one‐cell stage with mRNA encoding the transcription factors Irf6, Grhl3, Klf17, Tfap2a, Cebpb, and Gata3 (separately), and at 6 hpf harvested RNA and profiled expression of 94 genes using NanoString probes. Partially overlapping sets of genes were rescued by each TF. To identify the direct targets of Grhl3, Irf6, and Tfap2a, we performed ChIP‐seq and CUT&RUN‐seq. Results from the addback paradigm and CUT&RUN assays were compiled to generate a gold standard network for periderm differentiation. Finally, we used the gold standard network to test the consequence of varying input parameters on the performance of the computationally‐derived TRN. Satisfyingly, adding the prior information to the co‐expression network improved the TRN model significantly. The resulting TRN provides a better understanding of periderm differentiation in zebrafish and will identify key hub genes in the network, which in turn may help prioritize candidates identified in genomic analyses of orofacial cleft patients.
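One simple way to picture the final benchmarking step is to score a set of inferred TF‐gene edges against the gold standard edges using precision and recall. The abstract does not say which performance metric was used, so the sketch below is an assumption; the function name and the target gene labels are hypothetical, and the edge lists are toy values rather than results from the study.

```python
def evaluate_edges(predicted, gold):
    """Compare predicted (TF, target) edges with gold standard edges.

    Returns precision, recall, and F1. This metric choice is an
    illustrative assumption, not the one reported in the abstract.
    """
    predicted = set(predicted)
    gold = set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy edges: TF names come from the abstract, target names are placeholders.
predicted_edges = [
    ("irf6", "gene_a"),
    ("grhl3", "gene_b"),
    ("klf17", "gene_c"),
    ("tfap2a", "gene_d"),
]
gold_edges = [
    ("irf6", "gene_a"),
    ("grhl3", "gene_b"),
    ("gata3", "gene_e"),
]
metrics = evaluate_edges(predicted_edges, gold_edges)
print(metrics)
```

Running the same comparison for networks built with different input parameters, for example with and without the TFBS prior, would give one number per setting to compare. Because a gold standard built from 94 profiled genes is necessarily incomplete, precision computed this way is best read as a relative measure across settings.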