The isolated type of orofacial cleft, termed non-syndromic cleft lip with or without cleft palate (NSCL/P), is the second most common birth defect in China, with Asian populations having the highest incidence worldwide. NSCL/P involves multiple genes and complex interactions between genetic and environmental factors, which complicates genetic risk assessment of fetuses carrying multiple NSCL/P susceptibility variants. Although genome-wide association studies (GWAS) have uncovered dozens of single nucleotide polymorphism (SNP) loci in different ethnic populations, the diagnostic effectiveness of these SNPs requires further experimental validation in Chinese populations before a diagnostic panel or a predictive model covering multiple SNPs can be built. In this study, we collected blood samples from control and NSCL/P infants in Han and Uyghur Chinese populations to validate the diagnostic effectiveness of 43 candidate SNPs previously detected using GWAS. We then built predictive models with the validated SNPs using different machine learning algorithms and evaluated their prediction performance. Our results showed that logistic regression had the best performance for risk assessment according to the area under the curve (AUC). Notably, defective variants in MTHFR and RBP4, two genes involved in folic acid and vitamin A metabolism, were found to contribute strongly to NSCL/P incidence based on feature importance evaluation with logistic regression. This is consistent with the notion that folic acid and vitamin A are both essential nutritional supplements for pregnant women to reduce the risk of conceiving an NSCL/P baby. Moreover, we observed a lower predictive power in Uyghur than in Han cases, likely due to differences in genetic background between these two ethnic populations. Thus, our study highlights the urgent need to generate a HapMap for the Uyghur population and to perform resequencing-based screening for Uyghur-specific NSCL/P markers.
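The modelling step described in the abstract above (logistic regression on SNP genotypes, scored by area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes genotypes are coded as risk-allele counts (0, 1, 2), it fits the model with plain batch gradient descent instead of a statistics library, and the function names (`fit_logistic`, `predict`, `auc`) and the toy genotype table are hypothetical, invented only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, clamped to avoid overflow in math.exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(X, w, b):
    """Predicted case probability for each genotype vector."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(scores, labels):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up toy data: rows are individuals, columns are two hypothetical SNPs
    # coded as risk-allele counts; labels are 1 = case, 0 = control.
    genotypes = [[0, 1], [0, 0], [1, 2], [0, 2], [2, 0], [1, 1], [2, 1], [2, 2]]
    status = [0, 0, 0, 0, 1, 0, 1, 1]
    weights, intercept = fit_logistic(genotypes, status)
    print("weights:", weights)
    print("training AUC:", auc(predict(genotypes, weights, intercept), status))
```

In this sketch the fitted weights play the role of a crude feature-importance measure: a larger positive weight means a higher estimated contribution of that SNP to case status. A real analysis would evaluate AUC on held-out samples rather than on the training data.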
The pan-genome is the full set of genes found across a given bacterial species, comprising the core genome and the dispensable genome. In this study, the genomes of 183 Streptococcus mutans (S. mutans) isolates were analyzed from the pan-genome perspective. This analysis revealed that S. mutans has an "open" pan-genome, implying that many new genes remain to be found as more genomes are sequenced. Additionally, S. mutans has a limited core genome, composed of genes related to vital activities within the bacterium, such as metabolism and hereditary information storage or processing, which account for 35.6% and 26.6% of the core genes, respectively. We estimate the theoretical core-genome size to be about 1083 genes, which is smaller than that of other Streptococcus species. In addition, core genes are under stronger selection pressure than genes that are less widely distributed. Not surprisingly, the distribution of putative virulence genes in S. mutans strains does not correlate with caries status, indicating that other factors are also responsible for cariogenesis. These results contribute to a better understanding of the evolutionary characteristics and dynamic changes within the genome components of the species. They also help to form a new theoretical foundation for preventing dental caries. Furthermore, this study sets an example for analyzing large genomic datasets of pathogens from the pan-genome perspective.
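The core/dispensable split described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method used in the study: it assumes each genome has already been reduced to a set of gene-family identifiers, and the strain names, gene labels, and function names (`pan_and_core`, `accumulation_curve`) are hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan-genome, core genome) for a dict of strain -> set of gene families."""
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)            # genes present in at least one strain
    core = set.intersection(*gene_sets)      # genes present in every strain
    return pan, core


def accumulation_curve(genomes, order):
    """Track pan- and core-genome sizes as strains are added in the given order."""
    pan, core = set(), None
    curve = []
    for name in order:
        genes = genomes[name]
        pan |= genes
        core = set(genes) if core is None else core & genes
        curve.append((len(pan), len(core)))
    return curve


if __name__ == "__main__":
    # Made-up toy input: three hypothetical strains and their gene families.
    toy = {
        "strain_A": {"g1", "g2", "g3"},
        "strain_B": {"g1", "g2", "g4"},
        "strain_C": {"g1", "g5"},
    }
    pan, core = pan_and_core(toy)
    print("pan:", sorted(pan))
    print("core:", sorted(core))
    print("dispensable:", sorted(pan - core))
    print("curve:", accumulation_curve(toy, ["strain_A", "strain_B", "strain_C"]))
```

A pan-genome size that keeps growing as strains are added, while the core size levels off, is the pattern usually described as an "open" pan-genome.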
Background: Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is the most common craniofacial birth defect. Its etiology is complex and it has a lifelong influence on affected individuals. Despite many studies, the pathogenic alleles have not been fully identified. Here, we recruited a Chinese NSCL/P family and explored the candidate causative variants in this pedigree. Methods: We performed whole‐exome sequencing on two patients and two unaffected subjects from this family. Variants were screened by bioinformatics analysis to identify the potential etiological alleles. Species conservation analysis, mutation function prediction, and homology protein modeling were also performed to preliminarily evaluate the influence of the mutations. Results: We identified three rare mutations located on a single chromatid (c.2684C > T_p.Ala895Val, c.4350G > T_p.Gln1450His, and c.4622C > A_p.Ser1541Tyr) in GLI2 as candidate causative variants. All three mutations were predicted to be deleterious, and they affect amino acids that are conserved in many species. The mutation c.2684C > T was predicted to affect the structure of the GLI2 protein. Conclusion: Our results further demonstrate that GLI2 variants play a role in the pathogenesis of NSCL/P, and the three rare missense mutations combined are the probable disease‐causing variants in this family.
Several studies have documented the diversity and potential pathogenic associations of organisms in the human oral cavity. Although much progress has been made in understanding the complex bacterial community inhabiting the human oral cavity, our understanding of some microorganisms is less resolved for a variety of reasons. One such little-understood group is the candidate phyla radiation (CPR), a recently identified but highly abundant group of ultrasmall bacteria with reduced genomes and unusual ribosomes. Here, we present a computational protocol for the detection of CPR organisms from metagenomic data. Our approach relies on a self-constructed dataset comprising published CPR genomic sequences as a filter to identify CPR sequences in metagenomic sequencing data. After assembly and functional prediction, the taxonomic affiliation of CPR contigs can be identified through phylogenetic analysis with publicly available 16S rRNA gene and ribosomal protein sequences, in addition to sequence similarity analyses (e.g., average nucleotide identity calculations and contig mapping). Using this protocol, we reconstructed two draft genomes of organisms within the TM7 superphylum, with genome sizes of 0.594 Mb and 0.678 Mb. Among the predicted functional genes of the constructed genomes, a high percentage were related to signal transduction, cell motility, and cell envelope biogenesis, which could contribute to cellular morphological changes in response to environmental cues.

Importance

The candidate phyla radiation (CPR) is a recently identified but highly diverse and abundant group of ultrasmall bacteria exhibiting reduced genomes and limited metabolic capacities. A number of studies have reported their potential pathogenic associations in multiple mucosal diseases including periodontitis, halitosis, and inflammatory bowel disease. However, CPR organisms are difficult to cultivate and are difficult to detect with PCR-based methods due to divergent genetic sequences. Thus, our understanding of CPR has lagged behind that of other bacterial components. Here, we used metagenomic approaches to overcome these previous barriers to CPR identification, and established a computational protocol for detection of CPR organisms from metagenomic samples. The protocol described herein holds great promise for better understanding the potential biological functioning of CPR. Moreover, the pipeline could be applied to other organisms that are difficult to cultivate.

Keywords: candidate phyla radiation (CPR), metagenomics, bioinformatics, computational protocol

Introduction

The human oral cavity is one of the five primary microbial microecosystems within humans, and has been used as a model system for microbiome research (1). Dysbiosis of the oral microbial community has been observed in relation to some common systemic diseases including rheumatoid arthritis (2) and type 2 diab...