Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. Here we developed a deep mutational scanning (DMS) framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.
The clinical phenotype of zoonotic tuberculosis, its prevalence, and its contribution to the global burden of disease are poorly understood and probably underestimated. This is partly because currently available laboratory and in silico tools have not been calibrated to accurately identify all subspecies of the Mycobacterium tuberculosis complex (Mtbc). Here we present the first such tool, SNPs to Identify TB ('SNP-IT'). Applying SNP-IT to a collection of clinical genomes from a UK reference laboratory, we demonstrate an unexpectedly high number of M. orygis isolates. These are seen at a similar rate to M. bovis, which attracts substantial health protection resources, yet M. orygis cases have not previously been described in the UK. From an international perspective, it is possible that M. orygis is an underestimated zoonosis. As whole genome sequencing is increasingly integrated into the clinical setting, accurate subspecies identification with SNP-IT will allow the clinical phenotype, host range and transmission mechanisms of subspecies of the Mtbc to be studied in greater detail.
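The general idea of assigning a subspecies from SNPs can be illustrated with a minimal sketch. It is not SNP-IT's actual implementation: the marker positions and alleles in `MARKERS`, the `classify` function, and the `min_fraction` threshold are all hypothetical placeholders chosen for illustration, not real Mtbc lineage-defining SNPs.

```python
from typing import Dict, Optional, Tuple

# HYPOTHETICAL marker panels (genome position -> defining allele).
# These values are placeholders, not real subspecies-defining SNPs.
MARKERS: Dict[str, Dict[int, str]] = {
    "M. bovis": {1001: "A", 2002: "T", 3003: "G"},
    "M. orygis": {1001: "C", 4004: "G", 5005: "T"},
}


def classify(
    sample_calls: Dict[int, str],
    markers: Dict[str, Dict[int, str]] = MARKERS,
    min_fraction: float = 0.8,  # assumed cut-off, not taken from the paper
) -> Tuple[Optional[str], float]:
    """Return the best-matching subspecies and the fraction of its markers matched.

    Returns (None, best_fraction) when no panel reaches min_fraction.
    """
    best_name: Optional[str] = None
    best_frac = 0.0
    for name, panel in markers.items():
        # Count marker positions where the sample carries the defining allele.
        matched = sum(
            1 for pos, allele in panel.items() if sample_calls.get(pos) == allele
        )
        frac = matched / len(panel)
        if frac > best_frac:
            best_name, best_frac = name, frac
    if best_frac < min_fraction:
        return None, best_frac
    return best_name, best_frac
```

Reporting the matched fraction alongside the label makes it easy to see how close an ambiguous sample came to the threshold, rather than hiding that behind a bare "unclassified" result.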
Integrative analysis that combines genome-wide association data with expression quantitative trait analysis and network representation may illuminate causal relationships between genes and diseases. To identify causal lipid genes, we used genotype, gene expression, protein-protein interaction network, and phenotype data from 5,257 Framingham Heart Study participants and performed Mendelian randomization to investigate possible mechanistic explanations for observed associations. We selected three putatively causal candidate genes (ABCA6, ALDH2, and SIDT2) for lipid traits (LDL cholesterol, HDL cholesterol and triglycerides) in humans and conducted mouse knockout studies for each gene to confirm its causal effect on the corresponding lipid trait. We performed RNA-seq on mouse livers to explore transcriptome-wide alterations after knocking out the target genes. Our work builds upon a lipid-related gene network and expands it by including protein-protein interactions. These resources, along with the innovative combination of emerging analytical techniques, provide a groundwork upon which future studies can be designed to more fully understand genetic contributions to cardiovascular diseases.
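As a rough illustration of the Mendelian randomization step, the sketch below computes two standard summary-statistic estimators: the single-instrument Wald ratio and the fixed-effect inverse-variance weighted (IVW) estimate. The abstract does not state which estimator the study used, so this is an assumption. The `Instrument` class and the function names are hypothetical, and the standard errors use a first-order approximation that ignores uncertainty in the SNP-exposure effect.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instrument:
    beta_exposure: float  # SNP effect on the exposure (e.g. gene expression)
    beta_outcome: float   # SNP effect on the outcome (e.g. a lipid trait)
    se_outcome: float     # standard error of beta_outcome


def wald_ratio(inst: Instrument) -> Tuple[float, float]:
    """Causal estimate from one instrument: outcome effect / exposure effect."""
    estimate = inst.beta_outcome / inst.beta_exposure
    # First-order standard error; treats beta_exposure as known without error.
    se = inst.se_outcome / abs(inst.beta_exposure)
    return estimate, se


def ivw(instruments: List[Instrument]) -> Tuple[float, float]:
    """Fixed-effect inverse-variance weighted combination of Wald ratios."""
    # Weight of each ratio is 1 / se_ratio^2 = (beta_exposure / se_outcome)^2.
    weights = [(i.beta_exposure / i.se_outcome) ** 2 for i in instruments]
    total = sum(weights)
    estimate = sum(
        w * (i.beta_outcome / i.beta_exposure)
        for w, i in zip(weights, instruments)
    ) / total
    se = math.sqrt(1.0 / total)
    return estimate, se
```

Both functions assume the instruments are valid (associated with the exposure, independent of confounders, and affecting the outcome only through the exposure) and mutually uncorrelated; they do not test those assumptions.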