Bacterial genomics has revolutionized our understanding of the microbial tree of life; however, mapping and visualizing the distribution of functional traits across bacteria remains a challenge. Here, we introduce AnnoTree—an interactive, functionally annotated bacterial tree of life that integrates taxonomic, phylogenetic and functional annotation data from over 27 000 bacterial and 1500 archaeal genomes. AnnoTree enables visualization of millions of precomputed genome annotations across the bacterial and archaeal phylogenies, thereby allowing users to explore gene distributions as well as patterns of gene gain and loss in prokaryotes. Using AnnoTree, we examined the phylogenomic distributions of 28 311 gene/protein families, and measured their phylogenetic conservation, patchiness, and lineage-specificity within bacteria. Our analyses revealed widespread phylogenetic patchiness among bacterial gene families, reflecting the dynamic evolution of prokaryotic genomes. Genes involved in phage infection/defense, mobile elements, and antibiotic resistance dominated the list of most patchy traits, as well as numerous intriguing metabolic enzymes that appear to have undergone frequent horizontal transfer. We anticipate that AnnoTree will be a valuable resource for exploring prokaryotic gene histories, and will act as a catalyst for biological and evolutionary hypothesis generation. AnnoTree is freely available at http://annotree.uwaterloo.ca
In December 2019, SARS-CoV-2 emerged causing the COVID-19 pandemic. SARS-CoV, the agent responsible for the 2003 SARS outbreak, utilizes ACE2 and TMPRSS2 host molecules for viral entry. ACE2 and TMPRSS2 have recently been implicated in SARS-CoV-2 viral infection. Additional host molecules including ADAM17, cathepsin L, CD147, and GRP78 may also function as receptors for SARS-CoV-2. To determine the expression and in situ localization of candidate SARS-CoV-2 receptors in the respiratory mucosa, we analyzed gene expression datasets from airway epithelial cells of 515 healthy subjects, gene promoter activity analysis using the FANTOM5 dataset containing 120 distinct sample types, single cell RNA sequencing (scRNAseq) of 10 healthy subjects, immunoblots on multiple airway epithelial cell types, and immunohistochemistry on 98 human lung samples. We demonstrate absent to low ACE2 promoter activity in a variety of lung epithelial cell samples and low ACE2 gene expression in both microarray and scRNAseq datasets of epithelial cell populations. Consistent with gene expression, rare ACE2 protein expression was observed in the airway epithelium and alveoli of human lung. We present confirmatory evidence for the presence of TMPRSS2, CD147, and GRP78 protein in vitro in airway epithelial cells and confirm broad in situ protein expression of CD147 in the respiratory mucosa. Collectively, our data suggest the presence of a mechanism dynamically regulating ACE2 expression in human lung, perhaps in periods of SARS-CoV-2 infection, and also suggest that alternate receptors for SARS-CoV-2 exist to facilitate initial host cell infection. In 2003, the severe acute respiratory syndrome (SARS) outbreak caused by the SARS coronavirus (CoV) resulted in 8096 probable cases with 774 confirmed deaths [1, 2]. In patients with SARS, deaths were attributed to acute respiratory distress associated with diffuse bilateral pneumonia and alveolar damage [3].
In December 2019, SARS-CoV-2 emerged causing the COVID-19 pandemic. SARS-CoV-2 is spreading at a much more rapid rate than SARS-CoV [4][5][6]. Similar clinical findings of diffuse bilateral pneumonia and alveolar damage have been reported [7][8][9]. Severe cases of SARS-CoV-2 infection have been associated with infections of the lower respiratory tract, with detection of the virus throughout this tissue as well as the upper respiratory tract [7][8][9]. The biological mechanisms that may govern differences in the number of SARS and COVID-19 cases remain undefined. It is possible that SARS-CoV-2 possesses distinct molecular mechanisms that impact virulence through viral proteins, greater susceptibility of host cells to infection, permissivity of host cells to virus replication, or some combination of these and other potentially unknown factors [10][11][12][13]. Understanding SARS and SARS-CoV-2 virus similarities and differences at the molecular level in the host may provide insights into transmission, pathogenesis, and interventions. The seminal report identifying the receptor for SARS-CoV used a HEK29...
In December 2019, SARS-CoV-2 emerged causing the COVID-19 pandemic. SARS-CoV, the agent responsible for the 2003 SARS outbreak, utilises ACE2 and TMPRSS2 host molecules for viral entry. ACE2 and TMPRSS2 have recently been implicated in SARS-CoV-2 viral infection. Additional host molecules including ADAM17, cathepsin L, CD147, and GRP78 may also function as receptors for SARS-CoV-2. To determine the expression and in situ localisation of candidate SARS-CoV-2 receptors in the respiratory mucosa, we analysed gene expression datasets from airway epithelial cells of 515 healthy subjects, gene promoter activity analysis using the FANTOM5 dataset containing 120 distinct sample types, single cell RNA sequencing (scRNAseq) of 10 healthy subjects, proteomic datasets, immunoblots on multiple airway epithelial cell types, and immunohistochemistry on 98 human lung samples. We demonstrate absent to low ACE2 promoter activity in a variety of lung epithelial cell samples and low ACE2 gene expression in both microarray and scRNAseq datasets of epithelial cell populations. Consistent with gene expression, rare ACE2 protein expression was observed in the airway epithelium and alveoli of human lung, confirmed with proteomics. We present confirmatory evidence for the presence of TMPRSS2, CD147, and GRP78 protein in vitro in airway epithelial cells and confirm broad in situ protein expression of CD147 and GRP78 in the respiratory mucosa. Collectively, our data suggest the presence of a mechanism dynamically regulating ACE2 expression in human lung, perhaps in periods of SARS-CoV-2 infection, and also suggest that alternate receptors for SARS-CoV-2 exist to facilitate initial host cell infection.
Although gene-finding in bacterial genomes is relatively straightforward, the automated assignment of gene function is still challenging, resulting in a vast quantity of hypothetical sequences of unknown function. But how prevalent are hypothetical sequences across bacteria, what proportion of genes in different bacterial genomes remain unannotated, and what factors affect annotation completeness? To address these questions, we surveyed over 27 000 bacterial genomes from the Genome Taxonomy Database, and measured genome annotation completeness as a function of annotation method, taxonomy, genome size, 'research bias' and publication date. Our analysis revealed that 52 and 79 % of the average bacterial proteome could be functionally annotated based on protein and domain-based homology searches, respectively. Annotation coverage using protein homology search varied significantly from as low as 14 % in some species to as high as 98 % in others. We found that taxonomy is a major factor influencing annotation completeness, with distinct trends observed across the microbial tree (e.g. the lowest level of completeness was found in the Patescibacteria lineage). Most lineages showed a significant association between genome size and annotation incompleteness, likely reflecting a greater degree of uncharacterized sequences in 'accessory' proteomes than in 'core' proteomes. Finally, research bias, as measured by publication volume, was also an important factor influencing genome annotation completeness, with early model organisms showing high completeness levels relative to other genomes in their own taxonomic lineages. Our work highlights the disparity in annotation coverage across the bacterial tree of life and emphasizes a need for more experimental characterization of accessory proteomes as well as understudied lineages.
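As a rough illustration of the quantity discussed above, annotation completeness can be thought of as the fraction of a genome's predicted proteins that receive at least one functional hit, averaged across genomes. The sketch below is a minimal example under that assumption; the function name, the toy genomes, and the family identifiers are all hypothetical and are not taken from the study's pipeline or data.

```python
from statistics import mean


def annotation_completeness(protein_hits):
    """Return the fraction of proteins with at least one annotation hit.

    protein_hits maps a protein ID to a list of functional hits
    (an empty list means the protein is 'hypothetical').
    """
    if not protein_hits:
        return 0.0
    annotated = sum(1 for hits in protein_hits.values() if hits)
    return annotated / len(protein_hits)


# Hypothetical toy data: genome -> protein ID -> annotation hits.
genomes = {
    "genome_A": {"p1": ["family_1"], "p2": [], "p3": ["family_2"], "p4": ["family_3"]},
    "genome_B": {"p1": [], "p2": [], "p3": ["family_1"], "p4": []},
}

# Completeness per genome, then the average across genomes.
per_genome = {name: annotation_completeness(p) for name, p in genomes.items()}
average_completeness = mean(per_genome.values())

for name, value in per_genome.items():
    print(f"{name}: {value:.0%} annotated")
print(f"average: {average_completeness:.0%}")
```

The same per-genome values could then be grouped by lineage or plotted against genome size to look for the kinds of trends the abstract describes.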
Background: Metagenomes provide access to the taxonomic composition and functional capabilities of microbial communities. Although metagenomic analysis methods exist for estimating overall community composition or metabolic potential, identifying specific taxa that encode specific functions or pathways of interest can be more challenging. Here we present MetAnnotate, which addresses the common question: “which organisms perform my function of interest within my metagenome(s) of interest?” MetAnnotate uses profile hidden Markov models to analyze shotgun metagenomes for genes and pathways of interest, classifies retrieved sequences either through a phylogenetic placement or best hit approach, and enables comparison of these profiles between metagenomes. Results: Based on a simulated metagenome dataset, the tool achieves high taxonomic classification accuracy for a broad range of genes, including both markers of community abundance and specific biological pathways. Lastly, we demonstrate MetAnnotate by analyzing for cobalamin (vitamin B12) synthesis genes across hundreds of aquatic metagenomes in a fraction of the time required by the commonly used Basic Local Alignment Search Tool top hit approach. Conclusions: MetAnnotate is multi-threaded and installable as a local web application or command-line tool on Linux systems. MetAnnotate is a useful framework for general and/or function-specific taxonomic profiling and comparison of metagenomes. Electronic supplementary material: The online version of this article (doi:10.1186/s12915-015-0195-4) contains supplementary material, which is available to authorized users.