MGnify: the microbiome analysis resource in 2020

Mitchell, Alex L.; Almeida, Alexandre; Beracochea, Martín; Burgin, Josephine; Cochrane, Guy; Crusoe, Michael R.; Kale, Varsha; Potter, Simon; Richardson, Lorna; Sakharova, Ekaterina; Scheremetjew, Maxim; Korobeynikov, Anton; Shlemov, Alexander; Kunyavskaya, Olga; Lapidus, Alla; Finn, Robert D.

doi:10.1093/nar/gkz1035

Cited by 459 publications

(555 citation statements)

References 54 publications

Supporting

Mentioning

553

Contrasting

Unclassified


“…Enterotypes were visualised by the PCoA plot. To find the sequences assigned to haloarchaea in other sample cohorts, we trawled both the publicly available human metagenomic and metataxonomic datasets using the EBI MGnify [62]. All the data were obtained from the EBI MGnify database.…”

Section: Library Preparation and Sequencing of Archaeal 16S rRNA Gene (mentioning)

confidence: 99%

The human gut archaeome: identification of diverse haloarchaea in Korean subjects

Kim, Whon, Lim et al. 2020

Preprint


Background: Archaea are one of the least-studied members of the gut-dwelling autochthonous microbiota. Few studies have reported the dominance of methanogens in the archaeal microbiome (archaeome) of the human gut, although limited information regarding the diversity and abundance of other archaeal phylotypes is available. Results: We surveyed the archaeome of faecal samples collected from 897 East Asian subjects living in South Korea. In total, 42.47% of faecal samples were positive for archaeal colonisation; these were subsequently subjected to archaeal 16S rRNA gene deep sequencing and real-time quantitative polymerase chain reaction-based abundance estimation. The mean archaeal relative abundance was 10.24% ± 4.58% of the total bacterial and archaeal abundance. We observed extensive colonisation of haloarchaea (95.54%) in the archaea-positive faecal samples, with 9.63% mean relative abundance in archaeal communities. Haloarchaea were relatively more abundant than methanogens in some samples. The presence of haloarchaea was also verified by fluorescence in situ hybridisation analysis. Owing to large inter-individual variations, we categorised the human gut archaeome into four archaeal enterotypes. Conclusions: The study demonstrated that the human gut archaeome is indigenous, responsive, and functional, expanding our understanding of the archaeal signature in the gut of human individuals.


“…Essentially the genome annotation script parses the kAAmer results and produces a GFF (General Feature Format) annotation file giving some threshold on the protein homology. The other use case is the profiling of a metagenome based on the MGnify database of the human gut (24). MGnify includes protein annotations from gene ontology, enzyme commission and kegg pathways, among others.…”

Section: Database Building (mentioning)

confidence: 99%

Fast protein database as a service with kAAmer

Déraspe, Boisvert, Laviolette et al. 2020

Preprint


Identification of proteins is one of the most computationally intensive steps in genomics studies. It usually relies on aligners that don't accommodate rich information on proteins and require additional pipelining steps for protein identification. We introduce kAAmer, a protein database engine based on amino acid k-mers, that supports fast identification of proteins with complementary annotations. Moreover, the databases can be hosted and queried remotely.

genomics | database | k-mers | proteins | comparative genomics | metagenomics

Correspondence: jacques.corbeil@fmed.ulaval.ca

Main

One fundamental task in genomics is the identification and annotation of DNA coding regions that translate into proteins via a genetic code. Protein databases increase in size as new variants, orthologous and paralogous genes are being sequenced. This is particularly true within the microbial world where bacterial proteomes' diversity follows their rapid evolution. For instance, UniProtKB (Swiss-Prot / TrEMBL) (1) and NCBI RefSeq (2) contain over 100 million bacterial proteins and that number grows rapidly. Identification of proteins often relies on accurate, but slow, alignment software such as BLAST or hidden Markov model (HMM) profiles (3,4). Although other approaches (such as DIAMOND (5)) have considerably improved the speed of searching proteins in large datasets, from a database standpoint much can be done to offer a more versatile experience. One such approach would be to expose the database as a permanent service making use of computational resources for increased performance (e.g. memory mapping) and leveraging the cloud for remote analyses via a Web API. Another approach would be to extend the result set with comprehensive information on protein targets to facilitate subsequent genomics and metagenomics analysis pipelines.

Alignment software usually relies on a seed-and-extend pattern using an index (two-way indexing in DIAMOND) to make local alignments between query and target sequences. However, there is a plethora of research techniques to bypass the computational cost of alignment. Alignment-free sequence analyses usually adopt k-mers (overlapping subsequences of length k) as the main element of quantification. They are extensively used in DNA sequence analyses ranging from genome assemblies (6) to genotyping variants (7), as well as genomics and metagenomics classification (8-10). In the present study, we introduce kAAmer, a fast and comprehensive protein database engine that was named after the usage of amino acid k-mers, which differs from the usual nucleic acid k-mers. We demonstrate the usefulness and efficiency of our approach in protein identification from a large dataset and antibiotic resistance gene identification from a pan-resistant bacterial genome. The database engine of kAAmer is based on log-structured merge-tree (LSM-tree) Key-Value (KV) stores (11). LSM-trees are used in data-intensive operations such as web indexing (12, 13), social networking (14) and online gaming (15,16). kAAmer uses Badger (17), an ef...
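The idea of alignment-free lookup with amino acid k-mers (overlapping subsequences of length k) can be illustrated with a minimal sketch. This is not kAAmer's implementation: an in-memory Python dictionary stands in for the LSM-tree key-value store, and the function names, the toy sequences, and the choice of k = 3 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the overlapping subsequences of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(proteins, k):
    """Map each k-mer to the set of protein ids that contain it.

    A plain dict is an assumption here; it stands in for a
    persistent key-value store.
    """
    index = defaultdict(set)
    for pid, seq in proteins.items():
        for km in kmers(seq, k):
            index[km].add(pid)
    return index


def query(index, seq, k):
    """Rank proteins by the number of distinct query k-mers they share."""
    hits = Counter()
    for km in set(kmers(seq, k)):
        for pid in index.get(km, ()):
            hits[pid] += 1
    return hits.most_common()


# Hypothetical toy database of two short protein sequences.
proteins = {"protA": "MKTAYIAKQR", "protB": "MSTNPKPQRK"}
index = build_index(proteins, k=3)
result = query(index, "TAYIAK", k=3)
print(result)
```

Counting shared k-mers gives a cheap similarity signal without computing an alignment; a real engine would typically add scoring and, where needed, an alignment step on the top candidates.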


“…Enterotypes were visualized by the PCoA plot. To find the sequences assigned to haloarchaea in other sample cohorts, we trawled both the publicly available human metagenomic and metataxonomic datasets using the EBI MGnify [60]. All the data were obtained from the EBI MGnify database.…”

Section: Analysis of 16S rRNA Gene Sequence Data (mentioning)

confidence: 99%

The human gut archaeome: identification of diverse haloarchaea in Korean subjects

Kim, Whon, Lim et al. 2020

Preprint


Background: Archaea are the least-studied members of the gut-dwelling autochthonous microbiota. Few studies have reported the exclusive dominance of methanogens in the archaeal microbiome (archaeome) of the human gut, although information regarding the diversity and abundance of other archaeal phylotypes is limited. Results: We surveyed the archaeome in the faecal samples collected from 897 normal East Asian subjects living in South Korea. In total, 42.47% of faecal samples were positive for archaeal colonization; these were subsequently subjected to archaeal 16S rRNA gene deep sequencing and abundance estimation. The mean archaeal abundance was 9.89 ± 4.48% of the total bacterial and archaeal abundance. We observed extensive colonization of haloarchaea (95.53%) in the Korean gut, with 9.63% mean relative abundance in archaeal communities. Haloarchaea were relatively more abundant than methanogens in certain samples. The presence of haloarchaea was also verified by fluorescence in situ hybridization analysis. Owing to large inter-individual variations, we categorized the human gut archaeome into four archaeal enterotypes. Conclusions: The study demonstrated that the human gut archaeome is indigenous, responsive, and functional, expanding our understanding of the archaeal signature in the gut of normal healthy individuals.
