2019
DOI: 10.1101/760207
Preprint

Pre- and post-sequencing recommendations for functional annotation of human fecal metagenomes

Abstract: Background: Shotgun metagenomes are often assembled prior to annotation of genes, which biases the functional capacity of a community towards its most abundant members. For an unbiased assessment of community function, short reads need to be mapped directly to a gene or protein database. The ability to detect genes in short read sequences is dependent on pre- and post-sequencing decisions. The objective of the current study was to determine how library size selection, read length and format, protein database, e-val…

citations
Cited by 2 publications
(9 citation statements)
references
References 56 publications
“…S12). Next, we used a direct mapping approach to annotate remaining sequences against the CAZy and Peroxibase reference databases (100 total gene families; Table S2) using 'sensitive' DIAMOND (v. 0.9.29) [90] and BWA-MEM (v. 0.7.17) [91] respectively, following best practice for unmerged reads [92]. The compiled decay gene database primarily contained 'core' gene families found to be actively expressed during fungal decay of SOM and microbial biomass [93,94] (CAZy: http://www.cazy.org; http://peroxibase.toulouse.inra.fr/), and is hereafter referred to as the 'CAZy database' (Table S2).…”
Section: Methods (mentioning)
confidence: 99%
“…On average, 21.7% of sequences per sample were removed during this filtering step, yielding a mean of 307 041 274 putative fungal sequences per sample. Next, we used a direct mapping approach to annotate sequences against the CAZy and Peroxibase reference databases (100 total gene families; Table S1) using the 'sensitive' setting in DIAMOND (v. 0.9.29) with an -e value of 1e-4 (Buchfink et al, 2015) and BWA-MEM (v. 0.7.17) with standard settings (Li & Durbin, 2009), following recommendations for unmerged reads (Treiber et al, 2020). The compiled gene database primarily contained previously defined 'core' gene families found to be actively expressed during fungal decay of SOM (Peng et al, 2018; Floudas et al, 2020; CAZy: http://www.cazy.org; http://peroxibase.toulouse.…”
Section: Metagenomic Sequence Generation Processing and Annotation (mentioning)
confidence: 99%
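
The direct-mapping step quoted above could be scripted roughly as follows. This is a minimal sketch, not the citing authors' pipeline: it only builds the command lines, using DIAMOND's `--sensitive` and `--evalue` options and plain `bwa mem` defaults. The use of `blastx`, the tabular output format, and all file names (`cazy.dmnd`, `peroxibase.fa`, `sample_R1.fastq.gz`, and so on) are assumptions for illustration. Forward and reverse reads are kept as separate, unmerged inputs.

```python
import shlex


def diamond_blastx_cmd(db, reads, out, evalue="1e-4"):
    """Build a DIAMOND command that maps one file of short reads to a protein database.

    Assumption: translated search (blastx) with tabular output (--outfmt 6).
    """
    return [
        "diamond", "blastx",
        "--db", db,          # DIAMOND-formatted protein database (hypothetical name)
        "--query", reads,    # one read file; mates are searched separately, not merged
        "--out", out,
        "--sensitive",       # the 'sensitive' setting mentioned in the citation statement
        "--evalue", evalue,  # e-value cutoff of 1e-4, as quoted
        "--outfmt", "6",
    ]


def bwa_mem_cmd(index_prefix, reads_r1, reads_r2):
    """Build a BWA-MEM command with standard settings for paired, unmerged reads."""
    return ["bwa", "mem", index_prefix, reads_r1, reads_r2]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for mate in ("R1", "R2"):
        cmd = diamond_blastx_cmd(
            "cazy.dmnd", f"sample_{mate}.fastq.gz", f"sample_{mate}.cazy.tsv"
        )
        print(shlex.join(cmd))
    print(shlex.join(bwa_mem_cmd("peroxibase.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz")))
```

Building the argument lists separately from running them makes it easy to inspect the exact settings before passing a list to `subprocess.run`.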
“…Ectomycorrhizal (ECM) fungi, in particular, dominate boreal and temperate forest soil microbial communities, providing plant hosts with the majority of their annual nitrogen (N) requirements (c. 70%) (Smith & Read, 2010). Despite being one of the most studied microbial groups, understanding of the distribution of ECM fungal traits remain poorly understood (Courty et al, 2016;van der Linde et al, 2018;Meeds et al, 2021). One prominent ECM fungal functional trait is the acquisition of N bound in soil organic matter (N-SOM), which constitutes the majority of soil N (Vitousek & Howarth, 1991).…”
Section: Introduction (mentioning)
confidence: 99%