2016
DOI: 10.26226/morressier.5731f0d5d462b8029237fa96
Preprint

Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes

Abstract: Bacterial genomes vary extensively in terms of both gene content and gene sequence. This plasticity hampers the use of traditional SNP-based methods for identifying all genetic associations with phenotypic variation. Here we introduce a computationally scalable and widely applicable statistical method (SEER) for the identification of sequence elements that are significantly enriched in a phenotype of interest. SEER is applicable to tens of thousands of genomes by counting variable-length k-mers using a distrib…
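As a rough illustration of what "sequence elements enriched in a phenotype" can mean, the sketch below counts fixed-length k-mers in a handful of toy sequences and scores one k-mer with a one-sided Fisher (hypergeometric) test. This is a deliberately simplified stand-in, not SEER's method: the abstract describes variable-length k-mers, and the rest of its statistical model is cut off above. The sequences, the value of `k`, and the function names `kmers` and `enrichment_p` are all invented for the example.

```python
from math import comb

def kmers(seq, k):
    """Return the set of distinct k-mers of length k found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def enrichment_p(cases_with, cases_total, controls_with, controls_total):
    """One-sided hypergeometric p-value for over-representation among cases."""
    total = cases_total + controls_total
    with_kmer = cases_with + controls_with
    upper = min(with_kmer, cases_total)
    numerator = sum(
        comb(with_kmer, x) * comb(total - with_kmer, cases_total - x)
        for x in range(cases_with, upper + 1)
    )
    return numerator / comb(total, cases_total)

# Toy data (hypothetical): three isolates with the phenotype, three without.
cases = ["ACGTTGCA", "ACGTAGCA", "ACGTCGCA"]
controls = ["TTGCAAGG", "CCGCATTG", "GGGCATTA"]
k = 4

target = "ACGT"
cases_with = sum(target in kmers(s, k) for s in cases)
controls_with = sum(target in kmers(s, k) for s in controls)
p_value = enrichment_p(cases_with, len(cases), controls_with, len(controls))
print(target, cases_with, controls_with, p_value)
```

With six toy isolates the smallest achievable p-value is 1/20, which is one reason real analyses need large cohorts and a correction for population structure.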

Cited by 44 publications (69 citation statements)
References 29 publications
“…To calculate this association while also controlling for genetic background, we also used an LMM with the genetic kinship between isolates as random effects, as in genome-wide association studies (77,78). We used the pyseer package (version 1.2.0) in LMM mode, with the kinship/covariance matrix calculated from a neighbor-joining tree of all genome sequences in the cohort (53,79).…”
Section: Methods (mentioning)
confidence: 99%
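The statement above derives a kinship/covariance matrix from a phylogenetic tree. One common way to do that, sketched here as an assumption rather than as what the cited authors actually ran, is to set the covariance of two isolates to the branch length they share between the root and their most recent common ancestor. The tuple-based tree encoding `(name, branch_length, children)` and the function `tree_covariance` are hypothetical and are not part of pyseer.

```python
def tree_covariance(tree):
    """Map each leaf pair (a, b) to the root-to-MRCA branch length they share."""
    cov = {}

    def visit(node, depth):
        name, length, children = node
        depth += length  # distance from the root to this node
        if not children:
            cov[(name, name)] = depth  # diagonal: root-to-tip distance
            return [name]
        leaf_groups = [visit(child, depth) for child in children]
        # Leaves in different child subtrees have this node as their MRCA.
        for i, group_a in enumerate(leaf_groups):
            for group_b in leaf_groups[i + 1:]:
                for a in group_a:
                    for b in group_b:
                        cov[(a, b)] = cov[(b, a)] = depth
        return [leaf for group in leaf_groups for leaf in group]

    visit(tree, 0.0)
    return cov

# Toy tree (hypothetical): ((A:0.25, B:0.25):0.5, C:1.0)
toy_tree = ("root", 0.0, [
    ("n1", 0.5, [("A", 0.25, []), ("B", 0.25, [])]),
    ("C", 1.0, []),
])
K = tree_covariance(toy_tree)
print(K[("A", "B")], K[("A", "C")], K[("A", "A")])
```

In this toy tree A and B share the 0.5 branch above their common ancestor, while A and C share nothing below the root, so an LMM would treat A and B as correlated and C as independent of both.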
“…Using Scoary, single genes that had association with the characteristic "genus" were found, with low In addition to the methods using gene presence/absence and k-mers that were used in our study, other types of genetic variants can be used as input for microbial GWAS (31). The k-mer approach used in this study is able to detect different genetic variants such as SNPs, indels, variable promotor regions and gene content simultaneously (32). This indicates that adding purely SNP-based methods to the used methods is redundant as SNPs are already encompassed in the performed k-mer method.…”
Section: Discussion (mentioning)
confidence: 95%
“…We counted variable length k-mers with a minor allele count of at least ten using fsm-lite [18]. In the Dutch data, there were 11.7M informative k-mers, with 2.6M unique patterns.…”
Section: Methods (mentioning)
confidence: 99%
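The distinction between "informative k-mers" and "unique patterns" in the statement above can be sketched as follows: a k-mer is kept when its minor allele count (the smaller of the number of isolates carrying it and the number lacking it) reaches a threshold, and k-mers with identical presence/absence profiles collapse into one pattern. This is a minimal sketch on invented toy data, not fsm-lite's implementation; `informative_patterns` and the threshold of 2 are assumptions chosen to fit six toy isolates.

```python
from collections import defaultdict

def informative_patterns(kmer_presence, n_samples, min_mac=10):
    """Group k-mers passing the minor-allele-count filter by presence pattern."""
    patterns = defaultdict(list)
    for kmer, samples in kmer_presence.items():
        present = len(samples)
        mac = min(present, n_samples - present)  # minor allele count
        if mac >= min_mac:
            patterns[frozenset(samples)].append(kmer)
    return patterns

# Toy data (hypothetical): k-mer -> set of isolate indices that carry it.
n_samples = 6
kmer_presence = {
    "ACGT": {0, 1, 2},
    "CGTA": {0, 1, 2},           # same pattern as ACGT
    "GGGA": {0},                 # too rare: minor allele count 1
    "TTTC": {0, 1, 2, 3, 4, 5},  # present everywhere: minor allele count 0
    "AAGC": {3, 4},
}

patterns = informative_patterns(kmer_presence, n_samples, min_mac=2)
n_informative = sum(len(group) for group in patterns.values())
n_unique = len(patterns)
print(n_informative, n_unique)
```

Testing one representative per pattern rather than every k-mer reduces the number of association tests, which is why the count of unique patterns is reported alongside the k-mer count.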
“…Pathogen GWASs (pGWASs) provide a way to identify pneumococcal sequence variation associated with meningitis, independent of genetic background and in an unbiased manner. While GWAS is more challenging in bacteria than in humans due to strong population structure and high levels of pan-genomic variation, recent methodological advances have helped overcome these issues [18][19][20].…”
mentioning
confidence: 99%