2018
DOI: 10.1186/s12859-018-2333-9

Fast estimation of genetic relatedness between members of heterogeneous populations of closely related genomic variants

Abstract: Background: Many biological analysis tasks require extraction of families of genetically similar sequences from large datasets produced by Next-generation Sequencing (NGS). Such tasks include detection of viral transmissions by analysis of all genetically close pairs of sequences from viral datasets sampled from infected individuals or studying of evolution of viruses or immune repertoires by analysis of network of intra-host viral variants or antibody clonotypes formed by genetically close sequences. The most o…

Cited by 4 publications (4 citation statements)
References 20 publications

“…Viral diversity was measured by (i) the mutation frequency (mutant clones divided by the total number of clones analyzed) (ii) the Shannon entropy [39][40][41], (iii) the Simpson index of diversity (1-D) [42] and (iv) the Hamming distances [43][44][45]. Shannon entropy of each brain was calculated using the following formula [40]:…”
Section: Measurement Of Absolute Diversity (mentioning)
confidence: 99%
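
The diversity measures named in this statement can be illustrated with a minimal sketch. The formula used by the citing paper is elided in the excerpt, so the code below assumes the textbook definitions: unnormalized Shannon entropy with the natural logarithm, and Simpson's 1-D with D as the sum of squared variant frequencies. The function names and the toy clone list are hypothetical.

```python
import math
from collections import Counter


def mutation_frequency(clones, reference):
    """Mutant clones divided by the total number of clones analyzed."""
    mutants = sum(1 for c in clones if c != reference)
    return mutants / len(clones)


def shannon_entropy(clones):
    """Unnormalized Shannon entropy (natural log) over distinct variants.

    Assumption: the citing paper may normalize differently (e.g. by ln N).
    """
    n = len(clones)
    counts = Counter(clones)
    return -sum((k / n) * math.log(k / n) for k in counts.values())


def simpson_diversity(clones):
    """Simpson index of diversity, 1 - D, with D = sum of squared frequencies."""
    n = len(clones)
    counts = Counter(clones)
    return 1.0 - sum((k / n) ** 2 for k in counts.values())


# Toy data, invented for illustration only.
clones = ["ACGT", "ACGT", "ACGA", "TCGT"]
reference = "ACGT"

print(mutation_frequency(clones, reference))
print(shannon_entropy(clones))
print(simpson_diversity(clones))
```

All three functions treat each clone as one observation, so identical sequences contribute to the same variant frequency.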
“…The Hamming distance [43][44][45] (Figure 3A,B) measures the number of nucleotide substitutions in viral genomes and group variants based on the number of substitutions when compared with the reference sequence (GenBank X03700). Clones with the same number of mutations are grouped together.…”
Section: Impact Of the Yfv-17d Inoculum Size On Evolution Of The Viru… (mentioning)
confidence: 99%
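
The grouping described in this statement can be sketched as follows, assuming the clones are already aligned to the reference and have equal length. The helper names and the short example sequences are hypothetical and are not taken from the citing paper.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def group_by_distance(clones, reference):
    """Map each substitution count to the clones at that distance from the reference."""
    groups = defaultdict(list)
    for clone in clones:
        groups[hamming(clone, reference)].append(clone)
    return dict(groups)


# Toy data, invented for illustration only.
groups = group_by_distance(["ACGT", "ACGA", "TCGA", "ACGT"], "ACGT")
for distance in sorted(groups):
    print(distance, groups[distance])
```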
“…Despite the simplicity of the metric, its calculation is challenging for extremely large NGS datasets, since its naive implementation requires a pairwise comparison of sequences from all pairs of patients. To address this challenge, several filtering techniques have been proposed [149,150]. In consecutive studies [43,44,131,151], more sophisticated distance measures for quasispecies populations have been proposed.…”
Section: Outbreak Investigation (mentioning)
confidence: 99%
“…Both systems can work with haplotypes obtained from NGS data and are scalable for extremely large datasets produced by Illumina MiSeq and other sequencing platforms. In particular, GHOST employs several efficient k-mer-based filtering techniques for viral sequence similarity queries, which allow for the elimination of an exhaustive comparison of all pairs of viral haplotypes and allow processing of NGS data from a given HCV outbreak in minutes [150].…”
Section: Molecular Surveillance Systems and Databases (mentioning)
confidence: 99%
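
To illustrate how a filter can avoid comparing every pair of sequences, here is a minimal sketch of a generic pigeonhole filter over fixed-position segments. It is not the method of the cited paper or of GHOST, whose details are not given in these excerpts. The idea: if two aligned sequences differ in at most t positions, then splitting them into t + 1 segments leaves at least one segment identical, so only sequences that share a segment need an exact comparison. The function names and toy sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def close_pairs(seqs, t):
    """Return sorted index pairs (i, j), i < j, with Hamming distance <= t.

    Assumes all sequences are aligned to the same length.
    """
    if not seqs:
        return []
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # Split positions 0..n-1 into t + 1 contiguous segments.
    parts = t + 1
    bounds = [(p * n // parts, (p + 1) * n // parts) for p in range(parts)]

    # Bucket sequences by (segment index, segment content).
    buckets = defaultdict(list)
    for i, s in enumerate(seqs):
        for p, (lo, hi) in enumerate(bounds):
            buckets[(p, s[lo:hi])].append(i)

    # Only pairs sharing at least one bucket can be within distance t.
    candidates = set()
    for members in buckets.values():
        candidates.update(combinations(members, 2))

    # Verify each candidate with the exact distance.
    return sorted((i, j) for i, j in candidates if hamming(seqs[i], seqs[j]) <= t)


# Toy data, invented for illustration only.
seqs = ["ACGTACGT", "ACGTACGA", "TTTTTTTT", "ACGAACGA"]
print(close_pairs(seqs, 1))
```

The filter is lossless: it returns the same pairs as the exhaustive comparison, and the savings depend on how many sequences share segments.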