2021
DOI: 10.1186/s12936-021-03788-x

Using deep learning to identify recent positive selection in malaria parasite sequence data

Abstract: Background: Malaria, caused by Plasmodium parasites, is a major global public health problem. To assist an understanding of malaria pathogenesis, including drug resistance, there is a need for the timely detection of underlying genetic mutations and their spread. With the increasing use of whole-genome sequencing (WGS) of Plasmodium DNA, the potential of deep learning models to detect loci under recent positive selection, historically signals of drug resistance, was evaluated. …

Cited by 20 publications (19 citation statements)
References: 40 publications
“…A revised set of SNPs and insertions/deletions (indels) was called with GATK’s HaplotypeCaller (version 4.1.4.1) using the option -ERC GVCF 5, 22. Variants were then assigned a quality score using GATK’s Variant Quality Score Recalibration (VQSR), and those with a VQSLOD score < 0, representing variants more likely to be false than true, were filtered out 7, 22. Additionally, SNPs were removed if they had more than 10% missing alleles 7, 22. The resulting dataset comprised of parasite genomes of P. falciparum (5,957 isolates, 750 k SNPs) and of P. vivax (659 isolates, 588 k SNPs).…”
Section: Methods (mentioning)
confidence: 99%
“…Variants were then assigned a quality score using GATK’s Variant Quality Score Recalibration (VQSR), and those with a VQSLOD score < 0, representing variants more likely to be false than true, were filtered out 7, 22. Additionally, SNPs were removed if they had more than 10% missing alleles 7, 22. The resulting dataset comprised of parasite genomes of P. falciparum (5,957 isolates, 750 k SNPs) and of P. vivax (659 isolates, 588 k SNPs). The population structure was assessed using a principal component analysis (PCA) of between isolate SNP differences.…”
Section: Methods (mentioning)
confidence: 99%
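
The two filtering thresholds quoted in the Methods statements above (VQSLOD below 0, and more than 10% missing alleles) can be sketched as follows. This is a minimal illustration only, not the citing authors' pipeline: the record layout (`vqslod`, `genotypes`), the function names, and the use of `None` for a missing call are all hypothetical, and for simplicity both filters are applied to every record rather than distinguishing SNPs from indels.

```python
from typing import Dict, List, Optional, Sequence

# Thresholds taken from the quoted statement; everything else is assumed.
VQSLOD_MIN = 0.0          # variants with VQSLOD < 0 are filtered out
MAX_MISSING_FRAC = 0.10   # variants with > 10% missing alleles are removed


def missing_fraction(genotypes: Sequence[Optional[int]]) -> float:
    """Fraction of isolates with no call (None) at this variant."""
    if not genotypes:
        return 1.0  # no data at all is treated as fully missing
    return sum(g is None for g in genotypes) / len(genotypes)


def passes_filters(vqslod: float, genotypes: Sequence[Optional[int]]) -> bool:
    """True if the variant survives both the VQSLOD and missingness filters."""
    return (
        vqslod >= VQSLOD_MIN
        and missing_fraction(genotypes) <= MAX_MISSING_FRAC
    )


def filter_variants(variants: List[Dict]) -> List[Dict]:
    """Keep only the records that pass both filters (hypothetical layout)."""
    return [v for v in variants if passes_filters(v["vqslod"], v["genotypes"])]


if __name__ == "__main__":
    toy = [
        {"id": "snp_a", "vqslod": 3.2, "genotypes": [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]},
        {"id": "snp_b", "vqslod": -1.4, "genotypes": [0] * 10},           # low VQSLOD
        {"id": "snp_c", "vqslod": 2.0, "genotypes": [0] * 8 + [None] * 2},  # 20% missing
    ]
    print([v["id"] for v in filter_variants(toy)])
```

In practice these filters would normally be applied to a VCF with dedicated tooling; the sketch only makes the direction of each threshold explicit (a variant is kept when VQSLOD is at least 0 and missingness is at most 10%).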
“…In [16], a DL-based approach (named DeepSweep) is conceived to train on the haplotypic image in a genetic region with identified sweeps for identifying loci under positive selection. The DL method detects positive selective signatures from malaria parasite WGS data.…”
Section: Related Work (mentioning)
confidence: 99%
“…Furthermore, the best features are selected using the whale optimization method for malaria cell classification with 99.67% accuracy (33). Malaria cells are classified using deep-sweep software with >0.95 ROC (34). Features are extracted from transfer learning models which are dense-net-201, dense-net-121, Resnet-101, Resnet-50, VGG-16, and VGG-19 for features extraction and input to SVM, NB, and KNN classifiers for malaria cell classification (35).…”
Section: Related Work (mentioning)
confidence: 99%