BWA-MEME: BWA-MEM emulated with a machine learning approach

Jung, Youngmok; Han, Dongsu

doi:10.1101/2021.09.01.457579

Cited by 10 publications

(2 citation statements)

References 24 publications


“…20 Following trimming, paired-end reads were mapped to the hg38 human reference genome using BWA-MEM2 (parameters: -K 1000000000 -M, BWA-MEM2 reference). 21 Germline variants were called using DeepVariant, a deep learning-based variant caller. 22 We used VEP 23 and Slivar 24 to tag variants associated with different genes and to remove common variants reported in gnomAD.…”

Section: Exome Sequencing (mentioning)

confidence: 99%

Genetic Risk Factors for Early-Onset Merkel Cell Carcinoma

Mohsin,

Hunt,

Yan

et al. 2024

JAMA Dermatol


Importance: Merkel cell carcinoma (MCC) is a rare, aggressive neuroendocrine skin cancer. Of the patients who develop MCC annually, only 4% are younger than 50 years.

Objective: To identify genetic risk factors for early-onset MCC via genomic sequencing.

Design, Setting, and Participants: The study represents a multicenter collaboration between the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), the National Institute of Allergy and Infectious Diseases (NIAID), and the University of Washington. Participants with early-onset and later-onset MCC were prospectively enrolled in an institutional review board–approved study at the University of Washington between January 2003 and May 2019. Unrelated controls were enrolled in the NIAID Centralized Sequencing Program (CSP) between September 2017 and September 2021. Analysis was performed from September 2021 and March 2023. Early-onset MCC was defined as disease occurrence in individuals younger than 50 years. Later-onset MCC was defined as disease occurrence at age 50 years or older. Unrelated controls were evaluated by the NIAID CSP for reasons other than familial cancer syndromes, including immunological, neurological, and psychiatric disorders.

Results: This case-control analysis included 1012 participants: 37 with early-onset MCC, 45 with later-onset MCC, and 930 unrelated controls. Among 37 patients with early-onset MCC, 7 (19%) had well-described variants in genes associated with cancer predisposition. Six patients had variants associated with hereditary cancer syndromes (ATM = 2, BRCA1 = 2, BRCA2 = 1, and TP53 = 1) and 1 patient had a variant associated with immunodeficiency and lymphoma (MAGT1). Compared with 930 unrelated controls, the early-onset MCC cohort was significantly enriched for cancer-predisposing pathogenic or likely pathogenic variants in these 5 genes (odds ratio, 30.35; 95% CI, 8.89-106.30; P < .001). No germline disease variants in these genes were identified in 45 patients with later-onset MCC. Additional variants in DNA repair genes were also identified among patients with MCC.

Conclusions and Relevance: Because variants in certain DNA repair and cancer predisposition genes are associated with early-onset MCC, genetic counseling and testing should be considered for patients presenting at younger than 50 years.
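The citation statement quoted above for this study names BWA-MEM2 with the parameters -K 1000000000 -M. A minimal Python sketch of how such a command line might be assembled is given here for orientation only. The function name `build_bwa_mem2_command` and all file names are hypothetical; only the two flags and the hg38 reference come from the quoted text, and the study's actual invocation is not shown in the source.

```python
import shlex


def build_bwa_mem2_command(reference, reads_1, reads_2, batch_bases=1_000_000_000):
    """Return an argv list for a paired-end `bwa-mem2 mem` run.

    Hypothetical helper: only the -K and -M flags are taken from the
    quoted citation statement; paths are placeholders.
    """
    return [
        "bwa-mem2", "mem",
        "-K", str(batch_bases),  # fixed number of input bases per batch
        "-M",                    # mark shorter split hits as secondary
        reference, reads_1, reads_2,
    ]


# Placeholder file names, assumed for illustration.
cmd = build_bwa_mem2_command(
    "hg38.fa", "sample_R1.trimmed.fastq.gz", "sample_R2.trimmed.fastq.gz"
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting concerns.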


“…Firstly, The variant calling process was conducted by SAMtools 'mpileup' command, and the single nucleotide polymorphisms (SNPs) were identified by BCFtools 'call' command [31]. Secondly, to validate the SNPs, we also employed Genome Analysis Toolkit (GATK, v4.0) to detect SNPs [30], we mapped the clean reads to the reference using BWA-MEM with default parameters [53]; the multiple tools ('MarkDuplicates', 'HaplotypeCaller' and 'VariantFiltration', etc.) implemented in GATK [30] were used to obtain high-quality SNPs, with strict filter settings "QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0". For chloroplast, based on their SNP-calling results and gene annotation files, RNA editing sites were identified by using the REDO tool [54].…”

Section: Identification of RNA Editing Sites (mentioning)

confidence: 99%

Genome-wide identification and expression analysis of peach multiple organellar RNA editing factors reveals the roles of RNA editing in plant immunity

et al. 2022


Background Multiple organellar RNA editing factor (MORF) genes play key roles in chloroplast developmental processes by mediating RNA editing of Cytosine-to-Uracil conversion. However, the function of MORF genes in peach (Prunus persica), a perennial horticultural crop species of Rosaceae, is still not well known, particularly the resistance to biotic and abiotic stresses that threaten peach yield seriously. Results In this study, to reveal the regulatory roles of RNA editing in plant immunity, we implemented genome-wide analysis of peach MORF (PpMORF) genes in response to biotic and abiotic stresses. The chromosomal and subcellular location analysis showed that the identified seven PpMORF genes distributed on three peach chromosomes were mainly localized in the mitochondria and chloroplast. All the PpMORF genes were classified into six groups and one pair of PpMORF genes was tandemly duplicated. Based on the meta-analysis of two types of public RNA-seq data under different treatments (biotic and abiotic stresses), we observed down-regulated expression of PpMORF genes and reduced chloroplast RNA editing, especially the different response of PpMORF2 and PpMORF9 to pathogens infection between resistant and susceptible peach varieties, indicating the roles of MORF genes in stress response by modulating the RNA editing extent in plant immunity. Three upstream transcription factors (MYB3R-1, ZAT10, HSFB3) were identified under both stresses, they may regulate resistance adaption by modulating the PpMORF gene expression. Conclusion These results provided the foundation for further analyses of the functions of MORF genes, in particular the roles of RNA editing in plant immunity. In addition, our findings will be conducive to clarifying the resistance mechanisms in peaches and open up avenues for breeding new cultivars with high resistance.
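The citation statement quoted above for this study lists a hard-filter expression over six variant annotations (QD, MQ, FS, SOR, MQRankSum, ReadPosRankSum). The following Python sketch restates that boolean expression as a plain function, to make the "any threshold violated" logic explicit. It is not GATK code: the function name `fails_hard_filter`, the dictionary input, and the choice to skip missing annotations are all assumptions made for illustration.

```python
# Thresholds copied from the quoted filter expression; a variant is
# flagged when ANY of these comparisons is true (the "||" in the quote).
HARD_FILTER_CHECKS = [
    ("QD", lambda v: v < 2.0),
    ("MQ", lambda v: v < 40.0),
    ("FS", lambda v: v > 60.0),
    ("SOR", lambda v: v > 3.0),
    ("MQRankSum", lambda v: v < -12.5),
    ("ReadPosRankSum", lambda v: v < -8.0),
]


def fails_hard_filter(info):
    """Return True if any quoted threshold is violated.

    `info` maps annotation names to numeric values. Annotations absent
    from `info` are skipped, which is an assumption of this sketch and
    not a statement about how GATK treats missing values.
    """
    return any(
        key in info and violates(info[key])
        for key, violates in HARD_FILTER_CHECKS
    )


# Hypothetical annotation values for two variants.
good = {"QD": 20.0, "MQ": 60.0, "FS": 1.0, "SOR": 0.8,
        "MQRankSum": 0.1, "ReadPosRankSum": 0.3}
bad = {"QD": 1.5, "MQ": 60.0}

print(fails_hard_filter(good), fails_hard_filter(bad))
```

The comparisons are strict, so a value sitting exactly on a threshold (for example MQ = 40.0) is not flagged.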


Genetic polymorphism and evidence of signatures of selection in the Plasmodium falciparum circumsporozoite protein gene in Tanzanian regions with different malaria endemicity

Lyimo,

Bakari,

Popkin-Hall

et al. 2024

Preprint


Background: In 2021 and 2023, the World Health Organization approved RTS,S/AS01 and R21/Matrix M malaria vaccines, respectively, for routine immunization of children in African countries with moderate to high transmission. These vaccines are made of Plasmodium falciparum circumsporozoite protein (Pfcsp) but polymorphisms in this gene raises concerns regarding strain-specific responses and the long-term efficacy of these vaccines. This study assessed the Pfcsp genetic diversity, population structure and signatures of selection among parasites from areas of different malaria transmission in mainland Tanzania, to generate baseline data before the introduction of the malaria vaccines in the country.

Methods: The analysis involved 589 whole genome sequences generated by and as part of the MalariaGEN Community Project. The samples were collected between 2013 and January 2015 from five regions of mainland Tanzania: Morogoro and Tanga (Muheza) (moderate transmission areas), and Kagera (Muleba), Lindi (Nachingwea), and Kigoma (Ujiji) (high transmission areas). Wright’s inbreeding coefficient (Fws), Wright’s fixation index (FST), principal component analysis, nucleotide diversity, and Tajima’s D were used to assess within-host parasite diversity, population structure and natural selection.

Results: Based on Fws (< 0.95), there was high polyclonality (ranged from 69.23% in Nachingwea to 56.9% in Muheza). No population structure was detected in the Pfcsp gene in the five regions (mean FST = 0.0068). The average nucleotide diversity (π), nucleotide differentiation (K) and haplotype diversity (Hd) in the five regions were 4.19, 0.973 and 0.0035, respectively. The C-terminal region of Pfcsp showed high nucleotide diversity at Th2R and Th3R regions. Positive values for the Tajima’s D were observed in the Th2R and Th3R regions consistent with balancing selection. The Pfcsp C-terminal sequences had 50 different haplotypes (H_1 to H_50) and only 2% of sequences matched the 3D7 strain haplotype (H_50).

Conclusions: The findings demonstrate high diversity of the Pfcsp gene with limited population differentiation. The Pfcsp gene showed positive Tajima’s D values for parasite populations, consistent with balancing selection for variants within Th2R and Th3R regions. This data is consistent with other studies conducted across Africa and worldwide, which demonstrate low 3D7 haplotypes and little population structure. Therefore, additional research is warranted, incorporating other regions and more recent data to comprehensively assess trends in genetic diversity within this important gene. Such insights will inform the choice of alleles to be included in the future vaccines.

