Construction of a strawberry breeding core collection to capture and exploit genetic variation

Koorevaar, Tim; Willemsen, Johan H.; Visser, R.G.F.; Arens, Paul; Maliepaard, Chris

doi:10.21203/rs.3.rs-2591434/v1

Cited by 1 publication

(1 citation statement)

References 33 publications

(55 reference statements)

“…This could be the case for chr_7B because a major gene for resistance to Phytophthora cactorum (FaRPc-2) is located on this chromosome [23]. To mitigate this, a larger diversity panel could be utilized, focusing on adding rare genetic variation instead of genetic variation already represented well in the original panel, for example by constructing a breeding core collection [24].…”

Section: LD Decay Plots Illustrate Why LD-based Filtering Work

mentioning

confidence: 99%

How to handle high subgenome sequence similarity in allopolyploid Fragaria x ananassa: Linkage Disequilibrium Based Variant Filtering

Koorevaar, Willemsen, Hildebrand et al. 2024

Preprint

Background: The allo-octoploid F. x ananassa consistently follows disomic inheritance, so diploid variant calling pipelines can be used. However, the high similarity among its subgenomes increases the error rate of the called variants. In particular, when short sequencing reads (150 bp) are aligned to a reference genome, reads can be aligned to the wrong subgenome, resulting in erroneous variants. It is important to know which subgenome contributes to a desired phenotypic value of a particular trait, and filtering out these erroneous variants decreases the chance that the wrong subgenome is traced for that trait. To mitigate the problem, variants first need to be classified into categories: correct variants (type 1) and two erroneous variant types, homoeologous variants (type 2) and multi-locus variants (type 3).

Results: Erroneous variant types (type 2 and 3) often, but not always, have skewed average allele balances (of heterozygous calls). The average allele balance of heterozygous variants is therefore not sufficient to tag all erroneous variants in F. x ananassa. Erroneous variants that were not identified this way were further checked by an LD-based method in a diversity panel. This method predicted variant types with 99% similarity to a validation method that used a genetic map from a biparental mapping population. The effect of the filtering methods on phasing accuracy was assessed using SHAPEIT5 for phasing. The lowest switch error rate (0.037) was obtained by combining LD-based and average allele balance (AAB) filtering, although adding the latter improved the switch error rate only slightly. This indicates that the LD-based method tags most erroneous variants with a skewed average allele balance as well as other erroneous variants. The dataset resulting from the best filtering method (LD-based + AAB) had a 44% lower switch error rate than the original dataset and retained 72% of the overall variants.

Conclusions: Erroneous variants that arise from high sequence similarity in allopolyploids could be identified without the need to genotype many mapping populations. This LD-based filtering method improved phasing accuracy and ensures that important alleles are better traceable through the germplasm.
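The two quantities the abstract relies on, the average allele balance of heterozygous calls and the switch error rate, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes allele balance means the ALT read fraction at a heterozygous call, and it uses the common definition of switch error rate (phase switches between consecutive heterozygous sites divided by the number of such site pairs). The function names and the 0.35 and 0.65 skew thresholds are hypothetical and do not come from the abstract.

```python
from typing import Optional, Sequence, Tuple


def average_allele_balance(
    read_depths: Sequence[Tuple[int, int]],
    genotypes: Sequence[Tuple[int, int]],
) -> Optional[float]:
    """Mean ALT read fraction over the heterozygous calls of one variant.

    read_depths holds (ref_reads, alt_reads) per sample and genotypes holds
    the diploid call per sample. Returns None if there is no usable
    heterozygous call.
    """
    balances = []
    for (ref_reads, alt_reads), genotype in zip(read_depths, genotypes):
        depth = ref_reads + alt_reads
        if sorted(genotype) == [0, 1] and depth > 0:
            balances.append(alt_reads / depth)
    if not balances:
        return None
    return sum(balances) / len(balances)


def is_skewed(aab: Optional[float], low: float = 0.35, high: float = 0.65) -> bool:
    """Flag a variant whose average allele balance is far from 0.5.

    The thresholds are illustrative assumptions, not values from the source.
    """
    return aab is not None and not (low <= aab <= high)


def switch_error_rate(truth: Sequence[int], inferred: Sequence[int]) -> float:
    """Fraction of consecutive heterozygous site pairs with a phase switch.

    Both inputs list the allele (0 or 1) carried by haplotype 1 at each
    heterozygous site of one sample, in genomic order.
    """
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must have the same length")
    if len(truth) < 2:
        raise ValueError("need at least two heterozygous sites")
    # True where the inferred haplotype 1 agrees with the true haplotype 1.
    orientation = [t == i for t, i in zip(truth, inferred)]
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(truth) - 1)
```

A variant flagged by `is_skewed` would be a candidate type 2 or type 3 variant. As the abstract notes, an unskewed balance does not prove a variant is correct, which is why the LD-based check is applied as well.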
