Structural variants (SVs) are a major source of genetic and phenotypic variation, but remain challenging to accurately type and are hence poorly characterized in most species. We present an approach for reliable SV discovery in non-model species using whole genome sequencing and report 15,483 high-confidence SVs in 492 Atlantic salmon (Salmo salar L.) sampled from a broad phylogeographic distribution. These SVs recover population genetic structure with high resolution, include an active DNA transposon, widely affect functional features, and overlap more duplicated genes retained from an ancestral salmonid autotetraploidization event than expected. Changes in SV allele frequency between wild and farmed fish indicate polygenic selection on behavioural traits during domestication, targeting brain-expressed synaptic networks linked to neurological disorders in humans. This study offers novel insights into the role of SVs in genome evolution and the genetic architecture of domestication traits, along with resources supporting reliable SV discovery in non-model species.
Main
Modern genetics remains primarily focused on single nucleotide polymorphism (SNP) analyses, with a growing recognition of the importance of larger structural variants (SVs) including inversions, insertions, deletions and copy number variations (CNVs) (defined here as variants ≥100 bp), among others [1]. SVs affect a larger proportion of bases in human genomes than SNPs [4], are not always reliably tagged by SNPs [5], more frequently have regulatory impacts [6], and have been shown to alter the structure, presence, number, dosage, and regulation of many genes [1]. Nonetheless, SVs remain challenging to accurately type using whole genome sequence data [2-3], limiting our understanding of their biological roles and exploitation as genetic markers.
Consequently, there is a need for reliable SV detection approaches to fully exploit the fast-accumulating genome sequencing datasets in both model and non-model species, allowing for more complete genetics investigations. Many tools exist for SV discovery using short-read sequencing data, but all suffer from high false discovery rates (10-89%) [2,3,7]. This poses a challenge for truly de novo SV detection in previously unstudied species lacking 'gold standard' reference SVs to help distinguish true from false calls. Most studies rely on combining an ensemble of signals from different SV detection methods, although this strategy does not reliably improve performance and can in some cases aggravate false discovery [3]. Researchers therefore often apply independent experimental [8-9] or visualization methods [10] to validate a subset of SV calls. Overall, there remains an unsatisfactory lack of consensus on how to validate the quality of de novo SV datasets in most species [3].
Salmonids have the highest combined economic, ecological and scientific importance among all fish lineages, and have consequently been subject to hundreds of genetics stu...