Despite developments in targeted gene sequencing and whole-genome analysis techniques, the robust detection of all genetic variation, including structural variants, in and around genes of interest and in an allele-specific manner remains a challenge. Here we present targeted locus amplification (TLA), a strategy to selectively amplify and sequence entire genes on the basis of the crosslinking of physically proximal sequences. We show that, unlike other targeted re-sequencing methods, TLA works without detailed prior locus information, as one or a few primer pairs are sufficient for sequencing tens to hundreds of kilobases of surrounding DNA. This enables robust detection of single nucleotide variants, structural variants and gene fusions in clinically relevant genes, including BRCA1 and BRCA2, and enables haplotyping. We show that TLA can also be used to uncover insertion sites and sequences of integrated transgenes and viruses. TLA therefore promises to be a useful method in genetic research and diagnostics when comprehensive or allele-specific genetic information is needed.
Transgenesis has been a mainstay of mouse genetics for over 30 yr, providing numerous models of human disease and critical genetic tools in widespread use today. Generated through the random integration of DNA fragments into the host genome, transgenesis can lead to insertional mutagenesis if a coding gene or an essential element is disrupted, and there is evidence that larger scale structural variation can accompany the integration. The insertion sites of only a tiny fraction of the thousands of transgenic lines in existence have been discovered and reported, due in part to limitations in the discovery tools. Targeted locus amplification (TLA) provides a robust and efficient means to identify both the insertion site and content of transgenes through deep sequencing of genomic loci linked to specific known transgene cassettes. Here, we report the first large-scale analysis of transgene insertion sites from 40 highly used transgenic mouse lines. We show that the transgenes disrupt the coding sequence of endogenous genes in half of the lines, frequently involving large deletions and/or structural variations at the insertion site. Furthermore, we identify a number of unexpected sequences in some of the transgenes, including undocumented cassettes and contaminating DNA fragments. We demonstrate that these transgene insertions can have phenotypic consequences, which could confound certain experiments, emphasizing the need for careful attention to control strategies. Together, these data show that transgenic alleles display a high rate of potentially confounding genetic events and highlight the need for careful characterization of each line to assure interpretable and reproducible experiments.
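The abstract does not describe how insertion sites are called from the sequencing data. As a minimal sketch only, assume that reads linked to the transgene have already been mapped to the host genome and that, following the proximity principle, they cluster around the integration site. Under those assumptions, a candidate site can be approximated by the genomic bin with the most reads. The function name `peak_bin`, the 10 kb bin size, and the read coordinates are hypothetical and not taken from the paper.

```python
from collections import Counter

def peak_bin(read_positions, bin_size=10_000):
    """Return ((chrom, bin_start), read_count) for the bin holding the most reads.

    read_positions: iterable of (chromosome, coordinate) pairs for mapped reads.
    bin_size: width of each genomic bin in bp (illustrative default).
    """
    counts = Counter(
        (chrom, (pos // bin_size) * bin_size) for chrom, pos in read_positions
    )
    (chrom, start), n = counts.most_common(1)[0]
    return (chrom, start), n

# Hypothetical mapped read positions, for illustration only.
reads = [
    ("chr1", 5_000),
    ("chr7", 120_500),
    ("chr7", 121_000),
    ("chr7", 125_900),
    ("chr7", 131_000),
    ("chrX", 42),
]
print(peak_bin(reads))  # (('chr7', 120000), 3)
```

A real analysis would also need breakpoint-spanning reads to resolve the exact junction and any accompanying deletions or rearrangements; binning only narrows down the region.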
Cre/LoxP technology is widely used in the field of mouse genetics for spatial and/or temporal regulation of gene function. For Cre lines generated via pronuclear microinjection of a Cre transgene construct, the integration site is random and in most cases not known. Integration of a transgene can disrupt an endogenous gene, potentially interfering with interpretation of the phenotype. In addition, knowledge of where the transgene is integrated is important for planning of crosses between animals carrying a conditional allele and a given Cre allele in case the alleles are on the same chromosome. We have used targeted locus amplification (TLA) to efficiently map the transgene location in seven previously published Cre and CreERT2 transgenic lines. In all lines, transgene insertion was associated with structural changes of variable complexity, illustrating the importance of testing for rearrangements around the integration site. In all seven lines the exact integration site and breakpoint sequences were identified. Our methods, data and genotyping assays can be used as a resource for the mouse community and our results illustrate the power of the TLA method to not only efficiently map the integration site of any transgene, but also provide additional information regarding the transgene integration events.
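The point about planning crosses can be illustrated with a small check. Once a Cre integration site is mapped, it can be compared with the position of a conditional allele to flag pairs on the same chromosome. This is a hedged sketch: the `Locus` type, the `linkage_note` function, and the coordinates are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Locus:
    chrom: str  # chromosome name, e.g. "chr11"
    pos: int    # coordinate in bp

def linkage_note(cre: Locus, conditional: Locus) -> str:
    """Describe whether a Cre transgene and a conditional allele share a chromosome."""
    if cre.chrom != conditional.chrom:
        return "different chromosomes: alleles assort independently"
    distance = abs(cre.pos - conditional.pos)
    return f"same chromosome, {distance} bp apart: possible linkage, plan crosses accordingly"

# Hypothetical coordinates, for illustration only.
print(linkage_note(Locus("chr11", 3_000_000), Locus("chr11", 3_500_000)))
print(linkage_note(Locus("chr11", 3_000_000), Locus("chr4", 900_000)))
```

The distance is reported in base pairs only. Converting it to an expected recombination frequency would require a genetic map, which is outside this sketch.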