Current forensic DNA analysis predominantly involves identification of human donors by analysis of short tandem repeats (STRs) using Capillary Electrophoresis (CE). Recent developments in Massively Parallel Sequencing (MPS) technologies offer new possibilities in analysis of STRs since they might overcome some of the limitations of CE analysis. In this study 17 STRs and Amelogenin were sequenced in high coverage using a prototype version of the Promega PowerSeq™ system for 297 population samples from the Netherlands, Nepal, Bhutan and Central African Pygmies. In addition, 45 two-person mixtures with different minor contributions down to 1% were analysed to investigate the performance of this system for mixed samples. Regarding fragment length, complete concordance between the MPS and CE-based data was found, marking the reliability of MPS PowerSeq™ system. As expected, MPS presented a broader allele range and higher power of discrimination and exclusion rate. The high coverage sequencing data were used to determine stutter characteristics for all loci and stutter ratios were compared to CE data. The separation of alleles with the same length but exhibiting different stutter ratios lowers the overall variation in stutter ratio and helps in differentiation of stutters from genuine alleles in mixed samples. All alleles of the minor contributors were detected in the sequence reads even for the 1% contributions, but analysis of mixtures below 5% without prior information of the mixture ratio is complicated by PCR and sequencing artefacts.
Massively parallel sequencing (MPS) is on the advent of a broad scale application in forensic research and casework. The improved capabilities to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format allowing users to filter data with their own preferred thresholds.
Gene variant databases are the backbone of DNA-based diagnostics. These databases, also called Locus-Specific DataBases (LSDBs), store information on variants in the human genome and the observed phenotypic consequences. The largest collection of public databases uses the free, open-source LOVD software platform. To cope with the current demand for online databases, we have entirely redesigned the LOVD software. LOVD3 is genome-centered and can be used to store summary variant data, as well as full case-level data with information on individuals, phenotypes, screenings, and variants. While built on a standard core, the software is highly flexible and allows personalization to cope with the largely different demands of gene/disease database curators. LOVD3 follows current standards and includes tools to check variant descriptions, generate HTML files of reference sequences, predict the consequences of exon deletions/duplications on the reading frame, and link to genomic views in the different genomes browsers. It includes APIs to collect and submit data. The software is used by about 100 databases, of which 56 public LOVD instances are registered on our website and together contain 1,000,000,000 variant observations in 1,500,000 individuals. 42 LOVD instances share data with the federated LOVD data network containing 3,000,000 unique variants in 23,000 genes. This network can be queried directly, quickly identifying LOVD instances containing relevant information on a searched variant.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.