Hybridization is a key molecular process in biology and biotechnology, but to date there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here we report a weighted neighbor voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similar reactions with known rate constants. To construct this algorithm, we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt subsequences of the CYCS and VEGF genes) at temperatures ranging from 28°C to 55°C. Automated feature selection and weighting optimization resulted in a final 6-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ≈91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows design of efficient probe sequences for genomics research.
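To make the idea of weighted neighbor voting concrete, the following is a minimal sketch, not the authors' implementation. It assumes each reaction is already described by a numeric feature vector and a measured log10 rate constant; the exponential vote kernel, the `bandwidth` parameter, and the function names are illustrative assumptions that do not come from the abstract.

```python
import math

def wnv_predict(query, known, weights, bandwidth=1.0):
    """Predict log10(rate constant) for `query` from reactions with known rates.

    query     : feature values of the new sequence
    known     : list of (features, log10_k) pairs with measured rate constants
    weights   : per-feature weights (hypothetical values; the paper optimizes its own)
    bandwidth : decay scale of the vote kernel (an assumption of this sketch)
    """
    num = 0.0
    den = 0.0
    for feats, log_k in known:
        # Weighted Euclidean distance between the query and this neighbor.
        d = math.sqrt(sum(w * (q - f) ** 2
                          for w, q, f in zip(weights, query, feats)))
        # Closer neighbors cast larger votes (assumed exponential kernel).
        vote = math.exp(-d / bandwidth)
        num += vote * log_k
        den += vote
    return num / den

def loo_accuracy(known, weights, factor=3.0):
    """Fraction of leave-one-out predictions within `factor` of the measurement."""
    tol = math.log10(factor)
    hits = 0
    for i, (feats, log_k) in enumerate(known):
        rest = known[:i] + known[i + 1:]  # hold out reaction i
        if abs(wnv_predict(feats, rest, weights) - log_k) <= tol:
            hits += 1
    return hits / len(known)
```

Working in log10 space means that "within a factor of 3" becomes a fixed tolerance of log10(3) on the prediction error, which is how `loo_accuracy` scores each held-out reaction.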
In silico designed nucleic acid probes and primers often fail to achieve favorable specificity and sensitivity tradeoffs on the first try, and iterative empirical sequence-based optimization is needed, particularly in multiplexed assays. Here, we present a novel, on-the-fly method of tuning probe affinity and selectivity via the stoichiometry of auxiliary species, allowing independent and decoupled adjustment of hybridization yield for different probes in multiplexed assays. Using this method, we achieve near-continuous tuning of probe effective free energy (0.03 kcal·mol−1 granularity). As applications, we enforced uniform capture efficiency of 31 DNA molecules (GC content 0% to 100%), maximized signal difference for 11 pairs of single nucleotide variants, and performed tunable hybrid-capture of mRNA from total RNA. Using the Nanostring nCounter platform, we applied stoichiometric tuning to simultaneously adjust yields for a 24-plex assay, and we show multiplexed quantitation of RNA sequences and variants from formalin-fixed, paraffin-embedded (FFPE) samples.
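To give a sense of scale for the 0.03 kcal·mol−1 granularity, the short sketch below converts a free energy step into the corresponding fold change in equilibrium constant using the standard relation ΔG° = −RT ln K. The temperature of 25°C is an assumption made for illustration; the abstract does not state the assay temperature.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fold_change_in_K(delta_g_step, temp_celsius=25.0):
    """Fold change in equilibrium constant for a free energy step (kcal/mol).

    Follows from dG = -RT ln K: changing dG by `delta_g_step` multiplies
    K by exp(delta_g_step / RT). The default temperature is assumed.
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    return math.exp(delta_g_step / rt)

step = fold_change_in_K(0.03)
print(f"0.03 kcal/mol step changes K by a factor of {step:.3f}")
```

Under this assumption a single step shifts the equilibrium constant by roughly 5%, which is why the tuning can be described as near-continuous.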
Understanding the thermodynamics of DNA motifs is important for prediction and design of probes and primers, but melt curve analyses are low-throughput and produce inaccurate results for motifs such as bulges and mismatches. Here, we developed a new, accurate and high-throughput method for measuring DNA motif thermodynamics called TEEM (Toehold Exchange Energy Measurement). TEEM is a refined framework for comparing two toehold exchange reactions, which are competitive strand displacement reactions between oligonucleotides. In a single experiment, TEEM can measure over 1000 ΔG° values with a standard error of roughly 0.05 kcal/mol.
Theoretical and experimental evidence for non-linear hydrogen bonds in protein helices is ubiquitous. In particular, amide three-centered hydrogen bonds are common features of helices in high-resolution crystal structures of proteins. These high-resolution structures (1.0 to 1.5 Å nominal crystallographic resolution) position backbone atoms without significant bias from modeling constraints and identify φ = -62°, ψ = -43° as the consensus backbone torsional angles of protein helices. These torsional angles preserve the atomic positions of α-β carbons of the classic Pauling α-helix while allowing the amide carbonyls to form bifurcated hydrogen bonds as first suggested by Némethy et al. in 1967. Molecular dynamics simulations of a capped 12-residue oligoalanine in water with AMOEBA (Atomic Multipole Optimized Energetics for Biomolecular Applications), a second-generation force field that includes multipole electrostatics and polarizability, reproduce the experimentally observed high-resolution helical conformation and correctly reorient the amide-bond carbonyls into bifurcated hydrogen bonds. This simple modification of backbone torsional angles reconciles experimental and theoretical views to provide a unified view of amide three-centered hydrogen bonds as crucial components of protein helices. They have been overlooked by structural biologists because the small crankshaft-like changes in orientation of the amide bond allow the overall helical parameters (helix pitch (p) and residues per turn (n)) to be maintained. The Pauling 3.6₁₃ α-helix fits the high-resolution experimental data with the minor exception of the amide-carbonyl electron density, but the previously associated backbone torsional angles (φ, ψ) needed slight modification to be reconciled with three-atom centered H-bonds and multipole electrostatics. Thus, a new standard helix, the 3.6₁₃/₁₀-, Némethy- or N-helix, is proposed.
Because low-resolution refinement of protein electron density relies on constraints from monopole force fields and on assumed secondary structures, low-resolution structures in the PDB often show linear hydrogen bonding.
Targeted high-throughput DNA sequencing is a primary approach for genomics and molecular diagnostics, and more recently has served as a readout for DNA information storage. Oligonucleotide probes used to enrich gene loci of interest have different hybridization kinetics, resulting in non-uniform coverage that increases sequencing costs and decreases sequencing sensitivity. Here, we present a deep learning model (DLM) for predicting Next-Generation Sequencing (NGS) depth from DNA probe sequences. Our DLM includes a bidirectional recurrent neural network that takes as input both the DNA nucleotide identities and the calculated probability of each nucleotide being unpaired. We apply our DLM to three different NGS panels: a 39,145-plex panel for human single nucleotide polymorphisms (SNP), a 2000-plex panel for human long non-coding RNA (lncRNA), and a 7373-plex panel targeting non-human sequences for DNA information storage. In cross-validation, our DLM predicts sequencing depth to within a factor of 3 with 93% accuracy for the SNP panel, and 99% accuracy for the non-human panel. In independent testing, the DLM predicts the lncRNA panel with 89% accuracy when trained on the SNP panel. The same model is also effective at predicting the measured single-plex kinetic rate constants of DNA hybridization and strand displacement.
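The per-nucleotide input described above can be pictured as one row per position, combining the base identity with its unpaired probability. The sketch below is one plausible encoding, not the authors' code: the one-hot layout, the column order, and the function name `encode_probe` are assumptions, and the unpaired probabilities are taken as already computed by a secondary-structure tool of the reader's choice.

```python
def encode_probe(seq, p_unpaired):
    """Encode a DNA probe as rows of [A, C, G, T, p_unpaired].

    seq        : probe sequence over the alphabet A, C, G, T
    p_unpaired : per-position probability of being unpaired, precomputed
                 elsewhere (this sketch does not calculate it)
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    if len(seq) != len(p_unpaired):
        raise ValueError("sequence and probability lengths differ")
    rows = []
    for base, p in zip(seq.upper(), p_unpaired):
        one_hot = [0.0, 0.0, 0.0, 0.0]
        one_hot[index[base]] = 1.0       # nucleotide identity
        rows.append(one_hot + [float(p)])  # append structural feature
    return rows

# Example with made-up probabilities for a 4 nt sequence.
features = encode_probe("ACGT", [0.9, 0.8, 0.2, 0.1])
```

A recurrent network would then read these rows in order, in both directions, so that each position is interpreted in the context of its neighbors.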