A curated library of circular dichroism (CD) spectra of 23 G-quadruplexes of known structure was built and analyzed. The goal of this study was to use this reference library to develop an algorithm that derives quantitative estimates of the secondary structure content of quadruplexes from their experimental CD spectra. Principal component analysis and singular value decomposition were used to characterize the reference spectral library. CD spectra were successfully fit to obtain estimates of the amounts of base steps in anti-anti, syn-anti or anti-syn conformations, in diagonal or lateral loops, or in other conformations. The results show that CD spectra of nucleic acids can be analyzed to obtain quantitative information about secondary structure content in a way analogous to the methods used to analyze protein CD spectra.

bioRxiv preprint first posted online Sep. 4, 2017; doi: http://dx.doi.org/10.1101/184390. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. All rights reserved. No reuse allowed without permission.

Circular dichroism (CD) spectroscopy is a primary tool for the characterization of G-quadruplex (G4) structures. G-quadruplexes are functionally important genomic elements that form at specific locations in an orchestrated manner throughout the cell cycle. [1] Different G4 structures, arising from differences in G-quartet stacking, strand segment orientation and loop arrangements, display distinct CD spectral signatures. [2] Qualitative rules of thumb have evolved that associate CD spectral features with particular G4 topologies, namely parallel (≈264 nm max, ≈245 nm min), antiparallel (≈295 nm max, ≈260 nm min) or "hybrid" (or 3+1) (≈295 nm max, ≈260 nm max, ≈245 nm min).
[3] [4] [5] [6] [7] While some exceptions to these rules have been noted, they are generally accepted for the characterization and validation of quadruplex formation in potential quadruplex-forming sequences.

CD spectroscopy is widely used for the quantitative determination of the secondary structural content of proteins. Over several decades, the reference libraries assembled for this purpose have grown in content, and a number of analytical algorithms have evolved that now make the quantitative analysis of protein CD spectra fairly routine. Such is not the case for nucleic acids. For duplex DNA, CD is used primarily in a qualitative way to distinguish common secondary structures (e.g., B-, A- and Z-forms) and is particularly valuable for monitoring changes in secondary structure in titration, binding or thermal denaturation experiments. [2] A recent chemometric analysis of nucleic acid CD spectra classified nucleic acid structures using a library of sequences and structures that expanded the range of topologies to include multistranded triplex and quadruplex forms. [8] As yet, no quantitative analysis has been attempted for nucleic acids that is analogous to the approach used for protein CD spectra to obtain more detailed structural information. The goal of this study is to...
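As a rough illustration of the protein-style quantitative approach referred to above (a reference library characterized by singular value decomposition and then used to fit an unknown spectrum), the following Python sketch may help. It is not the algorithm developed in this study. Everything in it is an assumption made for illustration: the spectra are synthetic Gaussian bands placed at the rule-of-thumb wavelengths quoted earlier, the two "structural classes" are toy stand-ins for the base-step categories, and the names `gaussian_band` and `estimate_fractions` are hypothetical. Only the library size of 23 is taken from the text.

```python
import numpy as np


def gaussian_band(wl, center, width, height):
    """Single Gaussian band used to build toy CD spectra (illustrative only)."""
    return height * np.exp(-0.5 * ((wl - center) / width) ** 2)


# Wavelength grid in nm (assumed range and spacing).
wavelengths = np.arange(220.0, 321.0, 1.0)

# Two toy basis spectra shaped after the qualitative rules of thumb.
# Band widths and heights are invented, not measured.
parallel_like = (gaussian_band(wavelengths, 264.0, 8.0, 1.0)
                 - gaussian_band(wavelengths, 245.0, 6.0, 0.4))
antiparallel_like = (gaussian_band(wavelengths, 295.0, 8.0, 1.0)
                     - gaussian_band(wavelengths, 260.0, 7.0, 0.5))
basis = np.column_stack([parallel_like, antiparallel_like])  # (n_wl, 2)

# Synthetic reference library: 23 spectra, each a known mixture of the basis.
rng = np.random.default_rng(0)
fractions = rng.dirichlet([1.0, 1.0], size=23).T  # (2, 23), columns sum to 1
library = basis @ fractions                       # (n_wl, 23)

# Characterize the library by SVD and count the significant components.
U, s, Vt = np.linalg.svd(library, full_matrices=False)
significant = int(np.sum(s > 1e-8 * s[0]))


def estimate_fractions(spectrum, library, fractions, rank):
    """Estimate structural fractions of one spectrum by truncated-SVD fitting.

    Solves fractions = X @ library for X using a rank-limited pseudoinverse
    of the library, then applies X to the unknown spectrum.
    """
    U, s, Vt = np.linalg.svd(library, full_matrices=False)
    U_r, s_r, Vt_r = U[:, :rank], s[:rank], Vt[:rank, :]
    library_pinv = Vt_r.T @ np.diag(1.0 / s_r) @ U_r.T  # (n_refs, n_wl)
    X = fractions @ library_pinv                        # (n_classes, n_wl)
    return X @ spectrum


# Fit a noise-free synthetic "unknown" built from a known 70:30 mixture.
unknown = basis @ np.array([0.7, 0.3])
estimate = estimate_fractions(unknown, library, fractions, rank=significant)
print("significant components:", significant)
print("estimated fractions:", np.round(estimate, 3))
```

Truncating the SVD at the number of significant singular values is what keeps the fit stable when the library contains many more spectra than independent spectral shapes. With real, noisy spectra the choice of rank would need to be justified from the data rather than from a fixed numerical threshold as assumed here.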