Oncogene amplification, a major driver of cancer pathogenicity, is often mediated through focal amplification of genomic segments. Recent results implicate extrachromosomal DNA (ecDNA) as the primary mechanism driving focal copy number amplification (fCNA), enabling gene amplification, rapid tumor evolution, and the rewiring of regulatory circuitry. Resolving an fCNA's structure is a first step in deciphering the mechanisms of its genesis and the subsequent biological consequences. Here, we introduce a powerful new computational method, AmpliconReconstructor (AR), for integrating optical mapping (OM) of long DNA fragments (>150 kb) with next-generation sequencing (NGS) to resolve fCNAs at single-nucleotide resolution. AR uses an NGS-derived breakpoint graph alongside OM scaffolds to produce high-fidelity reconstructions. After validating performance by extensive simulations, we used AR to reconstruct fCNAs in seven cancer cell lines to reveal the complex architecture of ecDNA, breakage-fusion-bridge cycles, and other complex rearrangements. By distinguishing between chromosomal and extrachromosomal origins, and by reconstructing the rearrangement signatures associated with a given fCNA's generative mechanism, AR enables a more thorough understanding of the origins of fCNAs and their functional consequences.

Main:
Oncogene amplification is a major driver of cancer pathogenicity 1-5. Genomic signatures of oncogene amplification include somatic focal copy number amplifications (fCNAs) of small (typically <10 Mbp) genomic regions 5,6. Multiple mechanisms cause fCNAs including, but not limited to, extrachromosomal DNA (ecDNA) formation 5,7,8, chromothripsis 9, tandem duplications 10,11 and breakage-fusion-bridge (BFB) cycles 12-14.
ecDNA, in particular, enables tumors to achieve far higher oncogene genomic copy numbers and maintain far greater levels of intratumor genetic heterogeneity than previously anticipated, due to its non-chromosomal mechanism of inheritance, which enables tumors to evolve rapidly 5,15,16. In addition, the very high DNA template level generated by ecDNA-based amplification, coupled to its highly accessible chromatin architecture, permits massive oncogene transcription 17-19.

While ecDNA elements are a common form of fCNA 5, other mechanisms can also result in amplification with very different functional consequences 6. Thus, accurate identification and reconstruction of the fCNA structure not only describes the rearranged genomic landscape, but also represents a first step in identifying the generative mechanism.
Reconstruction of fCNA architecture involves determining the order and orientation of the genomic segments that constitute the amplicon. There are many methods to detect single genomic breakpoints from sequencing data using a variety of different sequencing technologies 20-23. However, fewer methods are available to handle the more difficult problem of...