Recombination creates mosaic genomes containing regions with mixed ancestry, and the accumulation of such events over time can greatly complicate many aspects of evolutionary inference. Here, we developed a sliding window bootstrap (SWB) method to generate genomic bootstrap (GB) barcodes to highlight the regions supporting phylogenetic relationships. The method was applied to an alignment of 56 sarbecoviruses, including SARS-CoV and SARS-CoV-2, responsible for the SARS epidemic and COVID-19 pandemic, respectively. The SWB analyses were also used to construct a consensus tree showing the most reliable relationships and better interpret hidden phylogenetic signals. Our results revealed that most relationships were supported by just a few genomic regions and confirmed that three divergent lineages could be found in bats from Yunnan: SCoVrC, which groups SARS-CoV related coronaviruses from China; SCoV2rC, which includes SARS-CoV-2 related coronaviruses from Southeast Asia and Yunnan; and YunSar, which contains a few highly divergent viruses recently described in Yunnan. The GB barcodes showed evidence for ancient recombination between SCoV2rC and YunSar genomes, as well as more recent recombination events between SCoVrC and SCoV2rC genomes. The recombination and phylogeographic patterns suggest a strong host-dependent selection of the viral RNA-dependent RNA polymerase. In addition, SARS-CoV-2 appears as a mosaic genome composed of regions sharing recent ancestry with three bat SCoV2rCs from Yunnan (RmYN02, RpYN06, and RaTG13) or related to more ancient ancestors in bats from Yunnan and Southeast Asia. Finally, our results suggest that viral circular RNAs may be key molecules for the mechanism of recombination.
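To make the sliding window bootstrap idea more concrete, the following is a minimal sketch of its two generic ingredients: cutting an alignment into overlapping windows, and resampling the columns of each window with replacement to obtain one bootstrap pseudoreplicate. The abstract does not state the window size, the step, the number of replicates, or the tree-building program, so the values and the names `sliding_windows` and `bootstrap_columns` are assumptions chosen for illustration, and the toy sequences are invented. Tree inference on each pseudoreplicate, which a real analysis would need, is not shown.

```python
import random

def sliding_windows(alignment_length, window, step):
    """Yield half-open (start, end) coordinates of windows along the alignment."""
    for start in range(0, alignment_length - window + 1, step):
        yield start, start + window

def bootstrap_columns(alignment, start, end, rng):
    """Resample the columns of one window with replacement.

    Returns one pseudoreplicate: a dict mapping each sequence name to a
    string of the same length as the window.
    """
    cols = [rng.randrange(start, end) for _ in range(end - start)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Toy alignment: names and sequences are hypothetical, not real viral data.
alignment = {
    "virus_A": "ATGCATGCATGC",
    "virus_B": "ATGCATGAATGC",
    "virus_C": "ATGGATGCATTC",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
windows = list(sliding_windows(12, window=6, step=3))  # assumed window and step
replicates = {w: bootstrap_columns(alignment, w[0], w[1], rng) for w in windows}
```

In a full analysis, many pseudoreplicates would be drawn per window and a tree inferred from each, so that the support for a given relationship could be reported window by window along the genome.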
Phylogenetic trees of coronaviruses are difficult to interpret because they undergo frequent genomic recombination. Here, we propose a new method, coloured genomic bootstrap (CGB) barcodes, to highlight the polyphyletic origins of human sarbecoviruses and understand their host and geographic origins. The results indicate that SARS-CoV and SARS-CoV-2 contain genomic regions of mixed ancestry originating from horseshoe bat (Rhinolophus) viruses. First, different regions of SARS-CoV share exclusive ancestry with five Rhinolophus viruses from Southwest China (RfYNLF/31C: 17.9%; RpF46: 3.3%; RspSC2018: 2.0%; Rpe3: 1.3%; RaLYRa11: 1.0%) and 97% of its genome can be related to bat viruses from Yunnan (China), supporting its emergence in the Rhinolophus species of this province. Second, different regions of SARS-CoV-2 share exclusive ancestry with eight Rhinolophus viruses from Yunnan (RpYN06: 5.8%; RaTG13: 4.8%; RmYN02: 3.8%), Laos (RpBANAL103: 3.3%; RmarBANAL236: 1.7%; RmBANAL52: 1.0%; RmBANAL247: 0.7%), and Cambodia (RshSTT200: 2.3%), and 98% of its genome can be related to bat viruses from northern Laos and Yunnan, supporting its emergence in the Rhinolophus species of this region. Although CGB barcodes are very useful in retracing the origins of human sarbecoviruses, further investigations are needed to better take into account the diversity of coronaviruses in bats from Cambodia, Laos, Myanmar, Thailand and Vietnam.
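As a quick arithmetic check on the figures quoted above, the per-virus percentages can be summed to see how much of each human virus genome is attributed to a single bat virus. This sketch assumes the listed regions do not overlap, which the phrase "exclusive ancestry" suggests but the abstract does not state outright; the variable names are illustrative only.

```python
# Percentages of the genome sharing exclusive ancestry, as quoted in the abstract.
sars_cov = {
    "RfYNLF/31C": 17.9, "RpF46": 3.3, "RspSC2018": 2.0,
    "Rpe3": 1.3, "RaLYRa11": 1.0,
}
sars_cov_2 = {
    "Yunnan": {"RpYN06": 5.8, "RaTG13": 4.8, "RmYN02": 3.8},
    "Laos": {"RpBANAL103": 3.3, "RmarBANAL236": 1.7,
             "RmBANAL52": 1.0, "RmBANAL247": 0.7},
    "Cambodia": {"RshSTT200": 2.3},
}

# Total share of each genome assigned to exactly one bat virus.
sars_cov_total = round(sum(sars_cov.values()), 1)
by_region = {region: round(sum(v.values()), 1) for region, v in sars_cov_2.items()}
sars_cov_2_total = round(sum(by_region.values()), 1)

print(sars_cov_total)    # 25.5
print(by_region)         # {'Yunnan': 14.4, 'Laos': 6.7, 'Cambodia': 2.3}
print(sars_cov_2_total)  # 23.4
```

Under that assumption, roughly a quarter of each genome (25.5% for SARS-CoV, 23.4% for SARS-CoV-2) is tied to one specific bat virus, while the 97% and 98% figures refer to the broader share that can be related to bat viruses from the regions named.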
Phylogenetic trees of coronaviruses are difficult to interpret because they undergo frequent genomic recombination. Here, we propose a new method, named coloured genomic bootstrap (CGB) barcodes, to highlight the polyphyletic origins of human sarbecoviruses and understand their host and geographic origins. The results indicate that SARS-CoV and SARS-CoV-2 contain genomic regions of mixed ancestry originating from horseshoe bat (Rhinolophus) viruses. First, different regions of SARS-CoV share exclusive ancestry with five Rhinolophus viruses from Southwest China (RfYNLF/31C: 17.9%; RpF46: 3.3%; RspSC2018: 2.0%; Rpe3: 1.3%; RaLYRa11: 1.0%) and 97% of its genome can be related to bat viruses from Yunnan (China), supporting its emergence in Rhinolophus species of this province. Second, different regions of SARS-CoV-2 share exclusive ancestry with eight Rhinolophus viruses from Yunnan (RpYN06: 5.8%; RaTG13: 4.8%; RmYN02: 3.8%), Laos (RpBANAL103: 3.3%; RmarBANAL236: 1.7%; RmBANAL52: 1.0%; RmBANAL247: 0.7%), and Cambodia (RshSTT200: 2.3%), and 98% of its genome can be related to bat viruses from northern Laos and Yunnan, supporting its emergence in Rhinolophus species of this region. Although CGB barcodes are very useful for retracing the origins of human sarbecoviruses, further investigations are needed to better understand the diversity of coronaviruses in bats from Cambodia, Laos, Myanmar, Thailand and Vietnam.