Background: Depth and evenness of sequencing coverage are considered important indicators of assembly quality. In plastid genomics, where new data generation has outstripped the development of suitable assembly quality indicators, these coverage metrics could provide valuable insights into the quality of plastomes of different size, structure, and evolutionary origin. However, the typical variation of sequencing depth and evenness among archived plastid genomes, the differences in these metrics among individual plastome sections, and any association with methodological factors have yet to be evaluated.
Methods: This investigation explores the variation of sequencing depth and sequencing evenness across publicly accessible plastid genomes as well as the potential associations of these metrics with plastome structure, assembly quality, and methodological provenance. Specifically, we assess whether sequencing evenness and reduced sequencing depth correlate significantly with, or differ significantly among, individual genome sections, assembly quality indicators, the sequencing platforms employed, and the assembly software tools used. To that end, we retrieve published plastid genomes, their sequence reads, and their genome metadata from public sequence databases, measure sequencing depth and evenness across genome partitions and full-length genomes, and test several hypotheses regarding genome structure, assembly quality, and methodological provenance using non-parametric statistical tests.
Results: Our analyses indicate significant differences in sequencing depth across the four structural partitions as well as between the coding and non-coding sections of the plastid genomes, a significant correlation between sequencing evenness and the number of ambiguous nucleotides per genome, and significant differences in sequencing evenness across sequencing platforms.
Conclusions: Based on these results, we conclude that sequencing depth may vary in association with the quadripartite structure and the assembly quality of plastid genomes, and that sequencing evenness in plastid genomes is likely influenced by the type of sequencing platform employed.
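
Illustration: The two coverage metrics named in the Methods can be sketched on toy data as follows. This is a minimal sketch under stated assumptions, not the procedure used in this study: the evenness score shown here (one minus the normalized shortfall of positions below the mean depth) is only one possible definition, the function names `mean_depth` and `evenness` are hypothetical, and the per-base depth values are invented for illustration.

```python
from statistics import mean


def mean_depth(depths):
    """Average per-base sequencing depth of a region."""
    return mean(depths)


def evenness(depths):
    """Assumed evenness score in [0, 1]; 1 means perfectly uniform coverage.

    Computed as 1 minus the summed shortfall of all positions below the
    mean depth, normalized by (mean depth * number of positions).
    This definition is an assumption made for illustration only.
    """
    avg = mean(depths)
    if avg == 0:
        return 0.0  # no coverage at all: treat as minimally even
    shortfall = sum(avg - d for d in depths if d < avg)
    return 1.0 - shortfall / (avg * len(depths))


# Invented per-base depths for the four structural partitions
# (large single copy, inverted repeat b, small single copy, inverted repeat a).
toy_partitions = {
    "LSC": [40, 42, 38, 40],
    "IRb": [80, 80, 80, 80],
    "SSC": [10, 70, 10, 70],
    "IRa": [80, 78, 82, 80],
}

for name, depths in toy_partitions.items():
    print(f"{name}: depth={mean_depth(depths):.1f} evenness={evenness(depths):.3f}")
```

In this toy example, the uniformly covered partition scores an evenness of 1, whereas the partition with alternating low and high depth scores markedly lower despite a comparable mean depth, which is why depth and evenness are reported as separate metrics.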