“…In the most straightforward classification scheme, reads that match a specific target genome with sufficient similarity are classified as endogenous (that is, from the target species) [11,13]. A simple extension of this method considers whether better alignments to other sequence databases exist, and uses these to exclude potential microbial or other contaminants [9,10,17]. Divergence can then be calculated in a pairwise manner from the average similarity of all alignments for the sequences deemed to be endogenous [11,13,17].…”
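
The classification scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any of the cited studies: the `Read` record, the 0.90 identity threshold, and the definition of divergence as one minus the mean identity of the endogenous alignments are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Read:
    """Hypothetical summary of one read's best alignments (identities in 0..1)."""
    name: str
    target_identity: float                  # best alignment to the target genome
    other_identity: Optional[float] = None  # best alignment to any other database


def is_endogenous(read: Read, min_identity: float = 0.90) -> bool:
    """Classify a read as endogenous.

    Assumed rule: the read must match the target with at least `min_identity`
    (an illustrative threshold), and no other database may offer a strictly
    better alignment.
    """
    if read.target_identity < min_identity:
        return False
    if read.other_identity is not None and read.other_identity > read.target_identity:
        return False  # better hit elsewhere: treat as a potential contaminant
    return True


def pairwise_divergence(reads: List[Read], min_identity: float = 0.90) -> Optional[float]:
    """Return 1 - mean identity over endogenous reads, or None if there are none.

    Assumption: divergence is taken as the complement of the average similarity.
    """
    endogenous = [r for r in reads if is_endogenous(r, min_identity)]
    if not endogenous:
        return None
    mean_identity = sum(r.target_identity for r in endogenous) / len(endogenous)
    return 1.0 - mean_identity


# Made-up example reads, for illustration only.
reads = [
    Read("r1", 0.98),                       # endogenous
    Read("r2", 0.94, other_identity=0.80),  # endogenous: target is the best hit
    Read("r3", 0.95, other_identity=0.99),  # excluded: better hit elsewhere
    Read("r4", 0.60),                       # excluded: too dissimilar to target
]
divergence = pairwise_divergence(reads)
```

With these made-up reads, only `r1` and `r2` are kept, so the estimate is based on their mean identity alone. In practice the threshold and the tie-breaking rule for equally good alignments would need to be chosen to suit the data.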