“…Since the release of PURC, several studies have demonstrated shortcomings of OTU clustering, the method PURC used to infer biological sequences. In particular, OTU clustering tends to overestimate the number of sequences, is sometimes irreproducible (repeat runs yield different OTUs), and requires similarity thresholds that are difficult to choose appropriately (Callahan et al., 2017; Barnes et al., 2020; Joos et al., 2020; Nelson et al., 2020); the overestimation problem has been reported for PURC specifically (Morales-Briones and Tank, 2019; Blischak et al., 2018). An alternative approach for separating PCR and sequencing errors from biological sequences is to apply an error model, in which read abundance, composition, and quality scores are used to infer whether each unique read is likely to have been derived from another observed sequence (Callahan et al., 2016).…”
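To make the error-model idea concrete, the following is a minimal sketch and not the algorithm of Callahan et al. (2016) or of any particular tool. It assumes substitution errors only, equal-length reads, independent per-base error probabilities taken directly from Phred quality scores, and a plain Poisson tail test; all function names are hypothetical. Given a more abundant "parent" sequence and a rarer "child" read, it asks how surprising the child's abundance would be if every copy of the child were an erroneous read of the parent.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_rate(parent, child, quals):
    """Probability that a single read of `parent` is observed as `child`.

    Assumption: substitutions only, each of the three wrong bases
    equally likely, positions independent.
    """
    if not (len(parent) == len(child) == len(quals)):
        raise ValueError("parent, child and quals must have equal length")
    rate = 1.0
    for a, b, q in zip(parent, child, quals):
        p = phred_to_error_prob(q)
        rate *= (p / 3) if a != b else (1 - p)
    return rate


def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed as 1 - P(X <= k - 1)."""
    cdf = sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def abundance_p_value(parent, parent_count, child, child_count, quals):
    """P-value for seeing at least `child_count` copies of `child` if all
    of them arose as errors from `parent_count` reads of `parent`."""
    mu = parent_count * error_rate(parent, child, quals)
    return poisson_tail(child_count, mu)


# Illustrative (made-up) data: one mismatch at the last base, Q30 throughout.
parent, child = "ACGTACGT", "ACGTACGA"
quals = [30] * 8

# A single copy of the child is plausibly an error of the abundant parent.
p_rare = abundance_p_value(parent, 1000, child, 1, quals)

# Fifty copies are far more than the error model expects.
p_common = abundance_p_value(parent, 1000, child, 50, quals)

print(p_rare > 0.05, p_common < 1e-6)
```

In this sketch a large p-value means the read is consistent with being derived from the other observed sequence, while a very small one suggests a distinct biological sequence. A real implementation would also need to estimate error rates from the data rather than trust the quality scores, and to handle indels.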