“…Therefore, we adopt the Natural Language Processing/Speech Technology standard and use token recall and token precision (e.g., Ludusan, Versteegh, Jansen, Gravier, Cao, Johnson & Dupoux, 2014). This is also the approach adopted by previous work that attempts to compare the overall segmentability of different registers (child- versus adult-directed speech; Cristia et al., 2019; Ludusan, Mazuka, Bernard, Cristia & Dupoux, 2017) and different languages (Caines, Altmann-Richer & Buttery, 2019; Loukatou, Stoll, Blasi & Cristia, 2018; Loukatou et al., 2019), or that simply evaluates proposed algorithms (e.g., Daland & Pierrehumbert, 2011; Goldwater et al., 2009; Phillips & Pearl, 2014). These scores are calculated by comparing the output string, which contains the word breaks hypothesized by an algorithm, against the original sentence containing the actual word breaks.…”
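
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the cited studies. It assumes that word breaks are marked by spaces, that positions are counted in characters (the cited work may count phonemes or syllables instead), and that a hypothesized token is correct only when both of its edges coincide with the edges of a single word in the original sentence. The function names `token_spans` and `token_precision_recall` are hypothetical.

```python
def token_spans(segmented):
    """Return the set of (start, end) spans of the words in a space-segmented string.

    Positions index the unsegmented string, i.e. spaces are not counted.
    """
    spans = set()
    pos = 0
    for word in segmented.split():
        spans.add((pos, pos + len(word)))
        pos += len(word)
    return spans


def token_precision_recall(hypothesis, gold):
    """Compare hypothesized word breaks against the original (gold) word breaks.

    Assumption: a token is correct when its exact span also occurs in the gold
    segmentation, so both edges match and no gold break falls inside it.
    """
    # Both strings must contain the same material once word breaks are removed.
    if "".join(hypothesis.split()) != "".join(gold.split()):
        raise ValueError("hypothesis and gold differ in more than word breaks")

    hyp = token_spans(hypothesis)
    ref = token_spans(gold)
    correct = len(hyp & ref)

    # Precision: share of hypothesized tokens that are real words.
    precision = correct / len(hyp) if hyp else 0.0
    # Recall: share of real words that were recovered as tokens.
    recall = correct / len(ref) if ref else 0.0
    return precision, recall


# Usage: the algorithm under-segments "the dog" into a single token.
p, r = token_precision_recall("thedog ate", "the dog ate")
# One of the two hypothesized tokens ("ate") is correct: p == 0.5
# One of the three original words is recovered: r == 1/3
```

Under these assumptions, under-segmentation lowers recall more than precision, because each merged token misses several original words at once, while over-segmentation tends to lower precision.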