“…Our binary feature representation largely overlaps with that used in PanPhon [12], and differs from the multi-valued features used in [10], which map more directly to IPA categories such as vowel frontness or consonant place. Our feature set gives a more compact representation, with 24 features vs. 60 in [10] (after conversion to binary vectors), but it is perhaps less interpretable in familiar linguistic terms. For example, the palatal place of articulation, a single feature value in a multi-valued representation, is instead composed from the [+high, −low, −back] feature specifications in our system. Previous work on phonological feature detection from speech [13] found similar performance between an SPE-style binary feature system like ours and multi-valued features, and [8] showed improvements for multilingual TTS training using inputs augmented with PFs of both kinds, suggesting that either formalism may be adequate for speech processing tasks.…”
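
The contrast between the two encodings can be made concrete with a small sketch. It is illustrative only: the three-feature subset, the four-value place inventory, and the helper names `binary_vector` and `one_hot` are assumptions introduced here, not the actual feature sets of the paper, PanPhon [12], or [10]. It also assumes that "conversion to binary vectors" means one-hot encoding of each multi-valued feature, which the passage does not state.

```python
# Hypothetical subset of SPE-style binary features (not the full 24-feature set).
BINARY_FEATURES = ("high", "low", "back")

# Hypothetical value inventory for a multi-valued "place" feature.
PLACE_VALUES = ("bilabial", "alveolar", "palatal", "velar")


def binary_vector(spec):
    """Encode a {feature: '+' or '-'} specification as a 0/1 vector."""
    return [1 if spec[name] == "+" else 0 for name in BINARY_FEATURES]


def one_hot(value, inventory):
    """Encode one value of a multi-valued feature as a one-hot vector."""
    if value not in inventory:
        raise ValueError(f"unknown value: {value!r}")
    return [1 if candidate == value else 0 for candidate in inventory]


# Palatal place as composed from [+high, -low, -back] in the binary system.
palatal_binary = binary_vector({"high": "+", "low": "-", "back": "-"})

# Palatal place as a single value of a multi-valued place feature.
palatal_multi = one_hot("palatal", PLACE_VALUES)

print(palatal_binary)  # [1, 0, 0]
print(palatal_multi)   # [0, 0, 1, 0]
```

Under this assumption, a multi-valued feature with k possible values occupies k binary dimensions, which is one plausible reason the converted multi-valued representation is wider than the binary one, while each of its dimensions names a familiar category directly.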