“…Speaker independence remains a major stumbling block [1], and it can be addressed in any of these three components. Given the limited success of core recognition architectures in the zero-resource setting, several alternative acoustic frontends and unsupervised acoustic models have been proposed in recent years [2,3,4,5,1,6,7,8,9,10], though there has been little effort to evaluate these methods systematically. Lexical discovery is the process of automatically identifying meaningful word-sized units from speech.…”