“…This, in turn, consists of two parts: first, a pronunciation is hypothesized using phonemic subword units, and second, said pronunciation is converted to a spelling. Only generating a pronunciation for an unknown word is sufficient in applications such as Spoken Term Detection (STD) [6], where phonemic representations of speech are adequate for indexing and search. For transcription, however, an orthography needs to be estimated from a given phonemic subword sequence, as in [7], where phone transcriptions are converted to spellings using memory-based learning for Dutch OOV words.…”
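
The second stage described above, converting a phonemic sequence to a spelling, can be illustrated with a minimal memory-based sketch. This is not the system of [7]: it assumes a one-to-one alignment between phonemes and graphemes, uses a plain overlap metric with majority voting among the nearest stored instances, and relies on a hypothetical toy lexicon (`TOY_LEXICON`) with made-up, English-like phoneme symbols. All function names are illustrative.

```python
from collections import Counter

PAD = "_"  # boundary symbol for positions outside the word

# Hypothetical toy lexicon: (phonemes, graphemes), aligned one-to-one.
TOY_LEXICON = [
    (("k", "a", "t"), ("c", "a", "t")),
    (("k", "i", "t"), ("k", "i", "t")),
    (("m", "a", "t"), ("m", "a", "t")),
    (("s", "i", "t"), ("s", "i", "t")),
    (("t", "a", "p"), ("t", "a", "p")),
]

def windows(phonemes, width=1):
    """Yield one context window (left context, focus, right context) per phoneme."""
    padded = [PAD] * width + list(phonemes) + [PAD] * width
    for i in range(len(phonemes)):
        yield tuple(padded[i : i + 2 * width + 1])

def train(lexicon, width=1):
    """Memory-based 'training': store every (window, grapheme) instance as-is."""
    memory = []
    for phonemes, graphemes in lexicon:
        for window, grapheme in zip(windows(phonemes, width), graphemes):
            memory.append((window, grapheme))
    return memory

def overlap(a, b):
    """Similarity: number of window positions where the two windows agree."""
    return sum(x == y for x, y in zip(a, b))

def spell(phonemes, memory, width=1):
    """Predict one grapheme per phoneme by voting among the most similar instances."""
    output = []
    for window in windows(phonemes, width):
        best = max(overlap(window, stored) for stored, _ in memory)
        votes = Counter(g for stored, g in memory if overlap(window, stored) == best)
        output.append(votes.most_common(1)[0][0])
    return "".join(output)

memory = train(TOY_LEXICON)
print(spell(("k", "a", "p"), memory))  # unseen word; prints "cap"
```

In this toy setting the unseen sequence /k a p/ is spelled with "c" because the stored instance for word-initial /k/ before /a/ (from "cat") matches the query window exactly, which is the kind of context-dependent choice a memory-based learner makes without any explicit rules. A realistic system would need many-to-many phoneme-grapheme alignment and far more training data.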