Our implementation is based on the Carmel FST toolkit. 1 We create an FST that converts a sentence into a sequence of phonemes, together with its inverse FST. The word-to-phoneme mapping is based on pronunciation dictionaries, selected according to the language tag of each word in the sentence. We use the CMU Pronouncing Dictionary 2 for English and a dictionary from CMUSphinx 3 for Spanish. As the phoneme inventories of the two dictionaries do not match, we map the Spanish phonemes to the CMU dictionary inventory using a manually constructed mapping. 4 To favor frequent words over infrequent ones, we add unigram probabilities (taken from the Google Books unigrams 5) to the edges of the transducer. We filter out some words that produce noise (for example, single-letter words that are too frequent). When creating a monolingual sentence, we use an FST containing only the words of that language. As many phoneme sequences in Spanish do not produce English alternatives (and vice versa), we allow minor changes in the phoneme sequences between the languages. Specifically, we create a small list of similar phonemes (such as "B" and "V"), 6 and generate an FST that allows each phoneme to be changed to one of its alternatives or

1 https://www.isi.edu/licensed-sw/carmel/ 2