“…In the case of speech synthesis, many works included DNNs and DBNs to perform acoustic mappings and prosody prediction [2,3,4], Also, Recurrent Neural Networks (RNNs) and their variants, like the Long Short Term Memory (LSTM) architecture [5], have leveraged completely the sequences processing and prediction problem, which makes them lead to interesting results in the speech synthesis field, where an acoustic signal of variable length has to be generated out of a set of textual entities. Some example works using this structures can be seen in [6,7,8,9]. Previous to deep learning, existing text to speech technologies included the unit selection speech synthesis [10] and the statistical parametric speech synthesis (SPSS) [11].…”