“…Examples of source transcripts and original translations with the fluent counterparts are shown below in Table 1.

SRC  eh, eh, eh, um, yo pienso que es así
ORG  uh, uh, uh, um, i think it's like that
FLT  i think it's like that

SRC  también tengo um eh estoy tomando una clase ..
ORG  i also have um eh i'm taking a marketing class ..
FLT  i'm also taking a marketing class

SRC  porque qué va, mja ya te acuerda que ..
ORG  because what is, mhm do you recall now that ..
FLT  do you recall now that ..

SRC  y entonces am es entonces la universidad donde yo estoy es university of pennsylvania
ORG  and so am and so the university where i am it's the university of pennsylvania
FLT  i am at the university of pennsylvania

3 Speech-to-Text Model

Initial work on the Fisher-Spanish dataset used HMM-GMM ASR models linked with phrase-based MT using lattices (Post et al., 2013; Kumar et al., 2014). More recently, it was shown in Weiss et al. (2017) and Bansal et al. (2018) that end-to-end SLT models perform competitively on this task. As in Bansal et al. (2018), we use a sequence-to-sequence architecture inspired by Weiss et al. (2017) but modified to train within available resources; specifically, all models may be trained in less than 5 days on one GPU.…”