Recent work on end-to-end control for autonomous driving has investigated the use of vision-based exteroceptive perception. Inspired by these results, we propose a new end-to-end memory-based neural architecture for robot steering and throttle control. We describe this architecture and compare it with previous approaches using fundamental error metrics (MAE, MSE) and several external metrics based on performance on simulated test circuits. The presented work demonstrates that internal memory improves the generalization capabilities of the model, allowing it to drive in a wider range of circuits and situations. We analyze the algorithm in a wide range of environments and conclude that the proposed pipeline is robust to varying camera configurations. All of the presented work, including datasets, network architectures, weights, simulator, and comparison software, is open source and easy to replicate and extend. Code: github.com/JdeRobot/DeepLearningStudio.
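
As an informal illustration of the offline error metrics mentioned above (MAE, MSE), the following minimal Python sketch computes them over predicted versus ground-truth steering/throttle values; the array names and numbers are hypothetical and are not taken from the released code.

    # Illustrative sketch only: computing MAE/MSE between ground-truth and
    # predicted [steering, throttle] pairs. Values and names are hypothetical.
    import numpy as np

    def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        """Mean Absolute Error over all predicted values."""
        return float(np.mean(np.abs(y_true - y_pred)))

    def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        """Mean Squared Error over all predicted values."""
        return float(np.mean((y_true - y_pred) ** 2))

    if __name__ == "__main__":
        # Hypothetical ground-truth and predicted [steering, throttle] pairs.
        ground_truth = np.array([[0.10, 0.50], [-0.05, 0.45], [0.00, 0.55]])
        predictions  = np.array([[0.12, 0.48], [-0.02, 0.50], [0.01, 0.53]])
        print(f"MAE: {mae(ground_truth, predictions):.4f}")
        print(f"MSE: {mse(ground_truth, predictions):.4f}")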