Chapter 1 frames the scope of this thesis, introducing the pattern recognition field and, more specifically, the MT field. It reviews the different historical approaches devised to tackle this problem. Moreover, it sets out the experimental framework followed in this thesis and its main scientific objectives.

Chapter 2 describes the mathematical model at the core of the thesis: neural networks. It addresses the parameter estimation process and describes different neural architectures, together with a number of techniques used throughout the thesis to improve the generalization capability of the model.

Chapter 3 introduces neural machine translation (NMT) technology, describing the most common architectures and the decoding process. Moreover, it reviews different aspects of the NMT field that currently receive the attention of the research community. It also compares NMT across the different translation tasks tackled in the thesis.

Chapter 4 introduces the interactive-predictive pattern recognition field, which aims to minimize the effort spent by the user while supervising an automatic system. It proposes the application of this theoretical framework to neural technology, introducing alternative interaction protocols. These interactive-predictive neural systems are then evaluated.

Chapter 5 describes the adaptation of NMT systems via online learning techniques. After receiving a corrected sample, the system can be updated to include this new knowledge. The chapter describes the methods used to perform this adaptation and introduces two novel alternatives. In addition, it proposes an active learning framework for neural systems, useful in situations that require the translation of large amounts of data. All these scenarios are thoroughly evaluated under a variety of conditions, including a user evaluation involving professional post-editors.

Chapter 6 departs from the MT problem to tackle different multimodal sequence-to-sequence tasks.
More precisely, it focuses on the generation of textual descriptions of videos. These techniques are also applied to the captioning of daily events captured with an egocentric camera. Finally, the interactive-predictive framework described in Chapter 4 is applied to these multimodal systems.

Chapter 7 draws the main conclusions of the thesis. It describes the scientific contributions and the publications derived from it, and outlines several lines of future research.

These chapters are complemented by two appendices. Appendix A describes NMT-Keras, an open-source library developed to build neural models, which has been used to carry out most of the experiments described in the thesis. Appendix B provides the results of a survey carried out in the scope of Chapter 5.